
Internals of SizeOf operator

Hi Reader,

In this post I will talk about the sizeof operator in detail, including its internals. At this point you might think it is just a simple keyword that the compiler uses to find out a type's size. That is true, but keep reading.

Let's dig into this by passing various types to the sizeof operator and looking at the IL, in other words at what the compiler translates it into.

So with the below code:

var result = sizeof(int);
result = sizeof(char);
result = sizeof(double);
result = sizeof(float);

For each of these types, the compiler itself works out the size, as this IL shows:

IL_0000: nop
IL_0001: ldc.i4.4 //sizeof(int)
IL_0002: stloc.0
IL_0003: ldc.i4.2 //sizeof(char)
IL_0004: stloc.0
IL_0005: ldc.i4.8 //sizeof(double)
IL_0006: stloc.0
IL_0007: ldc.i4.4 //sizeof(float)
IL_0008: stloc.0

The IL shows that the compiler already knows the size of these value types and embeds the integer values directly as constants. That is how the compiler was designed, you might say. Yes, there is nothing special about it.
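A small sketch can confirm that these sizes are compile-time constants: a const initializer must be a constant expression, so the code below only compiles because the compiler folds sizeof for the built-in types. The class name BuiltInSizes is made up for illustration.

```csharp
using System;

static class BuiltInSizes
{
    // const initializers must be compile-time constants, so these lines
    // compile only because sizeof on built-in types is a constant.
    public const int IntSize = sizeof(int);
    public const int CharSize = sizeof(char);
    public const int DoubleSize = sizeof(double);
    public const int FloatSize = sizeof(float);

    static void Main()
    {
        // Prints "4 2 8 4", matching the ldc.i4 constants in the IL.
        Console.WriteLine($"{IntSize} {CharSize} {DoubleSize} {FloatSize}");
    }
}
```

No unsafe context is needed here, which matches the IL above where no sizeof instruction appears at all.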

But let's not forget our other value type candidates, namely struct and enum.

Now let's examine how the compiler reacts to an enum and a struct:

 struct MyStruct { }

enum MyEnum { }

Now I cannot call sizeof(MyStruct) directly, because the compiler does not commit to a size for a user-defined struct, even though it can see the struct's contents.

On the flip side, sizeof(MyEnum) is accepted as is, because an enum is represented by its underlying integral type, which is a 4-byte int by default no matter what members it declares. Declaring a different underlying type, such as byte, changes the result accordingly.
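To illustrate how the enum result follows the underlying type, here is a minimal sketch. The enum and class names are hypothetical and not part of the original code.

```csharp
using System;

enum DefaultEnum { }        // underlying type defaults to int
enum ByteEnum : byte { }    // explicit 1-byte underlying type
enum LongEnum : long { }    // explicit 8-byte underlying type

static class EnumSizes
{
    // sizeof on an enum type is allowed outside an unsafe context.
    public static int DefaultSize => sizeof(DefaultEnum);
    public static int ByteSize => sizeof(ByteEnum);
    public static int LongSize => sizeof(LongEnum);

    static void Main()
    {
        // Prints "4 1 8": the size of each underlying type.
        Console.WriteLine($"{DefaultSize} {ByteSize} {LongSize}");
    }
}
```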

Let's come back to the struct type. When I write sizeof(MyStruct), the compiler reports this error:

Error 1: ‘MyStruct’ does not have a predefined size, therefore sizeof can only be used in an unsafe context (consider using System.Runtime.InteropServices.Marshal.SizeOf)

I am sure you have run into this many times. The compiler will not take a chance on working out the memory consumption of this value type, so it asks the developer to use unsafe or the Marshal.SizeOf() method.

If I wrap sizeof(MyStruct) in an unsafe block, the compiler accepts it. In effect the compiler is trusting you and delegating the size calculation to the run time, which is why it counts as an unsafe operation. The reason is that a struct can contain almost anything, and its final layout is decided by the run time, so the size is only really known then.

So, as the error message suggests, let's take the unsafe route for sizeof over the struct and see what the compiler tells the run time in IL. Doing so gives the IL below:

IL_0002: sizeof ConsoleApplication2.Program/MyStruct

As you can see, sizeof has now become an IL opcode for the CLR. The compiler no longer substitutes a constant of its own but emits the sizeof instruction instead, so it is up to the CLR to calculate the size.
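The run-time path can be tried without the /unsafe switch. This sketch assumes a modern .NET where System.Runtime.CompilerServices.Unsafe is available, which the original post does not use. Unsafe.SizeOf<T>() is a thin wrapper over the same sizeof IL instruction. PointLike and RuntimeSizeDemo are hypothetical names.

```csharp
using System;
using System.Runtime.CompilerServices;

struct PointLike    // hypothetical struct, not from the post
{
    public int X;
    public int Y;
}

static class RuntimeSizeDemo
{
    // Equivalent to `unsafe { return sizeof(PointLike); }`, which would
    // need the /unsafe switch. The size is resolved by the CLR, not
    // embedded by the C# compiler.
    public static int SizeOfPointLike() => Unsafe.SizeOf<PointLike>();

    static void Main()
    {
        Console.WriteLine(SizeOfPointLike()); // 8: two 4-byte ints
    }
}
```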

There is a condition attached to this unsafe way of getting a struct's size: it works as long as the struct contains only value types, as shown:

struct MyOtherStruct {}

struct MyStruct { MyOtherStruct mos; int a; } //Displays 4 bytes

Without the int field, MyStruct consumes 1 byte of memory. If you define a char inside MyStruct in place of the int, you still get 4 bytes. This is quite strange: the empty struct that was 1 byte now seems to take 2 bytes next to a char, which is itself 2 bytes. Please leave a comment if you know why. Yes, I tried looking at a heap dump and a stack dump as well, but got nowhere.
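One likely explanation for the char case is alignment padding. With the default sequential layout, the char field has to start on a 2-byte boundary, so one padding byte follows the 1-byte empty struct. The sketch below is only an illustration under that assumption. It uses Unsafe.SizeOf<T>() from a modern .NET, which is not in the original post, and the struct names are hypothetical.

```csharp
using System;
using System.Runtime.CompilerServices;

struct Empty { }                                        // reported as 1 byte
struct CharOnly { public char C; }                      // 2 bytes
struct EmptyAndChar { public Empty E; public char C; }  // 1 + 1 padding + 2

static class PaddingDemo
{
    public static int SizeOfEmpty() => Unsafe.SizeOf<Empty>();
    public static int SizeOfCharOnly() => Unsafe.SizeOf<CharOnly>();
    public static int SizeOfEmptyAndChar() => Unsafe.SizeOf<EmptyAndChar>();

    static void Main()
    {
        // Prints "1 2 4": the extra byte is padding that keeps the char
        // field on a 2-byte boundary, not growth of the empty struct.
        Console.WriteLine(
            $"{SizeOfEmpty()} {SizeOfCharOnly()} {SizeOfEmptyAndChar()}");
    }
}
```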

Let me come back to that condition on the unsafe block. As long as the struct contains only value types, the compiler is happy to compile. But as soon as you define a reference type inside the struct, as shown:

struct MyStruct { string str;}

the compiler will not let me do sizeof() over MyStruct even inside an unsafe block, because the struct now counts as a managed type. This is quite interesting to me as well.

Let me come back to the error message the compiler issued earlier. It suggested using the Marshal.SizeOf() method if I do not want an unsafe block. If I call Marshal.SizeOf() on the same struct with just an int defined in it, I get 4 bytes, and 1 byte if MyStruct has nothing defined in it. That is perfectly correct and the same result as the sizeof() operator.

But there is a catch here. If I use a reference type inside MyStruct, as shown:

struct MyStruct
{
    string[] array;

    public MyStruct(string str)
    {
        this.array = new string[100];
    }
}

Here is the relevant IL from the constructor:

IL_0001: ldarg.0
IL_0002: ldc.i4.s 100
IL_0004: newarr [mscorlib]System.String

The IL tells the run time to create a 100-element array. But when I call Marshal.SizeOf() on the above MyStruct, I still get 4 bytes. This is not only interesting, it is really confusing.
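A plausible reading is that Marshal.SizeOf() counts a reference-type field as a pointer and never looks at the object it refers to. The sketch below checks this under one assumption of mine: it swaps the string[] field for a plain string field, because default array marshaling may differ across platforms. HoldsString and MarshalSizeDemo are hypothetical names.

```csharp
using System;
using System.Runtime.InteropServices;

struct HoldsString      // hypothetical stand-in for the post's MyStruct
{
    public string Text; // reference-type field
}

static class MarshalSizeDemo
{
    // Size of the marshaled layout of the type, independent of any instance.
    public static int MarshaledSize() => Marshal.SizeOf<HoldsString>();

    static void Main()
    {
        // The string field is counted as one pointer: 4 bytes in a
        // 32-bit process, 8 bytes in a 64-bit process.
        Console.WriteLine(MarshaledSize() == IntPtr.Size); // True
    }
}
```

How long the string or array is at run time does not enter into it, since the calculation works from the type alone.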

Thanks and please leave a comment for my learning 🙂


2 thoughts on “Internals of SizeOf operator”

  1. First of all, this is a good post, and something I missed explaining in my Internals series.

    Well, let me explain.

    1. It does not calculate the size of known types. This is a compiler trick: since .NET 2.0, sizeof on these types is replaced with constants, which ensures unsafe is not required for them.

    2. An enum is int by default. Hence if you do

    enum MyEnum : byte
    {
    }

    you would get a different result (1).

    3. Marshal.SizeOf, like the sizeof operator, always works on the Type rather than on an actual instance. The Type reference is created when you create your first object in the process. Marshal.SizeOf ensures that the Type information is present before reading the size from it. So when you call it, it counts each reference-type field by its pointer rather than by the actual size on the heap. For known value types like int, float, char, etc. it counts the size automatically. An array field is really a managed reference, which is 4 bytes long on a 32-bit process.
    Your array is actually a reference to System.Array.

    Hence you see this result.

    I hope it is clear to you now.

  2. Hey, thanks a lot, Abhi 🙂

    1. Yes, in fact I showed that via IL in the post :). The damn compiler is a weirdo 😉
    2. Yep, I figured that out when I declared an empty enum and checked the IL.
    3. Hmm, true, it is a reference, and a reference is just 4 bytes on 32-bit machines. But how would you explain the case of a struct that has another struct as a member along with a value type? Like I said in the post, with a char and a struct as members, sizeof returns 4 bytes, which is surprising. But without a struct member in it, it returns 2.

    I guess somewhere the M$ compiler team did a lot of juggling and made it more complex just to accommodate the features. Don't you think?
