Also available at my website http://tosh.me/ and on Twitter @toshafanasiev

Wednesday, 8 February 2012

C++/CLI literal keyword bug

C# gives you two similar but importantly different options for defining constants, const and static readonly, as shown:

// names.cs
// compile with csc /t:library names.cs
public static class Names {
  public const string First = "Tosh";
  public static readonly string Last  = "Afanasiev";
}

The compiler generates the following IL for this code:

.assembly names

// lots of guff omitted

.class public abstract auto ansi sealed beforefieldinit Names
       extends [mscorlib]System.Object
{
  .field public static literal string First = "Tosh"
  .field public static initonly string Last
  .method private hidebysig specialname rtspecialname static 
          void  .cctor() cil managed
  {
    // Code size       11 (0xb)
    .maxstack  8
    IL_0000:  ldstr      "Afanasiev"
    IL_0005:  stsfld     string Names::Last
    IL_000a:  ret
  } // end of method Names::.cctor

} // end of class Names

Both Names.First and Names.Last are constants in the sense that you can consume but not modify their values, but the way they are bound is the crucial difference. Notice how the First symbol is bound to its value in metadata - it is used as an alias for the literal (hence the literal flag) string 'Tosh' - while the Last symbol is declared but assigned to in the type initialiser (or static constructor, if you prefer), making its value available only once the Names class has been loaded and initialised in the current app domain. Hence the value of First is knowable at compile time while the value of Last can only be known at run time.
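If you'd rather not read IL, the same two flags can be checked through reflection. What follows is only a minimal sketch: Names is redeclared so that the file compiles on its own, and the Inspect class and its Describe method are names made up for this illustration.

```csharp
// inspect.cs
// compile with csc inspect.cs
using System;
using System.Reflection;

// redeclared here so the sketch is self-contained
public static class Names {
  public const string First = "Tosh";
  public static readonly string Last = "Afanasiev";
}

// hypothetical helper, not part of the original example
static class Inspect {
  // report the literal and initonly flags of a public field on Names
  public static string Describe( string fieldName ) {
    FieldInfo field = typeof( Names ).GetField( fieldName );
    return String.Format( "{0}: literal={1}, initonly={2}",
      field.Name, field.IsLiteral, field.IsInitOnly );
  }

  static void Main() {
    Console.WriteLine( Describe( "First" ) ); // First: literal=True, initonly=False
    Console.WriteLine( Describe( "Last" ) );  // Last: literal=False, initonly=True

    // a literal's value is stored in metadata, so it can be read from there directly
    Console.WriteLine( typeof( Names ).GetField( "First" ).GetRawConstantValue() ); // Tosh
  }
}
```

GetRawConstantValue only works for literal fields; asking for it on Last throws, since there is no value in metadata to return.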

This difference is most clearly illustrated by examining the code that consumes these values; here are two C# programs and the IL generated for them:


// firstname.cs
// compile with csc /r:names.dll firstname.cs
using System;

static class prog {
  static void Main() {
    Console.WriteLine( "First name: {0}", Names.First );
  }
}


.assembly firstname

// guff

.class private abstract auto ansi sealed beforefieldinit prog
       extends [mscorlib]System.Object
{
  .method private hidebysig static void  Main() cil managed
  {
    .entrypoint
    // Code size       18 (0x12)
    .maxstack  8
    IL_0000:  nop
    IL_0001:  ldstr      "First name: {0}"
    IL_0006:  ldstr      "Tosh"
    IL_000b:  call       void [mscorlib]System.Console::WriteLine(string,
                                                                  object)
    IL_0010:  nop
    IL_0011:  ret
  } // end of method prog::Main

} // end of class prog


// lastname.cs
// compile with csc /r:names.dll lastname.cs
using System;

static class prog {
  static void Main() {
    Console.WriteLine( "Last name: {0}", Names.Last );
  }
}


.assembly extern names
{
  .ver 0:0:0:0
}
.assembly lastname

// guff

.class private abstract auto ansi sealed beforefieldinit prog
       extends [mscorlib]System.Object
{
  .method private hidebysig static void  Main() cil managed
  {
    .entrypoint
    // Code size       18 (0x12)
    .maxstack  8
    IL_0000:  nop
    IL_0001:  ldstr      "Last name: {0}"
    IL_0006:  ldsfld     string [names]Names::Last
    IL_000b:  call       void [mscorlib]System.Console::WriteLine(string,
                                                                  object)
    IL_0010:  nop
    IL_0011:  ret
  } // end of method prog::Main

} // end of class prog

The difference that immediately jumps out is that the firstname program's code contains its own copy of the literal value Names.First, taken at compile time from the metadata of the names assembly, while lastname loads the static field Names.Last in order to evaluate the constant. A less obvious difference is that lastname declares a dependency on the names assembly while firstname does not.

So what are the implications of this? Firstly, since Names.Last is evaluated at runtime by loading another assembly and accessing a static field, the value can be changed after deployment without recompiling and redistributing the consuming code - great when you have numerous clients, possibly not your own, that need to be kept up to date. Assemblies consuming the Names.First value would be totally unaware of any change to this value unless they were themselves recompiled.

The other side of this coin is that the consuming assembly has a runtime dependency on names.dll (firstname will run fine on its own while lastname fails miserably with names.dll out of reach) and, more subtly, the use of the constant Names.Last is limited to runtime contexts. This last point may not be a problem, but consider the case where attribute values are centralised. Attribute arguments must be compile time constants, so the following would result in a compile error:


[FicticiousAttribute(Names.Last)]
public class ...

While this would not:

[FicticiousAttribute(Names.First)]
public class ...
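To try this out for real, here is a rough, self-contained sketch. FicticiousAttribute is given a hypothetical definition (a single string argument exposed as Value), and the Tagged and AttrDemo classes are likewise invented for the illustration; Names is redeclared so the file compiles alone.

```csharp
// attr.cs
// compile with csc attr.cs
using System;

public static class Names {
  public const string First = "Tosh";
  public static readonly string Last = "Afanasiev";
}

// hypothetical attribute, defined only for this sketch
[AttributeUsage( AttributeTargets.Class )]
public sealed class FicticiousAttribute : Attribute {
  public string Value { get; private set; }
  public FicticiousAttribute( string value ) { Value = value; }
}

// fine: Names.First is a compile time constant
[FicticiousAttribute( Names.First )]
public class Tagged { }

// swapping in Names.Last above is rejected with error CS0182,
// because an attribute argument must be a constant expression

static class AttrDemo {
  // read the attribute value back from the metadata of Tagged
  public static string ReadValue() {
    var attr = (FicticiousAttribute)Attribute.GetCustomAttribute(
      typeof( Tagged ), typeof( FicticiousAttribute ) );
    return attr.Value;
  }

  static void Main() {
    Console.WriteLine( ReadValue() ); // Tosh
  }
}
```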

So, when choosing how to declare a shared constant you need to think about how shared and how constant it actually is.

Now for the bug.

The keywords in C++/CLI resemble their IL counterparts more closely than in C# - literal is used to denote compile time constants while static initonly is used for those dynamically evaluated runtime constants (obviously they couldn't use const, which already has its own meaning in C++).

Here's where I noticed a bug in the VS2005 and VS2008 C++/CLI compilers (not VS2010 though). If you define a literal (i.e. metadata based, class level, compile time constant) value on an abstract sealed class (that's a static class in C#) you get a compile error:


// values.cpp
// compile with cl /clr values.cpp
using namespace System;

public ref class Names abstract sealed {
public:
  static initonly String^ First = "Tosh";
  literal String^ Last = "Afanasiev";
};

int main() {
  Console::WriteLine( "Hi, my name is {0} {1}!", Names::First, Names::Last );

  return 0;
}

VS2005:

C:\code>cl /clr values.cpp
Microsoft (R) C/C++ Optimizing Compiler Version 14.00.50727.762
for Microsoft (R) .NET Framework version 2.00.50727.5420
Copyright (C) Microsoft Corporation.  All rights reserved.

values.cpp
values.cpp(9) : error C4693: 'Names': a sealed abstract class cannot have any instance members 'Last'

VS2008:

C:\code>cl /clr values.cpp
Microsoft (R) C/C++ Optimizing Compiler Version 15.00.30729.01
for Microsoft (R) .NET Framework version 2.00.50727.5420
Copyright (C) Microsoft Corporation.  All rights reserved.

values.cpp
values.cpp(9) : error C4693: 'Names': a sealed abstract class cannot have any instance members 'Last'

VS2010:

C:\code>cl /clr values.cpp
Microsoft (R) C/C++ Optimizing Compiler Version 16.00.30319.01
for Microsoft (R) .NET Framework version 4.00.30319.1
Copyright (C) Microsoft Corporation.  All rights reserved.

values.cpp
Microsoft (R) Incremental Linker Version 10.00.30319.01
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:values.exe
values.obj

C:\code>values
Hi, my name is Tosh Afanasiev!

The compiler complains of instance members on an abstract sealed class even though it's not an instance member we're adding: a literal is implicitly static, just like the literal field in the IL shown earlier.

To get around this you have to suppress error 4693. I'd suggest making the scope of this as tight as possible and clearly documenting your reasons; here's an example:


// values.cpp
// compile with cl /clr values.cpp
using namespace System;

public ref class Names abstract sealed {
public:
  static initonly String^ First = "Tosh";
  // disabling error for VS2005 and VS2008 compilers
  // search http://blog.tosh.me/ for 'literal keyword bug' for details
#pragma warning( push )
#pragma warning( disable: 4693 )
  literal String^ Last = "Afanasiev";
#pragma warning( pop )
};

int main() {
  Console::WriteLine( "Hi, my name is {0} {1}!", Names::First, Names::Last );

  return 0;
}

There you go, happy constants all round.