Advances in the .NET Type System

Luke Breuer
2009-11-24 19:33 UTC

introduction
PDC 2008: Under the Hood Advances in the .NET Type System
agenda
  • challenges in enabling extensibility
  • solution, part I — type embedding
  • solution, part II — type equivalence
  • putting it all together: loose type coupling and extensibility
  • appendix: improvements in event handling for COM objects
challenges
deployment of PIAs
  • user code might take 40 KB
  • usage of Microsoft.Office.Interop.Excel.dll would require a 1.2 MB file
  • one's working set is unnecessarily bloated
  • one cannot simply copy the above assembly to the user's machine and GAC it; the Office 2007 PIA Redistributable (6.3 MB) must be installed
  • to target more than one version of MS Office
    • one has to include all the PIAs
    • the same C# code cannot target both versions transparently — each must be explicitly coded for (probably with proxies)
managed-to-managed
There is tight type-coupling, such that one can only bind against a specific version of an assembly. Thus, multi-versioning is hard.
NoPIA
The PIA problem is now solvable via the NoPIA feature, where the interfaces, methods, etc. that are actually used are compiled directly into one's assembly. Advanced developers could already achieve the same effect by hand, but only with considerable effort.

When running ildasm on an assembly that has been NoPIAed, one of the things you will see is methods called something like _VtblGap1_3. These are required because the embedded CLR interface must match the layout of the COM interface. The NoPIA feature removes unused methods, but their v-table slots need to be retained. The last number (in this case, 3) indicates how many slots the CLR should skip.

A question: how does NoPIA work with C#'s dynamic?
type embedding rules
  • only metadata is locally embedded
    • interfaces (must have ComImport, Guid attributes)
    • delegates
    • simple structs
    • enums
    • but not classes or static methods
  • only types from interop assemblies can be embedded; compilers check for these attributes
    • [assembly:Guid(...)]
    • [assembly:ImportedFromTypeLib(...)]
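
The attribute check described above can be approximated with reflection. This is only a rough sketch of the idea, not what the compilers actually do; the helper name LooksLikeInteropAssembly is made up for illustration.

```csharp
using System;
using System.Reflection;
using System.Runtime.InteropServices;

static class InteropCheck
{
    // Hypothetical helper: true when the assembly carries both assembly-level
    // attributes that mark it as an interop assembly in the list above.
    public static bool LooksLikeInteropAssembly(Assembly assembly)
    {
        return assembly.IsDefined(typeof(GuidAttribute), false)
            && assembly.IsDefined(typeof(ImportedFromTypeLibAttribute), false);
    }

    static void Main()
    {
        // The core library was not imported from a type library,
        // so it is not expected to qualify.
        Console.WriteLine(LooksLikeInteropAssembly(typeof(object).Assembly));
    }
}
```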

IL cannot be embedded:
  1. we don't want bugs to be embeddable; if IL were embeddable, there would effectively be no way to fix these bugs
  2. it is just hard to do

  • compilers create local partial types (not to be confused with partial classes)
    • local types are marked with TypeIdentifierAttribute
    • using local types leads to reduced memory footprint
  • compilers track "used" methods of the canonical interface and only add those methods to the local interface definition
    • _VtblGap pseudo methods are emitted in place of unused methods to maintain vtable compatibility
canonical definition
[ComImport]
[Guid("E09335AA-9623-407b-AF63-5767CC6B7730")]
interface IFoo {
    void Method1(IBar bar);
    void Method2();
    void Method3();
    void Method4();
    IBar Method5();
    void Method6();
    void Method7();
    void Method8();
    void Method9();
    void Method10();
    void Method11();
    void DoWork();
    void Method13();
    void Method14();
}
embedded partial local type
[ComImport]
[Guid("E09335AA-9623-407b-AF63-5767CC6B7730")]
interface IFoo {
    void _VtblGap1_11();   // Skip 11 v-table slots preceding DoWork 
    void DoWork();
    void _VtblGap2_2();    // Skip 2 v-table slots following DoWork
}
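
How the gap names fall out of the list of used methods can be sketched as follows. This is an illustrative model only, not the compiler's code; the helper LocalMembers is hypothetical, and it emits a trailing gap simply because the example above shows one.

```csharp
using System;
using System.Collections.Generic;

static class VtblGaps
{
    // Hypothetical helper: given the canonical method order and the set of
    // methods actually used, list the members of the local interface.
    public static List<string> LocalMembers(IList<string> canonical, ISet<string> used)
    {
        var members = new List<string>();
        int gapIndex = 0;   // sequence number of the most recent gap
        int skipped = 0;    // unused slots seen since the last kept method

        foreach (string name in canonical)
        {
            if (!used.Contains(name))
            {
                skipped++;
                continue;
            }
            if (skipped > 0)
            {
                gapIndex++;
                members.Add("_VtblGap" + gapIndex + "_" + skipped);
                skipped = 0;
            }
            members.Add(name);
        }

        // Trailing unused slots also become a gap, as in the example above.
        if (skipped > 0)
        {
            gapIndex++;
            members.Add("_VtblGap" + gapIndex + "_" + skipped);
        }
        return members;
    }

    // Method names of the canonical IFoo definition, in declaration order.
    public static List<string> CanonicalIFoo()
    {
        var names = new List<string>();
        for (int i = 1; i <= 14; i++)
            names.Add(i == 12 ? "DoWork" : "Method" + i);
        return names;
    }

    static void Main()
    {
        var used = new HashSet<string> { "DoWork" };
        // Prints: _VtblGap1_11, DoWork, _VtblGap2_2
        Console.WriteLine(string.Join(", ", LocalMembers(CanonicalIFoo(), used)));
    }
}
```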
instantiation of COM objects
  • legacy mode
    • interop assembly (IA) contains classes with ComImport, Guid attributes; the CLR intercepts instantiation of such classes and calls COM's CoCreateInstance
  • NoPIA
    • problem: classes cannot be embedded
    • solution: compiler analyzes the IA, finds the correct GUID, and emits the following call:
      Activator.CreateInstance(Type.GetTypeFromCLSID(guid))
events
The only IL in PIAs is for events.
  • legacy mode
    • registering an event on a COM object is intercepted at runtime by the CLR and forwarded to helper classes embedded in the IA
  • NoPIA
    • compilers recognize when an event handler is added/removed and emit a call to a new generic COM event handler
    • COM objects must use late binding to raise events (they usually do)
working with multiple assemblies
  • typical applications use helper libraries
  • helper libraries also need to embed types
  • a number of separate copies of the same interop type are created
    • these are all different types
    • can we still use a method returning a different copy of a type?

The CLR and compilers are able to unify these types via the combination of these attributes:
  • [TypeIdentifier]
  • [Guid(...)]
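
To make the attribute combination concrete, here is a minimal sketch of two local copies of IFoo carrying the same identity. The two namespaces merely stand in for two separate assemblies, and the helper ShareIdentity is a hypothetical reflection check that only inspects the attributes; it does not invoke the CLR's actual unification logic.

```csharp
using System;
using System.Reflection;
using System.Runtime.InteropServices;

// Assumption for brevity: two namespaces stand in for two assemblies,
// each holding its own embedded copy of the interop interface.
namespace App
{
    [ComImport, Guid("E09335AA-9623-407b-AF63-5767CC6B7730"), TypeIdentifier]
    interface IFoo { void DoWork(); }
}

namespace HelperLib
{
    [ComImport, Guid("E09335AA-9623-407b-AF63-5767CC6B7730"), TypeIdentifier]
    interface IFoo { void DoWork(); }
}

static class TypeIdentity
{
    // Hypothetical check: both types carry the same GUID and at least one
    // of them is marked with TypeIdentifierAttribute.
    public static bool ShareIdentity(Type a, Type b)
    {
        GuidAttribute ga = a.GetCustomAttribute<GuidAttribute>();
        GuidAttribute gb = b.GetCustomAttribute<GuidAttribute>();
        if (ga == null || gb == null) return false;

        bool marked = a.IsDefined(typeof(TypeIdentifierAttribute), false)
                   || b.IsDefined(typeof(TypeIdentifierAttribute), false);
        return marked && new Guid(ga.Value) == new Guid(gb.Value);
    }

    static void Main()
    {
        Console.WriteLine(ShareIdentity(typeof(App.IFoo), typeof(HelperLib.IFoo)));
    }
}
```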
caveats
Library assemblies must be:
  • compatible with the version of the host
  • NoPIA-enabled
managed-to-managed
Interfaces can be embedded into add-ins. Type equivalence allows add-ins to run with differently-versioned host code. (Whether they will actually work depends on how much has changed and upon what they depend.)
type equivalence
  • CLR 4.0 feature
  • interfaces with the same GUID are treated by CLR as equivalent types
  • casts to an equivalent interface
    • CLR looks for TypeIdentifier attribute to be present on one of the interfaces
  • calls through an equivalent interface
    • COM objects: CLR intercepts the calls and routes them through COM interop (this is the old behavior)
    • managed objects: CLR finds an equivalent interface in the inheritance chain, looks up a method with the same vtable offset, and verifies the signatures match
      • if the signatures do not match, an exception is thrown
      • if the method is not found, System.MissingMethodException is thrown
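
The slot lookup for managed objects can be modeled with a small reflection-based toy. It is not the CLR's implementation. Several things here are assumptions made for illustration: the interfaces IHostV1, ILocalGood and ILocalBad, the helper Resolve, the use of metadata-token order as a stand-in for vtable order, and InvalidOperationException as a placeholder for the unspecified mismatch exception.

```csharp
using System;
using System.Linq;
using System.Reflection;

interface IHostV1    { void Start(); int Add(int a, int b); }
interface ILocalGood { void Start(); int Add(int x, int y); }
interface ILocalBad  { void Start(); string Add(string a); }

static class SlotCheck
{
    // Assumption: metadata-token order approximates declaration (slot) order.
    static MethodInfo[] Slots(Type iface)
    {
        return iface.GetMethods().OrderBy(m => m.MetadataToken).ToArray();
    }

    static bool SameSignature(MethodInfo a, MethodInfo b)
    {
        return a.ReturnType == b.ReturnType
            && a.GetParameters().Select(p => p.ParameterType)
                .SequenceEqual(b.GetParameters().Select(p => p.ParameterType));
    }

    // Hypothetical model: find the caller's slot, then require a method with
    // a matching signature at the same slot of the target interface.
    public static MethodInfo Resolve(Type callerIface, string method, Type targetIface)
    {
        MethodInfo[] callerSlots = Slots(callerIface);
        int slot = Array.FindIndex(callerSlots, m => m.Name == method);
        if (slot < 0)
            throw new ArgumentException("no such method on caller interface", "method");

        MethodInfo[] targetSlots = Slots(targetIface);
        if (slot >= targetSlots.Length)
            throw new MissingMethodException(targetIface.Name, method);
        if (!SameSignature(callerSlots[slot], targetSlots[slot]))
            throw new InvalidOperationException("signature mismatch at slot " + slot);

        return targetSlots[slot];
    }

    static void Main()
    {
        Console.WriteLine(Resolve(typeof(ILocalGood), "Add", typeof(IHostV1)).Name);
        try { Resolve(typeof(ILocalBad), "Add", typeof(IHostV1)); }
        catch (InvalidOperationException e) { Console.WriteLine(e.Message); }
    }
}
```

Parameter names are deliberately ignored in the comparison, since only the slot position and the types matter in this model.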
type safety
  • it is possible to construct an interface that is type equivalent to another interface, but which is completely incompatible with that interface
  • casts to such an interface will succeed
  • incompatible calls on such an interface will fail: CLR ensures method signatures are compatible, so an attacker cannot construct an illegal call
  • FullTrust is required for using type equivalence with structs
type equivalence
In summary, the CLR introduces type equivalence support:
  • type-safe
  • multi-targeting through interfaces
  • foundation for creating loosely coupled extensible applications
Q&A
  • exploitation of type equivalence
    • Add-ins could destructively define interfaces to allow for exploitation. While type-equivalent structs require FullTrust, interfaces do not in the 2008 CTP. This is still being debated at MS and may change in RTM.
  • vtable matching vs. name matching
    • required for COM
    • matching by name could work with managed code, but vtable matching is currently used