Copyright © Microsoft Corporation. This document is an archived reproduction of a version originally published by Microsoft. It may have slight formatting modifications for consistency and to improve readability.
Avoiding DLL Hell: Introducing Application Metadata in the Microsoft .NET Framework
Matt Pietrek
|
In this article I'll assume that you're somewhat familiar with the .NET platform. If not, a great introductory overview can be found in Jeffrey Richter's article, "Microsoft .NET Framework Delivers the Platform for an Integrated, Service-Oriented Web" in the September 2000 issue of MSDN® Magazine, and part 2 in this issue. For this article, the important thing to remember is that a .NET assembly is the fundamental building block of program components. Assemblies can be as simple as a single DLL, or can comprise numerous DLLs and resources in multiple files. No matter what an assembly looks like, its metadata is the key to unlocking the goodies contained within. Please note that the information contained in this article is based on prerelease software, and some features may be different when the final release ships.

Why Do You Need Metadata?

Metadata is the glue that enables many of the features of the Microsoft .NET Framework, and is an extremely rich description of the pieces in an assembly. It contains details such as type descriptions and information about any external assemblies that the current assembly uses. Metadata also contains version information, describes the resources that are in the assembly, and enables other assorted .NET features.

Consider a type definition such as a C++ class. The metadata for that type would completely describe the class, including the methods and their parameters, their calling conventions, the class's data members, and the visibility of all class members. In Visual Basic®, these concepts would extend to the events the class can fire. Metadata is intended to be the union of all such attributes exposed by any language. If you've programmed in the Java language, you might notice that .class files expose much of the same information as metadata.

With such a complete description of program components, the .NET common language runtime postpones much of the work of the traditional linker until the program executes. In exchange for causing some additional work at program load-time, the metadata enables many of the benefits of .NET, including side-by-side execution and code access security.

One of the biggest wins of .NET is the greatly improved ability to easily use code written in different languages. For instance, .NET makes it trivial to call routines in the Visual Basic runtime from Visual C++® (assuming you're using the .NET-enabled compilers). In order to do this, languages that target .NET need to share a common format for exposing the information describing a type. Metadata is that common format.

Yet another benefit of using metadata is that languages that target .NET can dispense with language-specific mechanisms for importing knowledge of external components. For example, suppose you want to call a function in an external DLL using C++. In the old way of doing things, you needed to #include the appropriate header files in your source code and remember to include the appropriate import library in your linker options. To do the same thing the old way in Visual Basic, you'd use a Declare statement, and hope that you specified the parameters correctly, since the compiler can't verify them. (Bonus hassle: Do you need to use ByRef or ByVal for any given parameter?) In the new .NET world, any .NET compiler or scripting language can suck in the metadata and gain access to the same level of information. Different languages will use different syntaxes to import metadata, but they'll all end up with the same information.
For example, to import metadata for an assembly in C++, the #using directive is the ticket. For instance,
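a Managed C++ source file might begin with something along these lines (MyVBComponent.dll is just a placeholder for whatever assembly you want to use):

    // Pull in the metadata for mscorlib and for a hypothetical Visual
    // Basic-based assembly. The compiler reads the metadata directly from
    // the assemblies; no header files or import libraries are involved.
    #using <mscorlib.dll>
    #using <MyVBComponent.dll>

    using namespace System;

    int main()
    {
        // Types from the imported assemblies are now visible to the
        // compiler, just as if they had been declared in a header file.
        Console::WriteLine( S"Metadata imported via #using" );
        return 0;
    }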
If you've used Visual C++ 6.0 to do traditional COM programming, you may have used the #import directive to read in a type library and create a set of ATL wrapper classes in .TLH and .TLI files. The compiler automatically reads these generated files to get an ATL-wrapped view of the COM interfaces described in the type library. While this was a novel idea, it was clunky and forced you into using ATL. In contrast, using #using in the .NET compiler is extremely natural. It's conceptually similar to the ease of use of References in Visual Basic projects, but much more powerful.

Metadata is also crucial for .NET to interoperate with existing Win32® APIs and COM servers. When managed code calls an unmanaged API such as CreateProcess or a COM interface, it uses the Platform Invoke (PInvoke) mechanism. The .NET common language runtime needs to undertake special preparation for PInvoke calls. This often includes parameter marshaling, such as converting a String * into the LPSTR or LPWSTR that the unmanaged code expects. Every function and method callable by managed code must have metadata for it, regardless of whether it's a managed or PInvoke call. A bit in the metadata for a function tells the runtime engine what type of call is required.

In addition to providing a consistent way to import data about external components, metadata also creates a common way to describe which assemblies a given component relies on. In some languages such as C++, you can look at an executable's import section to see the DLLs and functions it imports. However, other languages, such as Visual Basic, don't import functions the same way. In addition, a Visual Basic-based EXE or DLL contains a list of what COM controls it may load at runtime, but the format of this information is specific to Visual Basic and is not documented. In either case, the resulting EXEs and DLLs contain no information about which specific versions of the imported DLLs or OCXs the program was linked against. Until now, there has been no consistent, reliable, and most importantly, common method for a component to express exactly which other components it relied on. This is one of the reasons why innocent people end up in DLL Hell when a shared component is updated or deleted. .NET, with its use of assembly versioning via metadata, will hopefully put these problems to rest.

Metadata and Load-time Linking

In a traditional, non-.NET environment such as Visual C++ 6.0 or Visual Basic 6.0, the linker (or similar code) is responsible for details like connecting a call to a method with the method's implementation. For instance, consider a function in A.CPP:
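A pair of files along these lines illustrates the point (the function names here are arbitrary); the call in B.CPP has to be hooked up to the implementation in A.CPP:

    // A.CPP: the implementation
    int AddNumbers( int x, int y )
    {
        return x + y;
    }

    // B.CPP: the caller
    int AddNumbers( int x, int y );     // declaration, normally from a header

    int UseIt()
    {
        // The compiler emits a call to an unresolved external symbol here.
        // It's the linker's job to patch this call site so that it lands on
        // the AddNumbers implementation in A.CPP's object code.
        return AddNumbers( 2, 3 );
    }

The compiler for B.CPP neither knows nor cares where AddNumbers actually lives; resolving that reference is the linker's business, and it happens before the program ever runs.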
While I used C++ in this example, the same basic rules apply to other compiled languages, such as Visual Basic. As an example, when you use external COM objects in Visual Basic (like the Microsoft ActiveX® Data Objects), you need to select the appropriate file in the project References dialog. Under the hood, this causes Visual Basic to read in the type library information for the selected objects, and to become aware of how to call these objects, what events they export, and so forth.

What about scripting languages? The same basic rules apply there as well. The calling code has to somehow be connected with the implementing function, and the parameters used in the calling code must nominally match up with the implementing code. However, because these systems are dynamically generated, mismatch errors or missing functions aren't found until you actually try to execute the call.

How does a .NET program fit into this world? I think of it as a hybrid approach. While actual machine code isn't generated until runtime, the syntactic correctness of a program is verified at compile time. The Microsoft Intermediate Language (MSIL) code that .NET compilers create doesn't contain hardcoded references to specific instances, methods, or class data members. Instead, MSIL uses tokens to represent the connection between the caller and callee. These tokens are the magic cookies that the .NET loader uses to do things like locate where a particular class method is JITted in memory, or where a particular data member is located in a class instance.

Metadata as an Evolution of IDL

My five-second synopsis of metadata is that it's IDL and type libraries on steroids. In the past, if you needed to invoke code in another process (or across a wire to another machine), you needed to supply fairly detailed information about the function, including the parameters, their types, and whether they were in or out parameters. This was necessary for the system to marshal the information between the two processes. The means by which you specified this information was IDL. The Microsoft program that processes IDL files is MIDL.EXE.

Prototyping functions and interfaces in IDL can be frustrating. The MIDL compiler is notoriously finicky. In addition, you often end up duplicating the same information in whatever language you're using. Many programming language concepts don't translate to IDL easily or at all. Metadata frees you from these hassles. Simply declare your code and types in a .NET-compatible compiler, and the metadata will expose all the relevant information in a multilanguage manner.

At some point, Microsoft realized that the information contained in IDL files was sufficiently detailed to support COM Automation. Automation is the ability of a program to learn about the methods and properties of an object at runtime. With this information, the program can invoke methods on the object, and read and write the object's properties. A key to COM Automation is a type library. The MIDL compiler reads an .IDL file and emits a type library (typically a .TLB file). A type library is essentially just a binary version of the human-readable .IDL files.

Metadata is sufficiently detailed to allow the functional equivalent of COM Automation. That is, it provides the ability to drive another component's code at runtime. This capability is provided through the Reflection API, which I'll look at later. The Reflection API uses metadata in much the same way as COM Automation uses type libraries.
Java language users may be aware that Java also has a Reflection API.

Earlier, I mentioned that Visual Basic reads in type libraries to make the objects and interfaces they expose available to your code. A cool side effect of this is that the Visual Basic IDE has IntelliSense®. I've grown highly accustomed to letting the IDE pop up a list of usable methods or properties, tied to the specific object I'm working with. With metadata, this capability could be made available in any .NET language. Think about it: I could write a Visual Basic class, and by adding a simple "#using <xxx.dll>" line in my C++ code, I'd have IntelliSense for the Visual Basic-based component.

Metadata, with its ease of importation and features such as IntelliSense, should make cross-language programming extremely simple for programmers like myself who are currently put off by the pain of hooking together DLLs written in different languages. Before .NET, connecting cross-language components was an onerous task. One approach was to call exported DLL functions (assuming your language supported exporting functions), and keep your parameters extremely simple. Alternatively, you could expose your code as COM interfaces, requiring you to immerse yourself in all the arcane rules of COM or ATL, or both.

Drilling into the Metadata

Having seen at a high level what metadata is, and what it enables, let's burrow down and see what it looks like. The first questions I had when I encountered metadata were "Where is it stored?" followed shortly by "What's the file format?" The short answer is that metadata is stored in the Portable Executable (PE) file as read-only data. For you PE file nerds, the metadata seems to always be merged into another of the read-only sections, usually .rdata or .text. In the PE header, the data directory entry for the COM descriptor (IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR) points to an IMAGE_COR20_HEADER structure, and that structure's MetaData field gives the location and size of the metadata itself.
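Here's a minimal sketch of how you could walk from the PE headers to the metadata of a memory-mapped file, using the ImageHlp RVA helpers (FindMetaData is just an illustrative helper name, and error handling is kept to a minimum):

    #include <windows.h>
    #include <imagehlp.h>   // ImageNtHeader, ImageRvaToVa
    #include <corhdr.h>     // IMAGE_COR20_HEADER
    #pragma comment( lib, "imagehlp.lib" )

    // Given the base address of a memory-mapped PE file, return a pointer
    // to its .NET metadata (which begins with the "BSJB" signature), or 0
    // if the file isn't a .NET image.
    void * FindMetaData( PVOID pMappedBase )
    {
        PIMAGE_NT_HEADERS pNtHeaders = ImageNtHeader( pMappedBase );
        if ( !pNtHeaders )
            return 0;

        // DataDirectory slot 14 (IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR)
        // points to the .NET header, IMAGE_COR20_HEADER.
        IMAGE_DATA_DIRECTORY & comDir = pNtHeaders->OptionalHeader.
                    DataDirectory[ IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR ];
        if ( 0 == comDir.VirtualAddress )
            return 0;

        IMAGE_COR20_HEADER * pCorHdr = (IMAGE_COR20_HEADER *)
            ImageRvaToVa( pNtHeaders, pMappedBase, comDir.VirtualAddress, 0 );
        if ( !pCorHdr )
            return 0;

        // The MetaData field holds the RVA and size of the metadata proper.
        return ImageRvaToVa( pNtHeaders, pMappedBase,
                             pCorHdr->MetaData.VirtualAddress, 0 );
    }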
Before I get too far, it's important to mention that this article won't cover emitting metadata. Just as there are two APIs that read the metadata, there are two APIs for writing metadata. However, these APIs are intended for hardcore compiler writers and tools vendors. If you need to emit metadata, you'll find lots of gory details in the Microsoft documentation, so go wild! Here I'm focused on the much simpler details of reading existing metadata from an assembly.

The Metadata Hierarchy

Figure 1 shows a simplified layout of the information contained in metadata. Starting at the root of the metadata hierarchy, you'll find the assembly. An assembly is the starting point from which metadata is imported. Each assembly contains one or more modules, although most assemblies have just one module. Each module contains a set of zero or more types. A module can also contain global methods. If you compile regular C++ functions that aren't members of any class whatsoever, they will appear in the metadata as global methods. Each module's metadata also includes a unique GUID, known as a Module Version ID (MVID), which changes each time the module is rebuilt.
Skipping global methods for the moment, the next logical step down in the metadata hierarchy is types. A typical use of a type from C++, Visual Basic, or the Java language is to represent a class. Other types include enumerations and interfaces. A .NET interface is a logically related set of methods that is not bound to any single implementation. Essentially, they're the same thing as COM interfaces. If it helps, you can mentally substitute the term class for type in the description that follows.

The Two Metadata APIs

The .NET common language runtime provides two different APIs for reading and writing metadata. As mentioned previously, this article only covers the read interfaces. The first of the APIs is a set of unmanaged COM interfaces, while the second API uses the .NET common language runtime and is called the Reflection API. For the sake of brevity, I'll refer to the unmanaged COM interfaces for reading metadata as the unmanaged API.

No matter which API you use, you'll notice an immediate difference from the ITypeLib and ITypeInfo interfaces used to read COM type libraries. With COM, these interfaces served up most of the information in structures. You called the appropriate method, and got back a pointer to a structure with all the details filled in. The FUNCDESC structure in OAIDL.H is a great example of information exposed via ITypeInfo. It contains most of the information available about a given method. In contrast, the unmanaged metadata APIs rarely give you back a structure filled with information. Instead, you get the information in many small pieces. The unmanaged APIs usually take many parameters, with each parameter representing a single piece of information to be retrieved (for instance, the name of a property). The Reflection APIs are similar and use lots of methods and properties, each representing an individual piece of information.

The unmanaged API is lower level and provides more information. It doesn't require the .NET common language runtime; it only requires that MSCORWKS.DLL be installed correctly on the system. It reads the metadata directly from the assembly file. The Reflection API uses the .NET common language runtime and is built atop the unmanaged API.

Using the Unmanaged Metadata Interfaces

For most people, the most useful interface in the unmanaged API is IMetaDataImport. Alternatively, if you're interested in assembly-level details, there's IMetaDataAssemblyImport. Both are defined in COR.H. So, how do you get one of these puppies?

The key to the metadata kingdom is another interface, IMetaDataDispenser. This is literally a dispenser for all types of metadata interfaces, regardless of whether you're reading or writing metadata, and whether you're working with metadata in a file or in memory. There's also an IMetaDataDispenserEx interface derived from IMetaDataDispenser, but its additional functionality isn't important for this article. Getting an IMetaDataDispenser is as easy as calling CoCreateInstance:
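Something along these lines does the trick (error handling is minimal, and the assembly name passed to OpenScope is just a placeholder):

    #include <windows.h>
    #include <cor.h>    // metadata interfaces, dispenser CLSID and IIDs

    int main()
    {
        CoInitialize( 0 );

        // Ask COM for the metadata dispenser.
        IMetaDataDispenser * pDispenser = 0;
        HRESULT hr = CoCreateInstance( CLSID_CorMetaDataDispenser, 0,
                                       CLSCTX_INPROC_SERVER,
                                       IID_IMetaDataDispenser,
                                       (void **)&pDispenser );

        // Open the metadata "scope" for an assembly file, asking for an
        // IMetaDataImport interface on it.
        IMetaDataImport * pImport = 0;
        if ( SUCCEEDED(hr) )
            hr = pDispenser->OpenScope( L"SomeAssembly.dll", ofRead,
                                        IID_IMetaDataImport,
                                        (IUnknown **)&pImport );

        // ... use pImport here ...

        if ( pImport )    pImport->Release();
        if ( pDispenser ) pDispenser->Release();
        CoUninitialize();
        return 0;
    }

Note that OpenScope hands back whichever metadata interface you ask for via its riid parameter, so the same call pattern also yields an IMetaDataAssemblyImport.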
Most of the collections in the metadata (types, methods, fields, and so on) have an EnumXXX method that returns an array of tokens. Each token represents a single instance of something (a type, a method, a field, and so on). You can pass that token to the corresponding GetXXXProps method to obtain all the information about what the token represents. Figure 2 shows the IMetaDataImport token types, the method that enumerates each, and the method that gets the associated properties.

Let's walk through an example to see how you'd use these IMetaDataImport methods to display the names of all the types and their members within an assembly. For each type, this example will show the names of all the methods and fields, as well as the parameter names of each method. In the steps that follow, all the methods are from the IMetaDataImport interface.

To begin, call ::EnumTypeDefs. The first parameter is the address of an HCORENUM. It's not obvious, but you must initialize this value to zero before using it. Remember to pass the HCORENUM to ::CloseEnum when you're done with the enumeration. Upon return, EnumTypeDefs will have filled in an array of mdTypeDefs, and told you how many entries are in the array. You then iterate through the array, passing each mdTypeDef to ::GetTypeDefProps. Among other things, this returns the name of the TypeDef.

You might notice that ::GetTypeDefProps takes a fair number of OUT parameters. If you're not interested in a particular OUT parameter's value, you can pass in 0. This comes in handy more often than you'd think. For instance, IMetaDataImport::GetPropertyProps takes 17 parameters, 14 of which are marked as OUT.

To enumerate the methods of a TypeDef, call ::EnumMethods. As before, an HCORENUM set to zero is required for input. Also required is an mdTypeDef. This tells ::EnumMethods which TypeDef it should return mdMethodDefs for. The output from ::EnumMethods is an array of mdMethodDefs. Each of these can be passed to ::GetMethodProps to get the name. The same pattern applies to the data fields. The only difference is that you use ::EnumFields and ::GetFieldProps, and that the tokens are mdFieldDefs.

Finally, to enumerate the parameters of a method, use the ::EnumParams method. Like all other EnumXXX methods, it takes an HCORENUM. The second parameter is an mdMethodDef. This indicates which method you want the parameters for. Upon return, you'll have an array of mdParamDefs that can be passed to ::GetParamProps to get their names. I'll mention again that a method isn't obligated to have formal parameter data assigned to it. The method's type signature (described shortly) is the only guaranteed way of determining the number of parameters and their types.

I learned the hard way how the ::EnumXXX methods return success or failure. If the enumeration succeeded and returned one or more tokens, the HRESULT is S_OK. If you screwed up the call, perhaps by passing bogus data, you'll get a reasonable error code. The part that tripped me up was when I called an enumeration method correctly, but there was no data to return. When this happens, the method returns S_FALSE (whose value is 1). If you use the SUCCEEDED macro to test for success, be sure to take into account that the enumeration may have successfully returned zero items.

In Figure 2 there are a few interesting tokens I haven't described. Perhaps the most interesting is the mdTypeRef.
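Before moving on to TypeRefs, here's what the enumeration pattern just described boils down to in code, a bare-bones sketch that prints each type name and its method names (most error handling omitted):

    #include <windows.h>
    #include <cor.h>
    #include <stdio.h>

    void DumpTypesAndMethods( IMetaDataImport * pImport )
    {
        HCORENUM  typeEnum = 0;             // must start out as zero
        mdTypeDef typeDefs[64];
        ULONG     cTypeDefs;

        while ( SUCCEEDED( pImport->EnumTypeDefs( &typeEnum, typeDefs,
                                                  64, &cTypeDefs ) )
                && cTypeDefs )              // S_FALSE with zero items ends it
        {
            for ( ULONG i = 0; i < cTypeDefs; i++ )
            {
                WCHAR szTypeName[512];
                ULONG cchTypeName;
                // Pass 0 for the OUT parameters we don't care about.
                pImport->GetTypeDefProps( typeDefs[i], szTypeName, 512,
                                          &cchTypeName, 0, 0 );
                wprintf( L"%s\n", szTypeName );

                HCORENUM    methodEnum = 0;
                mdMethodDef methods[64];
                ULONG       cMethods;
                while ( SUCCEEDED( pImport->EnumMethods( &methodEnum,
                                        typeDefs[i], methods, 64, &cMethods ) )
                        && cMethods )
                {
                    for ( ULONG j = 0; j < cMethods; j++ )
                    {
                        WCHAR szMethodName[512];
                        ULONG cchMethodName;
                        pImport->GetMethodProps( methods[j], 0, szMethodName,
                                                 512, &cchMethodName,
                                                 0, 0, 0, 0, 0 );
                        wprintf( L"    %s\n", szMethodName );
                    }
                }
                pImport->CloseEnum( methodEnum );
            }
        }
        pImport->CloseEnum( typeEnum );
    }

Fields and parameters follow exactly the same pattern, using ::EnumFields/::GetFieldProps and ::EnumParams/::GetParamProps.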
A TypeRef is conceptually like a TypeDef, the difference being that the type is defined in another module, either in the current assembly or in a different assembly altogether. In most cases, any place where an mdTypeDef is used, an mdTypeRef is also equally acceptable, although you can't have a TypeRef to a TypeDef in the same module. For every TypeRef token, a TypeDef exists in another module, either in the current assembly or a different assembly.

How can you find out more about the type that an mdTypeRef refers to? Try the ::ResolveTypeRef method. If called successfully (and it will fail sometimes), you'll get back an IMetaDataImport for the assembly containing the type referred to. In addition, you'll get back the mdTypeDef for the type within that assembly. With the new IMetaDataImport, you can call ::GetTypeDefProps. You should think of ::ResolveTypeRef as a sort of shortcut for calling IMetaDataDispenser::OpenScope on the imported assembly.

Metadata Type Signatures

When I first encountered metadata type signatures, I tried to avoid knowing too much about them. After all, they're not simple things like a human-readable string, an enum, or even a bitfield. Instead, type signatures are variable length blobs of data that require some pretty complex code to properly interpret. On the other hand, once you understand them, you can appreciate the complicated issues they're intended to solve. In this article, I'll simply call them type signatures, rather than the PCCOR_SIGNATURE name used in COR.H.

Type signatures are all about describing types. Not the TypeDef sort of type, but types in the sense of int, single, pointer to struct Foo, and so forth. If you've worked with type descriptions in debug information or COM type libraries, you're probably all too familiar with the gyrations in describing user-defined types. Like most debug formats, metadata type signatures start with a base set of easily representable types defined by the CorElementType enum in CORHDR.H. Figure 3 lists these simple types and how they're represented in a type signature.

Pretend that you've been assigned the job of encoding the parameters of a function call, along with its return value. You need it to be fast, small, and do the minimum amount of work. Niceties like parameter names or function names aren't necessary. Why would you want to do this? This is exactly the sort of thing a linker does to determine if a call to a function in one module matches the function's implementation in another. In this task, you'd want the following two functions to compare the same (that is, to have the same type signature):
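A pair like this will do; the function and parameter names are arbitrary, since only the return type and parameter types take part in the encoding:

    // Different names, identical shape: each takes a long and a double and
    // returns an int. Encoded purely by type, both boil down to the same
    // signature: <return type int> <long> <double>.
    int CalcTotal  ( long itemCount, double unitPrice );
    int Accumulate ( long ticks,     double weight );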
If you stick to these simple types, the difference between the type signature schema I used previously and the one .NET uses is very simple: the .NET type signature includes an additional byte at the beginning that describes the calling convention of the function:
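The recognized values come from the CorCallingConvention enum in CORHDR.H; an excerpt (see the header for the complete list, and note that flag bits such as the one marking an instance method can be OR'ed in as well):

    IMAGE_CEE_CS_CALLCONV_DEFAULT  = 0x0,   // the normal managed calling convention
    IMAGE_CEE_CS_CALLCONV_C        = 0x1,   // unmanaged __cdecl
    IMAGE_CEE_CS_CALLCONV_STDCALL  = 0x2,   // unmanaged __stdcall
    IMAGE_CEE_CS_CALLCONV_THISCALL = 0x3,   // unmanaged __thiscall
    IMAGE_CEE_CS_CALLCONV_FASTCALL = 0x4,   // unmanaged __fastcall
    IMAGE_CEE_CS_CALLCONV_VARARG   = 0x5,   // variable-argument methods

So for the two functions shown a moment ago, the full .NET signature begins with the calling convention byte, followed by the parameter count, the encoded return type, and then the encoded parameter types.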
What about more complicated types? Perhaps instead of a long parameter, you have a pointer to a long? Or, heaven forbid, an array of objects of type System.ComponentModel.Design.CurrencyEditor? The type signature encoding in .NET is equipped to handle more complicated types. Type modifiers form the basis for representing more complex types. A type modifier is a special token that's not one of the simple types. Instead, the type modifier is applied against the type token that immediately follows it in the type signature. The two most common type modifiers are pointers and arrays. Consider these two enum values from the CorElementType enum:
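The pointer and single-dimension array modifiers are representative (values as defined in CORHDR.H):

    ELEMENT_TYPE_PTR     = 0x0f,    // PTR <type>     : pointer to whatever type follows
    ELEMENT_TYPE_SZARRAY = 0x1d,    // SZARRAY <type> : single-dimension, zero-based array

A pointer to a long, for instance, is encoded as ELEMENT_TYPE_PTR followed by ELEMENT_TYPE_I4, while an array of some class type is ELEMENT_TYPE_SZARRAY followed by ELEMENT_TYPE_CLASS and a compressed metadata token identifying the class.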
The IMetaDataAssemblyImport Interface

In addition to IMetaDataImport, the unmanaged API also has the IMetaDataAssemblyImport interface. This interface provides information about the assembly manifest, rather than the types contained within. The first method worth looking at is ::GetAssemblyProps, which returns all sorts of strings such as the assembly name and description. It can also fill in an ASSEMBLYMETADATA structure with data, including version and build numbers.

Like IMetaDataImport, the IMetaDataAssemblyImport interface has a decent set of EnumXXX and GetXXXProps method pairs, as you can see in Figure 4. In addition, each of these pairings introduces yet another token type. I'll address this token madness in a subsequent section.

Looking through the enumerations and tokens for IMetaDataAssemblyImport, some are pretty obvious, while others are more obscure. For instance, an mdAssemblyRef is simply a reference to another assembly that the current assembly depends on. Conceptually, you can think of it as a .NET version of an imported DLL. The mdManifestResource token is also a simple concept to grasp. Manifest resources represent data such as bitmaps, strings, and other items. They're contained in files or other assemblies, but they don't use the Win32 resource format. If a manifest resource is separated out into a separate .resource file, it's referenced by an mdFile token. The ::EnumFiles method gives you a list of all such files. In fact, ::EnumFiles gives you a list of all the files and modules that make up the current assembly.

Metadata Tokens: Simpler than You Think

In running the sample program provided with this article (the Meta program, described later), you might notice something curious about the tokens. They're all DWORDs, and when displayed as hex values they have an easily discernible pattern to them. For example, all of the mdTypeDef tokens are of the form 0x02XXXXXX (for instance, 0x02000042). Likewise, all the mdMethodDef tokens look like 0x06XXXXXX.

In addition, if you look at all the mdTypeDefs in an assembly, you'll see that they're monotonically increasing: (0x02000002, 0x02000003, 0x02000004). This is by design. Take a peek at the CorTokenType enum in Figure 5 and it should start to become clearer. All of the token types use only the high byte of the DWORD value to indicate what type of token they are. That leaves the bottom three bytes free to represent specific instances of types, modules, files, member refs, and whatnot. These instances are sequentially numbered, starting at one, and are called Record IDs (RIDs). To generalize then, the layout of a token is:
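Roughly speaking, it breaks down like this (the helper macros are from CORHDR.H, and the example token is arbitrary):

    // A metadata token is a DWORD: the high byte is the token type, and
    // the low three bytes are the RID.
    //
    //    bits 31-24 : token type (a CorTokenType value, e.g. mdtTypeDef)
    //    bits 23-0  : RID, the 1-based record index
    //
    // CORHDR.H supplies macros for picking a token apart:
    mdToken token = 0x02000042;                 // an arbitrary example
    ULONG   type  = TypeFromToken( token );     // 0x02000000, i.e. mdtTypeDef
    ULONG   rid   = RidFromToken( token );      // 0x42, the record index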
When I covered type signatures previously, I described how tokens are used to represent parameters that are .NET types. Another place where you'll see these metadata tokens is in MSIL. For instance, the newobj instruction (opcode 0x73) is immediately followed by an mdMemberRef or mdMethodDef token for a constructor. Using the metadata for the constructor, the runtime can determine the class type to create.

A final note on tokens: in the .NET headers, you'll see an mdToken type. Each of the CorTokenTypes is an instance of this generic token. In many metadata APIs, two or more token types are equally valid. To use a prior example, mdTypeDefs and mdTypeRefs can both be returned as the base class for a class. In this case, the metadata method uses an mdToken parameter rather than a more specific token type.

The Reflection API

The .NET Reflection API provides a higher-level view of the same information exposed by the unmanaged API. If you're new to using the .NET classes, using the Reflection API can be somewhat disconcerting at first. Once you get into the swing of it, though, the Reflection API is surprisingly easy to use. Compared to the unmanaged metadata APIs, much of the tedious work is already done for you.

The Reflection API is a full-fledged member of the .NET classes. For instance, every .NET object derives from System.Object. One of the very few methods exposed by System.Object is ::GetType. The ::GetType method returns a pointer to a Type object, which is one of the main entry points into the Reflection API. Given a Type pointer for an object, you can query for the object's methods, fields, and even which assembly it comes from.

Since the Reflection API is built atop the unmanaged APIs, there are naturally many parallel concepts. In many cases, when the unmanaged API exposes a set of EnumXXX and GetXXXProps methods, there are equivalents in the Reflection API. The unmanaged EnumXXX method becomes a GetXXX call that returns an array of objects. The properties of these objects give you roughly the same information that you'd get from the unmanaged GetXXXProps method. Figure 6 maps the unmanaged API tokens to their equivalent Reflection API types and methods. Be aware that there is often more than one way to get to a reflection class such as a Type *. In Figure 6, I've listed only the most common or intuitive ways to get to the desired Reflection type.

You can infer from Figure 6 that the Reflection API represents the metadata in a more hierarchical fashion than the unmanaged API. Taking a few simplifying liberties, the metadata hierarchy as seen by the Reflection API looks like this:
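Sketched out, and leaving plenty of detail aside, it amounts to this (the member collections shown are the main ones; each hangs off the level above it):

    Assembly
        Module
            Type
                MethodInfo (with its ParameterInfo array)
                FieldInfo
                PropertyInfo
                EventInfo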
For me, the biggest convenience of the Reflection API is its use of managed arrays. As an example, to get the methods that a type provides, you simply call Type::GetMethods. It returns a managed array of MethodInfo pointers. To find out how many MethodInfo pointers are in the array, use the array's Length property. For instance:
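in Managed C++ it looks something like this (pType is assumed to be a Type * you obtained elsewhere, for example from an object's GetType call):

    // Grab the managed array of MethodInfo pointers for the type, then
    // walk it using the array's Length property.
    MethodInfo * pMethods[] = pType->GetMethods();

    for ( int i = 0; i < pMethods->Length; i++ )
        Console::WriteLine( pMethods[i]->Name );    // print each method's name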
Not all flag values are accessed in the Reflection API directly using methods, however. Comparing the CorMethodAttr enum to the previous methods, the following flags don't come through in the Reflection API (at least not directly):
Metadata Inspection

The primary tool that Microsoft provides for examining metadata is ILDASM, which is shown in Figure 8. ILDASM uses the unmanaged API to display the metadata in a hierarchical format. The name ILDASM comes from the fact that it's an MSIL disassembler. If you drill down to the methods and find the method you're interested in, you can double-click on its name to display its MSIL code in a new window.
For this article, I'm not interested in the disassembly capabilities of ILDASM. By running "ILDASM /?", you'll get a help screen that displays all of its command-line options. Among the options is the ability to filter which items are shown, using visibility as the criterion. Another option tells ILDASM to write its output to a file, rather than display it in a window.

Conclusion

.NET provides a whole host of new services and capabilities. Many of these features depend on accurate, complete, language-independent information about user and system code. The metadata provides that information. In addition, metadata offers a better and more usable way of representing this data, compared to older formats such as IDL and COM type libraries.

Microsoft provides APIs for reading and writing metadata. Both reading and writing metadata can be done via traditional unmanaged COM interfaces, or via the Reflection API provided by the .NET common language runtime. In this article, you've seen both the unmanaged and Reflection APIs at work. I've barely scratched the surface of the information accessible via metadata. I've learned quite a bit about how .NET works by carefully studying the metadata, and you can do the same.
For related articles see:
Microsoft .NET Framework Delivers the Platform for an Integrated, Service-Oriented Web
Sharp New Language: C# Offers the Power of C++ and Simplicity of Visual Basic
Matt Pietrek does advanced research for the NuMega Labs of Compuware Corporation, and is the author of several books. His Web site, at http://www.wheaty.net, has a FAQ page and information on previous columns and articles.
From the October 2000 issue of MSDN Magazine.