Understanding Interface Definition Language: A Developer's Survival Guide--MSJ, August 1998

Copyright © Microsoft Corporation. This document is an archived reproduction of a version originally published by Microsoft. It may have slight formatting modifications for consistency and to improve readability.

Bill Hludzinski

IDL is the preferred way to describe your interfaces. However, many developers only have a rudimentary knowledge of IDL. Knowing IDL will help you think about your interfaces in a more explicit manner, which is especially useful now that you spend so much time exposing interfaces.

This article assumes you're familiar with C++, COM, and marshaling.

Bill Hludzinski is a software developer at Vivisoft Corporation (http://www.vivisoft.com) and can be reached at billh@vivisoft.com.

The age of IDL (Interface Definition Language) is upon us. IDL is the preferred way to describe your interfaces. However, many developers only have a rudimentary knowledge of IDL. They rely on the Visual C++ wizards to generate their IDL files, but don't have any idea what the code does. On the other hand, who wants to browse the MIDL Programmer's Guide in their spare time, trying to learn a new language with over 140 keywords? To meet your IDL needs, I am going to provide a basic survival guide that will show you what IDL is, when you need it, and the basics of using it. You'll find that knowing IDL will help you think about your interfaces in a more explicit manner, which is especially useful now that you spend so much of your time exposing interfaces.

Review

IDL was originally part of the Open Software Foundation's Distributed Computing Environment (DCE). It described function interfaces for Remote Procedure Calls (RPCs), so that a compiler could generate proxy and stub code that marshaled parameters between machines. MIDL is Microsoft's IDL compiler. In addition, Microsoft developed its own Object Definition Language (ODL), which included the dispinterface keyword for specifying IDispatch's logical interfaces. ODL scripts could be compiled by MKTYPLIB into type libraries (.TLB files), which could be used by Automation clients for early binding.

With Windows NT 4.0, Microsoft released MIDL 3.0, which merged the two languages by extending IDL (now called COM IDL) to include ODL's features. This enabled the MIDL compiler to generate type libraries as well as proxy and stub code for marshaling. For a discussion of the differences between the older ODL and the MIDL 3.0 COM IDL, see "Introducing Distributed COM and the New OLE Features in Windows NT 4.0" by Don Box (MSJ, May 1996). For a discussion of marshaling, IDL, and the tradeoffs between custom, dispatch, and dual interfaces, I highly recommend Don Box's December 1995 OLE Q&A column. Just remember that the article was written before Microsoft merged MIDL and MKTYPLIB.

So let's look at what MIDL does. By default, if you declare an interface in your IDL file, MIDL will take the declared functions and generate all of the files for an RPC interface. This includes a client proxy, a server stub, and a header file. Now, if you specify the [object] attribute in your interface attribute list, MIDL will instead generate all of the proxy and stub files for a COM interface. This includes the interface proxy file (name_p.c), the type definition header file (name.h), the interface GUID file (name_i.c), and the proxy DLL file (dlldata.c). And finally, for each library block that you define within the IDL file, MIDL will generate a type library (see Figure 1).

Figure 1 MIDL-generated Files

Before proceeding with a discussion of IDL's makeup, let's take a look at just where and when you need to use IDL. Even though it is always good practice to know how to specify your interfaces in IDL, different projects may not have a programmatic need for IDL.

The first reason you might need IDL is so that MIDL can generate the proxy/stub DLL for marshaling data across apartment boundaries. When you create an inproc server that matches the threading model of the creator's thread, you are not crossing any apartment boundaries so no marshaling will occur. In all other cases, such as out-of-process servers or inproc servers that do not have the same threading model as the caller, you need to marshal the data, thus you will always need a proxy or stub. However, you only need MIDL to generate your proxy and stub when you are exposing a custom interface via standard marshaling. You can't use MIDL with custom marshaling because you must implement your own proxy and stub from scratch, and you don't need MIDL with standard interfaces because the proxy and stub already exist as part of the standard marshaler.

The second reason you need IDL is to generate type libraries, which you will need when exposing dispinterfaces. Now even if you go through this exercise and realize that you need IDL, you still may not have to write any IDL because of the Visual C++ 5.0 wizards. The ATL wizards generate the IDL for your proxy, stub, and type library, though in MFC you are not so fortunate. If you create an MFC EXE Automation Server, it generates the ODL file for you to generate your type library (note that MIDL 3.0 can compile legacy ODL files with the /mktyplib203 option), but it will not generate the IDL file needed for a custom interface. Simply put, use ATL for custom and dual interfaces and save yourself a load of time. Figure 2 summarizes this information for out-of-process servers.

With the more general issues explained, I'll look at the most common keywords in IDL from the ground up, starting with the stuff that makes up basic RPC. I'll work my way back to the very COM-specific IDL that's generated for you by the ATL wizards.

Just Like C++!

It's easiest to understand IDL if you look at it from a historical perspective. For many years, interfaces between statically linked modules have been declared in C header files. These header files would include function declarations, as well as any supporting data type declarations. So when developers first thought of generating marshaling code for RPCs automatically, they made a code generator that could read these ubiquitous C header files. In other words, the IDL compiler acts like a C compiler that reads C-style data and function declarations, but instead of generating object code, it generates C source code for marshaling the data types across process boundaries. For example, given the following C declaration


 long AddValues(long val1,
                long val2);

the corresponding IDL file would look like this:


 long AddValues([in] long val1, [in] long val2);

So IDL is rooted in the data type declaration portion of C and C++. It supports all of the standard C++ data types as well as the data definition keywords that you've grown to know and love. More importantly, IDL's data types and definitions are both language-neutral and platform-neutral.

Some of the base data types supported by IDL are int, boolean, byte, char, double, float, long, short, and void *. IDL also supports the signed and unsigned qualifiers, the const type modifier, and the wchar_t predefined type. As far as data definition keywords go, IDL supports enum, typedef, union, and struct. IDL even supports the comment tokens and the use of preprocessor directives like #define and include (which doesn't use the # prefix). When writing IDL for an API, developers will just take their header files, comment them out, include them with the IDL file, and start uncommenting the functions. IDL and C/C++ really are that close.

Code Generation

After running an IDL file through MIDL and generating proxy/stub source, the IDL file and the resulting source will look similar in some ways. But the MIDL compiler's preprocessor will eliminate the IDL comments and expand the #define statements while processing the IDL file so that they don't appear in the generated C code. If you need to put untouched #define directives, comments, or any C/C++ syntax into the generated code, a handy IDL keyword is cpp_quote, which allows you to place any text directly into the generated header file (including comments if you'd like). So the following IDL line


 cpp_quote("#define UNICODE")

would appear in the generated code like this:


 #define UNICODE

Much of the time your IDL file will need to see data type definitions from other header files and even other IDL or ODL files to process base data types. In these cases, you will have to use the import directive, which actually includes the file in the MIDL processing. The include directive just makes files available to the source code from which the proxy/stub code is compiled. It's important to remember that any function declarations in imported files are ignored, and that in the resulting source code, the generated header file will not directly contain the imported types. Instead an #include directive is generated for the header corresponding to the imported interface. The following import statement


 import "moose.idl";

would be generated in the source code as:


 #include "moose.h"

Since I'm discussing the importing of files, it's probably a good time to mention the standard IDL include files. wtypes.idl, unknwn.idl, objidl.idl, and oaidl.idl ship with Visual C++ and Visual J++. You can find them in their respective include directories. These files document the standard C library of the COM era in one place. They include the data types, #defines, and interfaces that have been defined (and in many cases implemented) by Microsoft for supporting COM.

Of course, Visual C++ ships with even more IDL files for things like ActiveX controls and ActiveX documents, but these four files are the fundamental ones to include in your own interfaces. Figure 3 lists most of the IDL files that ship with Visual C++ 5.0 in order of dependency. The most dependent ones are at the bottom. One of the best ways that you can familiarize yourself with IDL is by browsing through these files.

Your First Attributes

In addition to specifying the basic data types for marshaling, there are many more qualities that IDL can specify about your interfaces that aren't supported by regular C-style syntax. These annotations are formally referred to as attributes. Attributes are grouped together in a comma-delimited list surrounded by brackets, and they always precede whatever object they are describing.

In the following function declaration, there are separate attributes listed for each of the parameters in the function call, with two attributes listed for the third parameter:


 void fx([in] long l1, [out] long *pl2,
         [in, out] long *pl3);

In this case, the attributes are all applied to the arguments of this function. However, attributes are not limited to arguments alone. They can be applied to other things such as functions or libraries. [in] and [out] are two of the fundamental attributes that IDL makes available.

Any data that is to be marshaled remotely must be placed into a data packet and transmitted across the network, so the actual transfer of function parameters can be an expensive operation. To save work, IDL supplies the [in] and [out] attributes for explicitly describing which way parameters need to be copied. A parameter specified as an [in] parameter will only be copied from the client to the server. Likewise, a parameter specified as an [out] parameter will only be copied from the server back to the client. A parameter specified as [in,out] will be copied in both directions. If inbound data is to be modified and returned to the caller, it is done by passing in a pointer to where the resulting data is to be placed. For this reason, [out] parameters must always be pointers.
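
The copy directions can be pictured with a hand-written stand-in for the generated proxy and stub. The following is only a sketch in plain C: the names server_fx and proxy_fx, the two-slot request and reply packets, and the arithmetic inside the server function are all assumptions made for illustration, not what MIDL actually emits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical server-side implementation of
   void fx([in] long l1, [out] long *pl2, [in, out] long *pl3); */
static void server_fx(long l1, long *pl2, long *pl3)
{
    *pl2 = l1 * 2;       /* [out]: only ever written by the server  */
    *pl3 = *pl3 + l1;    /* [in, out]: read first, then overwritten */
}

/* Hypothetical proxy: copies only what each attribute requires. */
static void proxy_fx(long l1, long *pl2, long *pl3)
{
    long request[2];     /* client -> server: l1 and *pl3  */
    long reply[2];       /* server -> client: *pl2 and *pl3 */
    long srv_l1, srv_l2, srv_l3;

    /* [out] and [in, out] parameters must be valid pointers. */
    assert(pl2 != NULL && pl3 != NULL);

    request[0] = l1;     /* [in]      */
    request[1] = *pl3;   /* [in, out] */

    /* "Stub" side: unpack into the server's own storage. */
    srv_l1 = request[0];
    srv_l3 = request[1];
    srv_l2 = 0;          /* the caller's *pl2 was never transmitted */
    server_fx(srv_l1, &srv_l2, &srv_l3);
    reply[0] = srv_l2;   /* [out]     */
    reply[1] = srv_l3;   /* [in, out] */

    /* Back on the client: copy the results out. */
    *pl2 = reply[0];
    *pl3 = reply[1];
}
```

Notice that the value the caller had in *pl2 never appears in the request, and l1 never appears in the reply. That is the saving the two attributes buy.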

This brings me to the next item at hand, marshaling pointers and how they differ from your typical pass-by-value arguments. For example, an integer is marshaled by simply copying the value from the address space of the calling process to the address space of the destination process. But a pointer to a string would be marshaled by copying the data that the pointer points to, not the pointer. For out-of-process calls to be transparent, whatever exists in the client's address space at the time of the call must be recreated in the server's address space.

Take a look at the following IDL example:


 long fx([in] const long *pval);

If a client called this function, passing the address of a variable whose value was 17, the value 17 would be sent across the network from the client's proxy to the server's stub. On the server, the stub code would copy the value 17 into the server's address space and call the server function with a pointer that pointed to that value (see Figure 4).

Figure 4 Marshaling a Pointer

Now imagine that the caller invoked the function with a null pointer instead of a valid pointer. The proxy would need a special tag for null pointers that it passed to the server. The server stub would then translate that token back to a null pointer on the server side and call the server function with the null pointer.

It is common for functions to disallow null pointers as parameters, checking the pointers to see if they are null, and if so, returning with an error code. That's fine for in-process code, but when performing RPC across a network, these round-trips are much too expensive for frequent use. So somebody thought of a way to prevent this all-too-common occurrence by making the pointer declarations more explicit. If you could disallow null pointers ahead of time, the marshaling code could catch illegal null values and return an error code before any data has to be sent across the wire. This pointer, called a reference pointer, is the simplest type of pointer in IDL. It must always reference a valid address; if it is null, the marshaling code will return with an error. To indicate that a pointer is a reference pointer, you must use the [ref] attribute.


 // pval cannot be null
 long fx([in, ref] const long *pval);

Pointers that can be null are called unique pointers and are indicated by the [unique] attribute.


 // pval can be null
 long fx([in, unique] const long *pval);
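
To make the difference concrete, here is a rough sketch in C of what marshaling code has to do for each kind of pointer. The function names, the one-byte null tag, and the error codes are assumptions chosen for this illustration; the real wire format used by the MIDL-generated code is different.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { MARSHAL_OK = 0, MARSHAL_NULL_REF = -1 };

/* [unique]: a one-byte tag says whether a value follows.
   Returns the number of bytes written to buf. */
static size_t marshal_unique_long(const long *p, unsigned char *buf)
{
    assert(buf != NULL);
    buf[0] = (unsigned char)(p != NULL);
    if (p == NULL)
        return 1;                      /* only the null tag is sent */
    memcpy(buf + 1, p, sizeof *p);     /* copy the pointee, not the pointer */
    return 1 + sizeof *p;
}

/* [ref]: a null pointer is rejected before anything is sent. */
static int marshal_ref_long(const long *p, unsigned char *buf, size_t *written)
{
    assert(buf != NULL && written != NULL);
    if (p == NULL) {
        *written = 0;
        return MARSHAL_NULL_REF;
    }
    memcpy(buf, p, sizeof *p);
    *written = sizeof *p;
    return MARSHAL_OK;
}

/* Stub side for [unique]: rebuild the pointer in server-owned storage. */
static const long *unmarshal_unique_long(const unsigned char *buf, long *storage)
{
    assert(buf != NULL && storage != NULL);
    if (buf[0] == 0)
        return NULL;                   /* translate the tag back to null */
    memcpy(storage, buf + 1, sizeof *storage);
    return storage;
}
```

The reference-pointer path has no tag to write and no branch on the server side, which is why it is the cheapest of the pointer types.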

There is another characteristic of pointers that marshaling must emulate as well: duplication. If you look at the following IDL example, you will notice that the function takes two pointers:


 long fx([in] long *pval1, [in] long *pval2);

Now consider the following client code:


 long l1 = 10;
 long l2 = fx(&l1, &l1);

This code is noteworthy because it passes the same pointer twice in the same function invocation. This means that the value pointed to by both pointers will be passed across the network twice, once for each pointer. When the stub unmarshals the parameters on the server side, it will allocate two distinct memory blocks, one for each value, and set each pointer to point to its value's newly allocated memory. So on the server side, even though both pointers will point to memory with a value of 10, they will be pointing to different memory locations, unlike on the client side.

This side effect can result in erroneous execution if the behavior of the server code actually takes into account the equivalence of the two pointers. IDL can handle this through the use of full pointers. A full pointer comes closest to fully simulating C pointers because its interface marshaler performs duplicate pointer detection and will make sure that duplicate pointers are unmarshaled as duplicate pointers, pointing to one memory location instead of two. A full pointer may be used by declaring a pointer with the [ptr] attribute:


 long fx([in, ptr] long *pval1,
         [in, ptr] long *pval2);

It's important to note that full pointers incur a significant performance penalty searching for duplicate pointers, so only use them in place of reference pointers or unique pointers when the semantics of the function take duplicate pointers into account.
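
The following C sketch contrasts the two behaviors. Everything in it is hypothetical (the ServerHeap and PtrTable types, the fixed MAX_PTRS limit, and both function names); it is meant only to show why duplicate detection preserves aliasing and where the extra search cost comes from, not how the real marshaler is written.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PTRS 8   /* arbitrary limit for this sketch */

/* Server-side storage that the imaginary stub unmarshals into. */
typedef struct {
    long   values[MAX_PTRS];
    size_t count;
} ServerHeap;

/* Table of client addresses already seen, for duplicate detection. */
typedef struct {
    const long *client[MAX_PTRS];
    long       *server[MAX_PTRS];
    size_t      count;
} PtrTable;

/* No duplicate detection: every pointer gets its own copy. */
static long *unmarshal_plain(ServerHeap *heap, const long *client_ptr)
{
    long *slot;
    assert(heap->count < MAX_PTRS);
    slot = &heap->values[heap->count++];
    *slot = *client_ptr;
    return slot;
}

/* Full-pointer style: reuse the server copy when the same client
   address shows up again. The linear search is the extra cost. */
static long *unmarshal_full(ServerHeap *heap, PtrTable *seen,
                            const long *client_ptr)
{
    size_t i;
    for (i = 0; i < seen->count; ++i)
        if (seen->client[i] == client_ptr)
            return seen->server[i];

    assert(seen->count < MAX_PTRS);
    seen->client[seen->count] = client_ptr;
    seen->server[seen->count] = unmarshal_plain(heap, client_ptr);
    return seen->server[seen->count++];
}
```

Calling unmarshal_plain twice with the same client address yields two different server addresses, while unmarshal_full returns the same one both times, so server code that compares the two pointers sees what the client intended.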

Believe it or not, the issue with pointers and memory management gets even more complex. When you pass [in] values into a function via a pointer, things are still simple. The caller will always allocate the memory, point the pointer to the memory, and put the value in that memory before calling the function. However, with an [out] parameter, things aren't so straightforward because the caller isn't providing a value, so you might be tempted to pass in a null pointer and have the callee allocate the memory.

This opens up a can of worms. How can the caller free the memory? This can be done by the COM task allocator which I'll talk about later. Suffice it to say that things are much easier when you depend on the caller to always allocate the memory for [out] parameters. For this reason, caller-allocated memory is the standard for [out] parameters in COM. This means that an [out] parameter should never be null, and hence should always be a reference pointer.

Things get even more interesting when user-defined types begin to contain pointers. Look at the following IDL:


 typedef struct tagELEMENT  {
     long lValue;
     [unique] struct tagELEMENT *pPrev;
     [unique] struct tagELEMENT *pNext;
 }   ELEMENT;

 void GetElementList([out] ELEMENT *pList);

This example contains pointers within a user-defined structure. These types of pointers are embedded pointers, whereas your typical pointer (which isn't embedded in a structure) is a top-level pointer. Let's say that the purpose of this function is to return a doubly linked list of elements to the caller.

First, let's assume that the caller acts appropriately and allocates memory for a single ELEMENT, then passes its address into the GetElementList function. Because there was only one element in the list, the caller left pNext and pPrev as null. This violates the rule about passing null pointers for [out] parameters because all allocation should be done by the caller. But if you think about it, how could the caller know up front how many elements to place in the list if that's what GetElementList is to return? Besides, you use null pointers to indicate the top and bottom of the list. This isn't what you want. With embedded pointers, you want them to be able to be null going in, and you want the function to be able to allocate the memory for the elements, setting up the list by assigning the pPrev and pNext pointers appropriately.

To handle this situation, embedded pointers are treated differently. It makes more sense for the callee to allocate the memory since the caller can never know how much to allocate. Thus, you use the [unique] attribute to allow embedded pointers to be null going in, and assigned to memory on the function return. However, for top-level pointers the old logic still applies, and they must still be made reference pointers. If you look at the previous code, you will notice that the pPrev and pNext pointers both have the [unique] attribute, yet the pList parameter in the GetElementList function call would still default to a reference pointer as an [out] parameter in the function declaration. This makes all of the pointers legal, but leaves a more serious problem.

Embedded pointers use callee-allocated memory. But if the callee allocates the memory, how can the caller free it after the function call? The caller doesn't know how much has been allocated, and even if she did know, there is still the issue that memory has been allocated on both the server and client side. The caller has no idea how to free memory on the server side. There needs to be some way that a callee can allocate memory that the caller can free—and there is.

This is where the COM task allocator comes to the rescue. The COM task allocator is a per-process memory allocator that is used precisely for allocating memory to be shared between processes on either side of an interface. The task allocator is an implementation of the IMalloc interface, but is typically used via the following three COM API convenience functions:


 void *CoTaskMemAlloc(ULONG cb);
 void *CoTaskMemRealloc(void *pv, ULONG cb);
 void CoTaskMemFree(void *pv);

When you're working with embedded pointers and linked user-defined types, you will be using callee-allocated memory, and these functions will help. Further discussion of the COM task allocator is beyond the scope of this article, but if you'd like to see it explained and put into use, refer to Don Box's OLE Q&A column in the October 1995 issue of MSJ.
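
As a rough illustration of the ownership rule, the C sketch below has the callee build a list with a shared allocator and the caller free it afterward. It assumes several things that are not part of the article: task_alloc and task_free are stand-ins that wrap malloc and free so the example builds on any platform (in real COM code they would be CoTaskMemAlloc and CoTaskMemFree), and FillElementList with its cExtra parameter is a hypothetical variation on GetElementList.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-ins for CoTaskMemAlloc / CoTaskMemFree (assumption). */
static void *task_alloc(size_t cb) { return malloc(cb); }
static void  task_free(void *pv)   { free(pv); }

typedef struct tagELEMENT {
    long lValue;
    struct tagELEMENT *pPrev;
    struct tagELEMENT *pNext;
} ELEMENT;

/* Callee: fills the caller-allocated head element, then allocates
   cExtra further elements with the shared allocator.
   Returns 0 on success, -1 if an allocation fails. */
static int FillElementList(ELEMENT *pList, long cExtra)
{
    ELEMENT *tail = pList;
    long i;

    assert(pList != NULL);   /* top-level [out] pointer: never null */
    pList->lValue = 0;
    pList->pPrev = NULL;
    pList->pNext = NULL;

    for (i = 1; i <= cExtra; ++i) {
        ELEMENT *node = (ELEMENT *)task_alloc(sizeof *node);
        if (node == NULL)
            return -1;       /* list built so far stays linked */
        node->lValue = i;
        node->pPrev = tail;
        node->pNext = NULL;
        tail->pNext = node;
        tail = node;
    }
    return 0;
}

/* Caller: frees every callee-allocated element, but not the head,
   which the caller owns. Returns the number of elements freed. */
static long FreeElementList(ELEMENT *pList)
{
    long freed = 0;
    ELEMENT *node;

    assert(pList != NULL);
    node = pList->pNext;
    while (node != NULL) {
        ELEMENT *next = node->pNext;
        task_free(node);
        node = next;
        ++freed;
    }
    pList->pNext = NULL;
    return freed;
}
```

The point of the pairing is that both sides agree on one allocator, so the caller can release memory it did not allocate without knowing how the callee obtained it.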

Strings and Arrays

Basically, marshaling a string is easy because of the [string] attribute. This attribute lets the marshaler know that a pointer points to a null-terminated character array. As a result, the marshaler knows how much data it has to copy to the server process. Even though the [string] attribute can be used for null-terminated arrays of bytes or chars, COM is based on 16-bit Unicode characters, so for all COM methods you will want to base your strings on OLECHAR type characters. OLECHAR is defined in wtypes.idl, and is a platform-independent typedef for wchar_t.


 void MyFunction([in, string] const OLECHAR *pwszName);

It is interesting to note that with the [string] attribute, the marshaler behaves as it should and will copy the data pointed to until it hits a null. Without the [string] attribute, the data pointed to would have been treated as a single OLECHAR and only two bytes would have been copied by the marshaler. This point is significant when discussing arrays. Without an attribute or some sort of qualifier telling MIDL otherwise, pointers are assumed to point to single instances of a data type.
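
The difference in how much gets copied can be sketched in a few lines of C. The OLECHAR16 typedef is an assumption standing in for a 16-bit OLECHAR so that the arithmetic is the same on every platform, and both function names are hypothetical. The counts also ignore any length headers that the real wire format adds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t OLECHAR16;   /* stand-in for a 16-bit OLECHAR */

/* With [string]: copy characters up to and including the null. */
static size_t string_wire_bytes(const OLECHAR16 *psz)
{
    size_t cch = 0;
    assert(psz != NULL);
    while (psz[cch] != 0)
        ++cch;
    return (cch + 1) * sizeof(OLECHAR16);
}

/* Without [string]: the pointer is taken to reference one element. */
static size_t single_wire_bytes(const OLECHAR16 *psz)
{
    assert(psz != NULL);
    return sizeof(OLECHAR16);
}
```

For the five-character string "Moose", the first function reports 12 bytes (five characters plus the terminator) and the second reports 2.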

The simplest form of an array can be passed by fixing its size at design time using C array syntax:


 void Fx([in] long alValues[4]);

This example passes a fixed array of four longs with which the marshaler can easily figure out how much data to copy to the server process. However, the most common case is where the size of the array will not be known until runtime. In this situation, IDL provides a series of attributes for specifying the array's size at compile time or runtime. These types of arrays are called conformant arrays, and the size of the array may be defined via the [size_is] attribute. Typically, you will use one of the other arguments in a function to specify the array size using [size_is].


 void Fx([in] long cItems,
         [in, size_is(cItems)] short aItems[]);

You may have noticed that I used C-style variable-length syntax for aItems; a pointer works just the same. The [max_is] attribute may be used in the same situations where you use [size_is]. [size_is] indicates the maximum number of elements in an array, while [max_is] indicates the maximum value for a valid array index. So if you use [max_is] on an array of size n, where the first array element starts at zero, then [max_is] would be set to the maximum valid index, which would be n–1. Both attributes also accept constants for the array size, but this is not recommended because it is slower than using a fixed array.

There is another case that I must cover: arrays used as [out] parameters. Imagine that the caller wants to pass in a caller-allocated array that is empty and have the callee fill it up with valid values. This can be done by simply passing in the array with [out] and [size_is] attributes, but that would be inefficient. What if the callee function uses only one-third of the elements in the list? The marshaler would still marshal the entire array back to the caller function's proxy. To get around this problem, IDL has the varying array, which will only transmit back to the caller the array elements that are being used. This is accomplished via the [length_is] attribute, which is most useful for [out] and [in, out] parameters. The number used by [length_is] to define the contents of an array is called the variance of the array.

The following example illustrates how the [size_is] and [length_is] attributes may be used together to explicitly specify how many array elements need to be marshaled.


 void Fx([in] long cMax,
         [in, out] long *pcUsed,
         [in,out,size_is(cMax),
         length_is(*pcUsed)] long *aValues);

On the way in, the [size_is] attribute lets the marshaler know that the array is cMax longs so the stub will allocate the required memory on the server. But the [length_is] attribute tells the marshaler that only *pcUsed longs need to be marshaled, so they are the only elements of the array that are actually transmitted to the server. Quite efficient!
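
A simplified C model of that behavior follows. The function names and the flat array used as the "wire" are assumptions for illustration, and zeroing the unused tail on the server is a choice made by this sketch rather than documented marshaler behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Client side: transmit only the first cUsed of cMax elements.
   Returns the number of longs placed on the wire (0 if the
   counts are inconsistent). */
static size_t marshal_varying(const long *aValues, long cMax, long cUsed,
                              long *wire)
{
    assert(aValues != NULL && wire != NULL);
    if (cUsed < 0 || cUsed > cMax)
        return 0;
    memcpy(wire, aValues, (size_t)cUsed * sizeof(long));
    return (size_t)cUsed;
}

/* Server side: the array has room for cMax elements (size_is),
   but only cUsed of them (length_is) arrive from the wire. */
static void unmarshal_varying(const long *wire, long cMax, long cUsed,
                              long *serverArray)
{
    assert(wire != NULL && serverArray != NULL);
    assert(cUsed >= 0 && cUsed <= cMax);
    memset(serverArray, 0, (size_t)cMax * sizeof(long));
    memcpy(serverArray, wire, (size_t)cUsed * sizeof(long));
}
```

With cMax set to 6 and *pcUsed set to 3, only three longs cross the wire even though the server still ends up with a six-element array to work in.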

The [size_is] attribute has [max_is] as its counterpart. They are the same except [max_is] specifies the array size by defining the maximum valid index. [length_is] and [last_is] have a similar relationship. They are the same except [last_is] specifies the number of elements used in the array by defining the last index used by an element in the array. Once again, an array specified as [length_is(n)] is equivalent to one specified as [last_is(n–1)]. To make things even more flexible, IDL also has a [first_is] attribute that can be used to define the index at which the array begins to be used. So in the following example, out of a 100-element array, only 11 elements are actually transmitted—elements 12 through 22.


 void Fx([in, size_is(100), first_is(12),
         last_is(22)] long *aValues);
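
The relationships between these attributes are simple index arithmetic, which the following C helpers spell out. The helper names are made up for this illustration, and zero-based indexing is assumed throughout.

```c
#include <assert.h>

/* size_is(n) describes the same array as max_is(n - 1). */
static long max_from_size(long size)
{
    assert(size > 0);
    return size - 1;
}

/* last_is = first_is + length_is - 1 */
static long last_from_length(long first, long length)
{
    assert(first >= 0 && length > 0);
    return first + length - 1;
}

/* Number of elements actually transmitted for a given window. */
static long transmitted_count(long first, long last)
{
    assert(first >= 0 && last >= first);
    return last - first + 1;
}
```

Plugging in the values from the example, first_is(12) and last_is(22) give 22 - 12 + 1 = 11 transmitted elements, and the 100-element array could equally have been declared with max_is(99).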

For another look at strings and arrays in IDL, check out Don Box's November 1996 ActiveX/COM column. He covers the techniques in greater detail, as well as multidimensional arrays and performance comparisons between the different array-passing techniques.

Type Libraries

The majority of what I've covered so far has been around since the inception of DCE RPC. When most people think of type libraries, they think of Automation. Everyone seems to think that type libraries are just for supplying IDispatch logical interface definitions to a client. Well, type libraries can do that, but they can also do much more. Type libraries can be used to describe vtable interfaces, data types, regular functions, COM components, and even DLL modules. Keep in mind that type libraries are the compiled metadata and IDL is the source.

When you use MIDL to compile your IDL into type libraries, the TypeLib Viewer in OLEVIEW is like your decompiler. The Visual Basic Object Browser does the same thing, but displays the interfaces in a less explicit manner. Figure 5 shows the IDL for the DAO type library from within the TypeLib Viewer. Here I opened dao350.dll because the type information was stored as a resource in the DLL, but in many cases the type library is available as a separate .TLB file. I don't know why OLEVIEW doesn't do this automatically, but it's exceptionally useful to associate the .OLB and .TLB extensions with OLEVIEW so that you can just click on a type library in Windows Explorer and see its IDL immediately. (Do a Find on *.TLB, right-click a .TLB file, select Open With, and choose C:\Program Files\DevStudio\Vc\Oleview.exe as the viewer.)

If you develop in Visual Basic, you live and die by type libraries. Especially when you are developing components in C++ and clients in Visual Basic, being able to see explicit definitions of the actual interface can save untold hours of work. When your COM object's interface doesn't work from Visual Basic, the interface has to change to accommodate Visual Basic.

Type libraries are repositories for metadata that describe interfaces and data types in a potentially rich manner. They are much more fundamental than Automation. As tools begin to take advantage of this interface metadata, development will become much easier because the metadata takes the burden off the developer and puts it on the tools. Instead of requiring programmatic access to services, the developer can take a declarative approach to accessing services, specifying an attribute instead of coding.

You can see this kind of functionality in Visual C++ 5.0 and Visual Basic 5.0. In Visual C++, you can use the MFC Class Wizard to add classes based on dispatch interfaces defined in type libraries instead of writing the classes yourself. The #import directive in Visual C++ imports type information from a type library and generates C++ classes for accessing those COM interfaces defined in the library.

Visual Basic is an even more sophisticated user of type libraries, and can take advantage of all that they have to offer. For instance, I regularly switch between Visual Basic and Visual C++. The Auto Quick Info feature in Visual Basic makes using Visual C++ seem tedious in comparison. Figure 6 shows how the type library information appears in Visual Basic.

MIDL does not create type libraries automatically. To instruct MIDL to make a type library, you must use the library statement. Like any other statement, this one must end with a semicolon. Following the library keyword is the name of the library and two curly braces, which contain everything meant for the type library. The library statement can have attributes, and even has one required attribute: [uuid]. UUIDs hail from the DCE RPC world and were the precursor to GUIDs. For all intents and purposes, they are the same. The [uuid] attribute specifies the GUID to uniquely identify the type library, and MIDL will not generate the library without this. You can use GUIDGEN to make one.

A type library can contain the following five categories of elements: typedefs, modules, interfaces, dispinterfaces, and coclasses. IDL has a statement for each, and they all appear within the braces of the library statement. IDL typedefs work just like C/C++ typedefs. All of the typedef information can go into a type library as long as the typedef appears within the library statement's braces.

The module statement is used to define a DLL module, including names and ordinals for exported functions. Like the library statement, a module has braces, and all function prototypes that appear within those braces are associated with that module. A module statement may also have several attributes. The first one, [dllname], is required and specifies the name of the DLL. All functions that are to be exported from a module also need to supply an [entry] attribute for the function. If given a number, the [entry] attribute assigns the function an ordinal entry point. If given a string, the [entry] attribute assigns the function a name as an entry point. For both modules and functions, the [helpstring] attribute allows you to associate a help string with that function or module.

Now you have enough information to make your first type library from scratch. Let's say you put the following IDL in a file called user.idl:


 [uuid(54674299-3A82-101B-8181-00AA003743D3)] library MyLib
 {
     typedef enum {
         btError       = 0x0010,
         btQuestion    = 0x0020,
         btWarning     = 0x0030,
         btInformation = 0x0040
     } BeepTypes;

     [dllname("USER32")]
     module MyUser32
     {
         [entry("MessageBeep"),
          helpstring("Makes the sound specified by btSound")]
         long _stdcall MessageBeep(BeepTypes btSound);
     };
 };

I've defined a type library called MyLib, which contains an enumeration definition and a DLL interface. The DLL's file name is USER32.DLL, but I am calling the module MyUser32. This module has one entry point: the MessageBeep function. Note that the typedef is within the library statement's braces, but not the module. You can compile it into a type library by running MIDL with the following command line:


 midl user.idl

After it's compiled, you can look at the type library with TypeLib Viewer. You can also import the type library into a Visual Basic-based project by choosing Project | References, pressing the Browse button, and selecting USER.TLB. After the reference is added and you look in the Object Browser, you will find the module name (in this case MyUser32) appearing in the classes pane on the left. In the members pane on the right, you will see the functions and types exposed for this module. In the bottom pane, you will see the function declaration complete with the help string that I supplied for the function (see Figure 7).

Figure 7 MyUser32 Info

As you learn how to use IDL to generate type information, you will begin to realize just how fully Visual Basic embraces it. Using Visual Basic, create a form, place a command button on it, and place the following code in the command button click handler:


 Private Sub Command1_Click()
     Dim Sound As BeepTypes
     Sound = btQuestion
     MessageBeep (Sound)
 End Sub

As you type in the code, you will notice that Auto List Members has picked up your type information and can offer you choices for the enumeration values (see Figure 8). When you type in the function call, you will see that Auto Quick Info lists the function's parameters and their data types for you (see Figure 9).

Figure 8 Auto List Members

Figure 9 Auto Quick Info

The name I gave to the parameter in MessageBeep was btSound. Visual Basic can automatically use this as a named parameter, so if the IDL declaration had more than one parameter


 long _stdcall MessageBeep(int iVolume,
                           BeepTypes btSound);

the Visual Basic code could make the call with the parameters out of order:


 Private Sub Command1_Click()
     MessageBeep btSound:=btQuestion, iVolume:=100
 End Sub

Integration also occurs with the [optional] attribute. If this attribute is applied to a parameter in the IDL file, Visual Basic treats the parameter as if it were declared with the Optional keyword.
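
As a rough sketch, an optional parameter might be declared as follows. The method and parameter names are hypothetical, and the sketch assumes the optional parameter is a VARIANT placed last in the list, which is the form Automation traditionally expects:


 // Hypothetical method: callers may omit vAltitude.
 HRESULT Hover([in] long lSeconds,
               [in, optional] VARIANT vAltitude);

Under that assumption, Visual Basic code should be able to call Hover with one argument or with two.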

Interfaces

The interface statement causes MIDL to generate proxy/stub source code for a remote interface. The interface statement is preceded by any attributes and followed by the interface name and curly braces. All of the elements of the interface must be within those braces. By default, MIDL will generate the files necessary for an RPC interface, but if the interface statement has the [object] attribute, MIDL will generate the code for a COM interface instead. Just like type libraries, an interface requires the [uuid] attribute to uniquely identify the interface. Without it, MIDL will not generate the proxy/stub code. The following is the complete IDL source to generate a simple one-function RPC interface:


 [uuid(348ACF20-C9B9-11d1-ABE5-966A46661731)]
 interface MyRPCInterface
 {
     void Fx(int iValue);
 }

COM interfaces have a few more requirements. Once the [object] attribute is specified, you are required to derive your interface from another interface using standard C++ syntax. All interfaces must derive somewhere from IUnknown. Another requirement is that all COM interface methods must return HRESULT so that all methods may return error values in response to network failure. Because IUnknown is not known to MIDL, you must import unknwn.idl, and you must import wtypes.idl for HRESULT. The following is the complete IDL source to generate a simple COM interface proxy and stub:


 [object, uuid(348ACF20-C9B9-11d1-ABE5-966A46661731)]
 interface IDerivedInterface : IUnknown
 {
     import "unknwn.idl";
     import "wtypes.idl";
     HRESULT Fx(int iValue);
 }

If you place an interface inside of a library, proxy/stub code will not be generated for it, though its type information will be included in the type library. Figure 10 shows the COM interface in the type library that I created earlier.

To generate networking code for an interface, just leave it out of the library statement. If you have a function defined in an interface that is outside the type library but you do not want to generate proxy/stub code for it, you can use the [local] attribute on that function to suppress the generation of networking code. This also frees the function from needing HRESULT as its return type.
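
One common arrangement, sketched here with hypothetical names and placeholder GUIDs (replace them with values from a GUID generator), defines the interface outside the library block so that proxy/stub code is generated, and then references it inside the block so that its type information also lands in the type library:


 import "unknwn.idl";

 // Defined outside the library: proxy/stub code is generated.
 [object, uuid(00000000-0000-0000-0000-000000000001)]  // placeholder
 interface IBeam : IUnknown
 {
     HRESULT Fire([in] long lPower);
 }

 [uuid(00000000-0000-0000-0000-000000000002)]          // placeholder
 library BeamLib
 {
     // Referenced inside the library: type information is emitted too.
     interface IBeam;
 };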

So now that you are defining interfaces, you can reference them as well. You can use the interface keyword with the typedef keyword to define an interface data type:


 typedef interface IStorage *LPSTORAGE;

You can even pass interface pointers as function parameters:


 HRESULT StoreData([in] IStorage *pstg);

However, sometimes you will not know the interface type at design time, so IDL provides support for dynamically typed interfaces with the [iid_is] parameter attribute:


 HRESULT CreateInstance([in] REFIID riid,
                       [out, iid_is(riid)] void **ppv);

In the code that called this method, the riid parameter would take the IID of the dynamically typed parameter:


 ICar *pCar = NULL;
 CreateInstance(IID_ICar, (void **)&pCar);

Coclasses

The coclass statement is used to define a component object and the interfaces that it supports. The coclass statement is similar to the interface statement, and also requires the [uuid] attribute (which will hold the object's CLSID). The object can have any number of interfaces and dispinterfaces listed in its body, specifying the full set of interfaces that the object implements, both incoming and outgoing. Here's a sample coclass:


 [uuid(0D248C00-CA6D-11d1-ABE5-8DDA2C299A21)] library MyLib2
 {
     [object,
     uuid(0D248C01-CA6D-11d1-ABE5-8DDA2C299A21)]
     interface IDerivedInterface : IUnknown
     {
         import "unknwn.idl";
         import "wtypes.idl";
         HRESULT Fx(int iValue);
     };

     [uuid(CE395E80-CA6C-11d1-ABE5-8DDA2C299A21)]
     coclass Drone {
         interface IDerivedInterface;
     };
 };

The [source] attribute can be used to signify that a member of a coclass (an interface, property, or method) is a source of events. On an interface listed in a coclass, it marks that interface as outgoing: the object calls it rather than implements it, and exposes it to clients through IConnectionPointContainer. With properties and methods, the [source] attribute means that the member returns an object or VARIANT that is a source of events. In this example, the IDerivedInterface interface is made an event source:


 [uuid(CE395E80-CA6C-11d1-ABE5-8DDA2C299A21)]
 coclass Drone {
     [source] interface IDerivedInterface;
 };
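
A coclass frequently lists an incoming interface alongside an outgoing one. The following sketch assumes a hypothetical ISaucer interface and a hypothetical DSaucerEvents dispinterface defined elsewhere in the same library; the GUID is a placeholder:


 [uuid(00000000-0000-0000-0000-000000000003)]  // placeholder
 coclass Saucer {
     interface ISaucer;                      // incoming: implemented by the object
     [source] dispinterface DSaucerEvents;   // outgoing: called by the object
 };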

Dispatch Interfaces

The dispinterface statement is used to create the type information for an IDispatch logical interface that will be executed via IDispatch::Invoke. Because it derives from IDispatch, you should import the Automation type library (stdole32.tlb) to bring in all of the types.

The dispinterface keyword is similar to the interface keyword in that it requires the [uuid] attribute and lists all of its methods within the braces following its name. However, these keywords differ because dispinterface has two sections inside the body of the definition: properties and methods. The properties section lists variable declarations of Automation-compatible types, and the methods section contains method declarations of Automation-compatible types. Every method and property in the dispinterface is also required to have the [id] attribute, which is used to assign a DISPID to each property and method.


 importlib("stdole32.tlb");

 [uuid(348ACF20-C9B9-11d1-ABE5-966A46661731)]
 dispinterface DFlyingSaucer
 {
     properties:
         [id(1)] long Altitude;
         [id(2)] long Speed;
     methods:
         [id(3)] HRESULT Land([in] long lDirection);
         [id(4)] HRESULT TakeOff();
 }

A much more important difference is that a dispinterface doesn't have a vtable. All dispinterface methods and properties are accessed by passing their DISPID to IDispatch::Invoke.

IDL also provides an easier way to define dual interfaces—by applying the [dual] attribute to the regular interface statement for an interface derived from IDispatch. Defining the interface this way defines the vtable, and then the dispinterface is taken from that same definition. Here's the same interface as in the previous example, but defined the new way:


 importlib("stdole32.tlb");

 [object, dual,
 uuid(348ACF20-C9B9-11d1-ABE5-966A46661731)]
 interface DIFlyingSaucer : IDispatch
 {
     [id(1), propget] HRESULT Altitude([out, retval] long *plAltitude);
     [id(1), propput] HRESULT Altitude([in] long lAltitude);
     [id(2), propget] HRESULT Speed([out, retval] long *plSpeed);
     [id(2), propput] HRESULT Speed([in] long lSpeed);
     [id(3)] HRESULT Land([in] long lDirection);
     [id(4)] HRESULT TakeOff();
 }

Note the naming convention: dispatch interfaces have a D prefix, dual interfaces have a DI prefix, and regular interfaces have an I prefix.

Because IDispatch is a standard interface, Microsoft provides a standard marshaler for marshaling parameters. This marshaler is special because it not only marshals dispinterfaces, but it can also marshal vtable interfaces that meet certain criteria. For this reason it is referred to as the Universal Marshaler or the type library marshaler. No other marshaler is needed if the type library is registered on both the client and server machines.

This marshaler can marshal vtable interfaces if they meet the following criteria: the interface uses Automation (IDispatch)-compatible data types (anything that can go into a VARIANT); the interface has a type library where the IDispatch marshaler can get the interface information; and the interface has the appropriate registry entries that identify its type library and that this interface is using the Universal Marshaler as its proxy and stub.

The payoff is significant. You don't need MIDL to generate the proxy and stub, so you don't need to declare the interfaces outside of the library block in the IDL file. However, the trick is getting the right entries into the registry. Fortunately, the attributes you give MIDL determine whether those entries are written when the type library is registered, but you have to follow closely.

Let's consider three scenarios: a dispinterface, a custom interface with the [dual] attribute, and a custom interface. Any interface that is declared in the file as a dispinterface is implicitly Automation-compatible and will automatically get the Universal Marshaler entries put in the registry. Specifying [dual] on a custom interface implies that the interface is compatible with Automation, and so both the custom interface and the dispinterface will use the Universal Marshaler and will get the entries placed in the registry.

A custom interface can use the marshaler and will get the entries placed in the registry if that interface is Automation-compatible and uses the [oleautomation] attribute. The [oleautomation] attribute is the key here; that attribute is what turns on the Universal Marshaler for a custom interface. So basically, the only interfaces for which you need MIDL to generate a proxy/stub DLL are custom interfaces that are not Automation-compatible.
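
As an illustration, a custom interface that opts into the Universal Marshaler might look like the following. The interface, method, and library names are hypothetical, the GUIDs are placeholders, and the sketch assumes that every parameter type is Automation-compatible:


 import "unknwn.idl";

 [uuid(00000000-0000-0000-0000-000000000004)]      // placeholder
 library SensorLib
 {
     importlib("stdole32.tlb");

     // Custom (vtable-only) interface marked for the Universal Marshaler.
     [object, oleautomation,
     uuid(00000000-0000-0000-0000-000000000005)]   // placeholder
     interface ISensor : IUnknown
     {
         HRESULT Read([in] long lChannel,
                      [out, retval] long *plValue);
     };
 };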

Conclusion

IDL can be a little confusing because of the overlap in marshaling and type information keywords (see Figure 11 for a summary of the fundamental IDL attributes). But if you're currently doing any COM-based programming, you need to learn IDL. An understanding of IDL will help you with marshaling issues and overall component design and integration. IDL is an integral part of COM development, and it also enables you to do some pretty cool things with sophisticated type-aware tools like Visual Basic. Learning IDL now will only serve to put you on a stronger foundation for the future.

From the August 1998 issue of Microsoft Systems Journal.