Now that you've seen the main structure of the __delayLoadHelper function,
let's look at the notification hook functions that permeate its code. As
__delayLoadHelper executes, it has provisions to call a user-supplied
notification function at particular times. The notifications occur on
these occasions: when the __delayLoadHelper function begins, before
calling LoadLibrary, before calling GetProcAddress, and when the
__delayLoadHelper function finishes processing.
If you provide a notification hook function, you can make it return values
that short-circuit some of the __delayLoadHelper code. For example, if
your hook returns a FARPROC address for the pre-GetProcAddress
notification, __delayLoadHelper will use your value rather than calling
GetProcAddress. Likewise, if your hook returns an HMODULE from the
pre-LoadLibrary notification, __delayLoadHelper bypasses the LoadLibrary
call that would ordinarily occur.
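The short-circuit behavior can be modeled in a few lines of portable C. This is not the code from DELAYIMP.LIB, just a simplified sketch. The names addr_t, load_info, the NOTE_ constants, fake_load_library, fake_get_proc, and helper_resolve are all hypothetical stand-ins for HMODULE/FARPROC, DelayLoadInfo, the dliXXX enums, LoadLibrary, GetProcAddress, and __delayLoadHelper. The sketch assumes that a zero return from the hook means "no override."

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these names come from DELAYIMP.H. */
typedef uintptr_t addr_t;   /* models both HMODULE and FARPROC values */
typedef struct { const char *dll_name; const char *proc_name; } load_info;
enum { NOTE_START, NOTE_PRE_LOAD_LIBRARY, NOTE_PRE_GET_PROC, NOTE_END };
typedef addr_t (*hook_fn)(unsigned note, const load_info *info);

hook_fn g_notify_hook = NULL;        /* no hook installed by default */
int g_load_calls = 0;                /* counts simulated LoadLibrary calls */
int g_getproc_calls = 0;             /* counts simulated GetProcAddress calls */

/* Simulated LoadLibrary: always "loads" the module at 0x1000. */
addr_t fake_load_library(const char *dll) {
    (void)dll;
    g_load_calls++;
    return 0x1000;
}

/* Simulated GetProcAddress: the "API" lives 0x10 bytes into the module. */
addr_t fake_get_proc(addr_t module, const char *name) {
    (void)name;
    g_getproc_calls++;
    return module + 0x10;
}

/* Calls the hook if one is installed; 0 means "no override". */
addr_t call_hook(unsigned note, const load_info *info) {
    return g_notify_hook ? g_notify_hook(note, info) : 0;
}

/* Simplified model of the helper's four notification points. */
addr_t helper_resolve(const load_info *info) {
    assert(info != NULL);
    call_hook(NOTE_START, info);

    addr_t module = call_hook(NOTE_PRE_LOAD_LIBRARY, info);
    if (module == 0)                 /* hook declined: load normally */
        module = fake_load_library(info->dll_name);

    addr_t proc = call_hook(NOTE_PRE_GET_PROC, info);
    if (proc == 0)                   /* hook declined: look up normally */
        proc = fake_get_proc(module, info->proc_name);

    call_hook(NOTE_END, info);
    return proc;
}

/* Example hook: overrides only the pre-GetProcAddress step. */
addr_t sample_hook(unsigned note, const load_info *info) {
    (void)info;
    return note == NOTE_PRE_GET_PROC ? (addr_t)0xBEEF : 0;
}
```

With sample_hook installed, helper_resolve still performs the simulated load but skips the simulated lookup and returns the hook's value instead.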
In the default case, notification hook functions won't actually be called,
because the function pointer used to call the notification hook
(__pfnDliNotifyHook) is NULL. However, you can write a notification
function in your code and override the function pointer to point at your
notification function. You'll see this in the sample program later.
The notification hook function is prototyped like this:
FARPROC WINAPI
PfnDliHook(unsigned dliNotify, PDelayLoadInfo pdli);
The dliNotify parameter is one of the dliXXX
enums specified in DELAYIMP.H. The pdli parameter is a pointer to a
DelayLoadInfo structure that is also declared in DELAYIMP.H. The
DelayLoadInfo structure contains everything you need to know about this
particular API, including the API and DLL names. The DelayLoadInfo
structure is constructed on the fly in the __delayLoadHelper function.
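To make the role of the structure concrete, here is a small sketch of what a hook might do with the API and DLL names it receives. The load_info and proc_ref types below are hypothetical, cut-down stand-ins and not the real DelayLoadInfo layout. The sketch assumes only that the hook can tell whether the API is imported by name or by ordinal, and describe_import is an illustrative helper of my own naming.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the DELAYIMP.H structures. */
typedef struct {
    int by_name;             /* nonzero: use proc_name, zero: use ordinal */
    const char *proc_name;   /* API name when imported by name */
    unsigned ordinal;        /* ordinal when imported by ordinal */
} proc_ref;

typedef struct {
    const char *dll_name;    /* name of the delay-loaded DLL */
    proc_ref proc;           /* which API in that DLL is being resolved */
} load_info;

/* Formats "DLL!Name" or "DLL!#ordinal" into buf, e.g. for logging
   from inside a notification hook. Returns snprintf's result. */
int describe_import(const load_info *info, char *buf, size_t len) {
    assert(info != NULL && buf != NULL);
    if (info->proc.by_name)
        return snprintf(buf, len, "%s!%s",
                        info->dll_name, info->proc.proc_name);
    return snprintf(buf, len, "%s!#%u",
                    info->dll_name, info->proc.ordinal);
}
```

A hook built along these lines could log each delay-loaded API as it is resolved and then return zero so that normal processing continues.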
Beyond normal notification hooks, the delay-load code can also call a second
function in the event of an error (for example, if the DLL wasn't
found). This error hook function takes exactly the same arguments as the
regular notification hook function. The only difference is that the
dliXXX enums indicate failure states (dliFailLoadLib or dliFailGetProc).
As with the notification hook functions, there is no default
implementation for the failure hooks. To provide one, write your own
failure function, then declare a global variable named
__pfnDliFailureHook that points to your failure function. The linker
will use your __pfnDliFailureHook variable rather than the one in
DELAYIMP.LIB, which contains a NULL pointer. By making your failure hook
function return appropriate values, you can recover from the failure
gracefully. For instance, if you receive a dliFailLoadLib failure
notification, you might prompt the user for the location of the DLL and
call LoadLibrary yourself. The resultant HMODULE would then be used as
the failure hook return value.
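The recovery path can be sketched in portable C as well. Everything in this block is a hypothetical model: primary_load always fails to simulate a missing DLL, g_failure_hook plays the role of the failure hook pointer, and recover_from_alternate_path pretends that a user-supplied location was loaded successfully. The value 0x2000 is an arbitrary fake module handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; not the real DELAYIMP.H declarations. */
typedef uintptr_t addr_t;    /* models an HMODULE value */
typedef struct { const char *dll_name; const char *proc_name; } load_info;
enum { FAIL_LOAD_LIB, FAIL_GET_PROC };
typedef addr_t (*failure_hook_fn)(unsigned note, const load_info *info);

failure_hook_fn g_failure_hook = NULL;   /* no failure hook by default */

/* Simulated LoadLibrary that never finds the DLL. */
addr_t primary_load(const char *dll) {
    (void)dll;
    return 0;
}

/* Example failure hook: on a load failure, pretend the DLL was found
   somewhere else and hand back its (fake) module handle. */
addr_t recover_from_alternate_path(unsigned note, const load_info *info) {
    assert(info != NULL);
    if (note == FAIL_LOAD_LIB)
        return 0x2000;
    return 0;                /* other failures are not handled here */
}

/* Model of the load step: try normally, then consult the failure hook.
   A zero result means the failure was not recovered. */
addr_t load_with_recovery(const load_info *info) {
    assert(info != NULL);
    addr_t module = primary_load(info->dll_name);
    if (module == 0 && g_failure_hook != NULL)
        module = g_failure_hook(FAIL_LOAD_LIB, info);
    return module;
}
```

Without a failure hook, the model reports the failure by returning zero. Once the hook is installed, its return value is used as the module handle and processing can continue.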
The Linker Gets into the Act
Earlier, I mentioned that when using /DELAYLOAD, the linker gets into the
code-generation business. However, everything I've described so far with
the __delayLoadHelper function and the hook functions doesn't require any
assistance or code from the linker. So what does the Visual Studio® 6.0
linker do that actually makes this whole thing possible?
When you use /DELAYLOAD on a given DLL, the linker generates two different
types of stubs. The first of these stubs is the "per-API" stub. The
linker generates one of these stubs for each API called in the imported
DLL. The linker assigns a name in the form of __imp_load_XXX for the
stub, where XXX is the API name. For example, the per-API stub for a call
to GetDesktopWindow looks like this:
__imp__load_GetDesktopWindow@0:
PUSH ECX
PUSH EDX
PUSH __imp__GetDesktopWindow@0
JMP __tailMerge_USER32
This small snippet of stub code is worth scrutinizing
carefully. The first two instructions (PUSH ECX and PUSH EDX) preserve
the values of the ECX and EDX registers on the stack. The next
instruction (PUSH __imp__GetDesktopWindow@0) pushes the address of the
pseudo IAT entry for the GetDesktopWindow function. When I described the
__delayLoadHelper code earlier, I mentioned that it patched the pseudo
IAT entry before returning. This PUSH instruction is where
__delayLoadHelper gets the location to patch. Before a delay load API is
called for the first time, the pseudo IAT entry points to this per-API
stub. This is how control gets to this stub rather than to the target
API.
The final instruction of a per-API stub jumps to the second type of
linker-generated stub, the "per-DLL" stub. The linker generates only one
of these for each delay-loaded DLL, and every per-API stub for that DLL
jumps to it. Thus, no matter how many functions you delay load from
USER32.DLL and COMCTL32.DLL, there will still only be two per-DLL stubs:
one for USER32.DLL and the other for COMCTL32.DLL. The linker names these
stubs __tailMerge_XXX, where XXX is the name of the DLL. For example, the
stub for USER32 looks like this:
__tailMerge_USER32:
PUSH __DELAY_IMPORT_DESCRIPTOR_USER32
CALL ___delayLoadHelper@8
POP EDX
POP ECX
JMP EAX
The first instruction of the per-DLL stub pushes the address of a data
structure that the linker has included elsewhere in the executable. This
struct is of type ImgDelayDescr, defined in DELAYIMP.H. The
ImgDelayDescr struct contains pointers to the DLL name, pointers to the
DLL's pseudo IAT and INT, and various other items needed by the
__delayLoadHelper function. This is the data structure pointed at by the
IMAGE_DIRECTORY_ENTRY_DELAY_IMPORT slot in the executable's DataDirectory.
Here's an important side note to people writing PE file utilities. All the
pointer values in an ImgDelayDescr are virtual addresses (that is,
normal linear addresses that can be used as pointers). The use of
virtual addresses is in sharp contrast to the relative virtual addresses
(RVAs) used by the IMAGE_IMPORT_DESCRIPTOR structure for regular
imports. This use of virtual addresses rather than RVAs is
unprecedented. I assume that this is because the notification hooks are
passed a pointer to the ImgDelayDescr, so it wouldn't do to have hook
implementors using RVAs.
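For a PE utility, the practical consequence is one extra subtraction. The sketch below assumes a 32-bit image and that the tool has already read the image's preferred load address (the ImageBase field of the optional header). va_to_rva is a hypothetical helper name, and the addresses in the usage note are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Converts a virtual address stored in a delay-load descriptor into an
   RVA by subtracting the image's preferred load address. The caller is
   assumed to pass an address that lies at or above the image base. */
uint32_t va_to_rva(uint32_t va, uint32_t image_base) {
    assert(va >= image_base);
    return va - image_base;
}
```

For example, with an image base of 0x00400000, a descriptor field holding 0x00403000 corresponds to RVA 0x3000, which the tool can then map to a file offset the same way it does for regular import data.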
The next instruction of the per-DLL stub makes the actual call to
__delayLoadHelper, which returns the address of the target API in the
EAX register. The next two stub instructions restore the EDX and ECX
registers that were put on the stack by the per-API stub. The last
instruction JMPs to the address in the EAX register (that is, the value
returned by the __delayLoadHelper function). Since __delayLoadHelper
patched the pseudo IAT entry, the per-API stub only executes once.
Subsequent calls go through the pseudo IAT directly to the target API.
Likewise, the per-DLL stub only executes once for each delay-loaded API.
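The patch-once behavior is easy to model with an ordinary C function pointer. This is only an analogy for the mechanism, not linker output: g_iat_slot plays the pseudo IAT entry, stub_for_api plays the per-API and per-DLL stubs combined, resolve_and_patch plays __delayLoadHelper, and real_api is a made-up target function. All of these names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*api_fn)(void);

int g_helper_calls = 0;            /* how often the "helper" ran */

/* Made-up target API for the demonstration. */
int real_api(void) { return 42; }

int stub_for_api(void);            /* forward declaration of the stub */

/* The pseudo IAT entry: initially points at the stub, not the API. */
api_fn g_iat_slot = stub_for_api;

/* Plays the helper: resolves the target, patches the slot so later
   calls bypass the stub, and returns the target's address. */
api_fn resolve_and_patch(api_fn *slot) {
    assert(slot != NULL);
    g_helper_calls++;
    *slot = real_api;
    return real_api;
}

/* Plays the stubs: asks the helper for the target, then transfers
   control to it (the call here stands in for JMP EAX). */
int stub_for_api(void) {
    api_fn target = resolve_and_patch(&g_iat_slot);
    return target();
}

/* What calling code does: an indirect call through the slot. */
int call_api(void) { return g_iat_slot(); }
```

The first call_api goes through the stub and the helper. Every later call reaches real_api directly, because the slot no longer points at the stub.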