Don't
be concerned if you see the same DLL appear more than once in the list.
ModuleList treats each instance of a DLL with a different load address
as a separate instance. I intentionally kept the DLL names in
alphabetical order to make it easier to notice when this occurs.
When
I set out to write ModuleList, I had a seemingly simple goal: the
program should run on as many versions of Windows NT and Windows 9x
as possible. This would have been much easier if ModuleList didn't need
to run on Windows NT 4.0. However, as I write this column, Windows NT
4.0 is still in widespread use as a development platform.
In
an ideal world, I could just use the ToolHelp32 APIs, which provide all
the capabilities I need. Alas, ToolHelp32 doesn't exist on Windows NT
prior to version 5.0. This presents two problems. First, I can't
directly call any ToolHelp32 APIs. Second, a means of getting the
equivalent information for Windows NT 4.0 is needed. Luckily, PSAPI.DLL
(which I wrote about in my August 1996 column) comes to the rescue.
By
using a combination of ToolHelp32 and PSAPI.DLL APIs, I can access all
the information I need to build the DLL list. Unfortunately, I can't
just call the appropriate APIs directly based upon the operating system
ModuleList is running on. By calling the APIs directly, I'd create an
implicit reference to those APIs in the ModuleList executable. Since the
ToolHelp32 APIs aren't in Windows NT 4.0, and since PSAPI.DLL won't
load on Windows 9x, I'd create a program that wouldn't start on either Windows NT 4.0 or Windows 9x.
The
way to overcome this problem is to bite the bullet and go through the
drudgery of using GetProcAddress to obtain function pointers to the
appropriate APIs, based upon the underlying operating system. In Windows
NT 5.0, there's a new feature called Delay Load import descriptors that
sounds like it could solve this problem. That is, the operating system
lets you put off loading a DLL and hooking up to its APIs until you
actually call the API. However, since this feature doesn't exist in
Windows 9x and prior versions of Windows NT, it doesn't do you much good now.
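A minimal sketch of the GetProcAddress approach, for illustration only. It is not the code from Figure 2. The function name ToolHelp32Available and the typedef name are hypothetical, and the non-Windows branch exists only so the fragment compiles elsewhere. The API is named solely in a string, so the executable carries no import for it and will still load on a system where the export is missing.

```cpp
#include <cassert>
#include <cstddef>
#ifdef _WIN32
#include <windows.h>

// Pointer type matching the documented CreateToolhelp32Snapshot signature.
// (Hypothetical typedef name, chosen for this sketch.)
typedef HANDLE (WINAPI *PFN_CREATESNAPSHOT)(DWORD dwFlags, DWORD th32ProcessID);
#endif

// Returns true when KERNEL32.DLL exports CreateToolhelp32Snapshot.
// Because the API is looked up by name at run time, the linker never
// records an implicit reference to it.
bool ToolHelp32Available()
{
#ifdef _WIN32
    // KERNEL32.DLL is already mapped into every Win32 process.
    HMODULE hKernel32 = GetModuleHandleA("KERNEL32.DLL");
    if (hKernel32 == NULL)
        return false;

    PFN_CREATESNAPSHOT pfnCreateSnapshot =
        reinterpret_cast<PFN_CREATESNAPSHOT>(
            GetProcAddress(hKernel32, "CreateToolhelp32Snapshot"));

    // A NULL result means this version of Windows lacks ToolHelp32.
    return pfnCreateSnapshot != NULL;
#else
    return false;   // Not Windows: there is nothing to bind to.
#endif
}
```

The same pattern, repeated once per API, is what makes this approach drudgery: every function needs its own typedef, its own lookup, and its own NULL check.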
The ModuleList Code
The central point for the ModuleList code is ModuleList.CPP (see Figure 2).
Most of the code is boilerplate dialog code, which I won't waste time
on. Instead, let's focus on the PopulateTree function. After clearing
the contents of the tree view, the function resets global instances of a
ModuleList class and a ProcessIdToNameMap class. Next, PopulateTree
adds data to these class instances by calling the
PopulateModuleList_ToolHelp32 or PopulateModuleList_PSAPI functions as
necessary. Finally, PopulateTree walks through all the items of the
ModuleList class and adds the relevant information to the TreeView
control.
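The text doesn't spell out how PopulateTree decides which of the two functions to call. One plausible rule, sketched here as an assumption, keys off the platform ID and major version that GetVersionEx reports. The enum and function names are hypothetical. The values 1 and 2 mirror VER_PLATFORM_WIN32_WINDOWS and VER_PLATFORM_WIN32_NT.

```cpp
#include <cassert>

// Mirrors OSVERSIONINFO.dwPlatformId values from the Win32 headers.
enum Platform
{
    PLATFORM_WIN9X = 1,   // VER_PLATFORM_WIN32_WINDOWS
    PLATFORM_NT    = 2    // VER_PLATFORM_WIN32_NT
};

enum Backend
{
    BACKEND_TOOLHELP32,
    BACKEND_PSAPI
};

// Picks the enumeration strategy for a given operating system.
// ToolHelp32 is absent on Windows NT before 5.0, so only that case
// falls back to PSAPI.DLL; everything else uses ToolHelp32.
Backend ChooseBackend(Platform platform, unsigned majorVersion)
{
    if (platform == PLATFORM_NT && majorVersion < 5)
        return BACKEND_PSAPI;

    return BACKEND_TOOLHELP32;
}
```

For example, ChooseBackend(PLATFORM_NT, 4) selects PSAPI, while Windows 9x (which also reports major version 4) and Windows NT 5.0 both select ToolHelp32.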
Before
getting into how the ModuleList and ProcessIdToNameMap classes are
filled, let's first look at the classes themselves. All of the code for
the classes can be found in ModuleListClasses.H and
ModuleListClasses.CPP (see
Figure 2).
The ModuleList class is just a linked list-based container class for
ModuleInstance class instances. The ModuleList class has member
functions to add a new module, enumerate through all the
ModuleInstances, and look up a ModuleInstance given a base address
(HMODULE) and file name.
The
ModuleInstance class represents each loaded module that has a unique
filespec and load address. In addition to storing the HMODULE and
filespec, the ModuleInstance class also keeps a list of process IDs that
reference the module. There are methods to add a new process ID to the
list, enumerate through the process ID list, and retrieve the number of
referencing processes.
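The shape of these two classes might look roughly like the sketch below. It is a portable approximation, not the Figure 2 source: the member names are guesses, void* stands in for HMODULE, the filespec comparison is an exact string match, and the alphabetical ordering mentioned earlier is left out for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

typedef void*         ModuleHandle;   // stand-in for HMODULE
typedef unsigned long ProcessId;      // stand-in for a DWORD process ID

// One loaded module, identified by its load address plus filespec.
class ModuleInstance
{
public:
    ModuleInstance(ModuleHandle hModule, const std::string& fileSpec)
        : m_hModule(hModule), m_fileSpec(fileSpec), m_pNext(NULL) {}

    void AddProcessReference(ProcessId pid) { m_pids.push_back(pid); }
    std::size_t GetProcessReferenceCount() const { return m_pids.size(); }
    ProcessId GetProcessReference(std::size_t i) const
    {
        assert(i < m_pids.size());
        return m_pids[i];
    }

    ModuleHandle GetHModule() const { return m_hModule; }
    const std::string& GetFileSpec() const { return m_fileSpec; }

private:
    friend class ModuleList;          // the list manages m_pNext
    ModuleHandle           m_hModule;
    std::string            m_fileSpec;
    std::vector<ProcessId> m_pids;    // processes that have this module loaded
    ModuleInstance*        m_pNext;   // next node in the singly linked list
};

// Singly linked list of ModuleInstance nodes.
class ModuleList
{
public:
    ModuleList() : m_pHead(NULL) {}
    ~ModuleList()
    {
        while (m_pHead)
        {
            ModuleInstance* pNext = m_pHead->m_pNext;
            delete m_pHead;
            m_pHead = pNext;
        }
    }

    // Finds the instance matching both load address and filespec, or NULL.
    ModuleInstance* Lookup(ModuleHandle hModule, const std::string& fileSpec) const
    {
        for (ModuleInstance* p = m_pHead; p; p = p->m_pNext)
            if (p->m_hModule == hModule && p->m_fileSpec == fileSpec)
                return p;
        return NULL;
    }

    // Records that pid has the module loaded, creating the node on first sight.
    ModuleInstance* AddReference(ModuleHandle hModule, const std::string& fileSpec,
                                 ProcessId pid)
    {
        ModuleInstance* p = Lookup(hModule, fileSpec);
        if (!p)
        {
            p = new ModuleInstance(hModule, fileSpec);
            p->m_pNext = m_pHead;     // push onto the front of the list
            m_pHead = p;
        }
        p->AddProcessReference(pid);
        return p;
    }

    std::size_t GetCount() const
    {
        std::size_t n = 0;
        for (ModuleInstance* p = m_pHead; p; p = p->m_pNext)
            ++n;
        return n;
    }

private:
    ModuleList(const ModuleList&) = delete;            // nodes are owned; no copying
    ModuleList& operator=(const ModuleList&) = delete;
    ModuleInstance* m_pHead;
};
```

Because the lookup key includes the load address, the same DLL mapped at two different addresses produces two nodes, which is why one DLL name can appear more than once in the display.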
The
final member of this set of classes is the ProcessIdToNameMap class.
Its sole reason for existence is to translate a process ID into a
filespec for the executable associated with the process (in essence, the
process name). The implementation of this class is rather crude, using a
dynamically grown array and brute force scanning algorithms. Yes, using
the STL Map class would be more elegant. However, I still spend more
time wrestling with STL-induced compiler errors than I save by using the
STL in a simple program like this.
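To make "dynamically grown array and brute force scanning" concrete, here is a small sketch in that spirit. The member names, the starting capacity of 16, and the doubling policy are all assumptions made for illustration. Only the class name comes from the text.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Maps a process ID to the filespec of its executable.
class ProcessIdToNameMap
{
public:
    ProcessIdToNameMap() : m_entries(NULL), m_count(0), m_capacity(0) {}
    ~ProcessIdToNameMap() { delete[] m_entries; }

    void Add(unsigned long pid, const std::string& name)
    {
        if (m_count == m_capacity)
        {
            // Out of room: allocate a bigger array and copy everything over.
            std::size_t newCapacity = m_capacity ? m_capacity * 2 : 16;
            Entry* grown = new Entry[newCapacity];
            for (std::size_t i = 0; i < m_count; ++i)
                grown[i] = m_entries[i];
            delete[] m_entries;
            m_entries  = grown;
            m_capacity = newCapacity;
        }
        m_entries[m_count].pid  = pid;
        m_entries[m_count].name = name;
        ++m_count;
    }

    // Brute-force linear scan. Returns NULL if the process ID is unknown.
    const std::string* Lookup(unsigned long pid) const
    {
        for (std::size_t i = 0; i < m_count; ++i)
            if (m_entries[i].pid == pid)
                return &m_entries[i].name;
        return NULL;
    }

private:
    struct Entry
    {
        unsigned long pid;
        std::string   name;
    };

    ProcessIdToNameMap(const ProcessIdToNameMap&) = delete;
    ProcessIdToNameMap& operator=(const ProcessIdToNameMap&) = delete;

    Entry*      m_entries;
    std::size_t m_count;
    std::size_t m_capacity;
};
```

With only a few dozen processes on a typical system, the linear scan costs next to nothing, so the crude approach is a defensible trade.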
The
third and final source file from the ModuleList program is
ModuleListOSCode.CPP. This is where I isolated all the code that's
specific to a particular operating system. There are only two functions
in this module: PopulateModuleList_ToolHelp32 and PopulateModuleList_PSAPI.
Both functions take references to empty ModuleList and
ProcessIdToNameMap class instances and fill them up. Immediately
preceding both functions is a series of typedefs. I needed all these
typedefs so that I could use GetProcAddress and call the PSAPI and
ToolHelp32 APIs through function pointers, thereby avoiding an implicit
reference to the APIs.
The
PopulateModuleList_ToolHelp32 function looks up the addresses of the
five ToolHelp32 APIs it will use. It then creates a ToolHelp32 snapshot
of the process list. Using this snapshot, the function iterates through
each of the processes. At each stop, it creates a module list snapshot.
As the code enumerates through the module list snapshot, it fills in the
ModuleList and ProcessIdToNameMap classes that were passed to it. The
function is also careful to call CloseHandle on each snapshot when it's
done using the snapshot.
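The sequence just described can be sketched as follows. This is an approximation, not the Figure 2 code: CollectModules_ToolHelp32 and ModuleRecord are hypothetical names, the results go into a plain vector rather than the ModuleList classes, and the non-Windows branch exists only so the fragment compiles elsewhere. The struct tags are used instead of PROCESSENTRY32 and MODULEENTRY32 so that the ANSI layouts are selected even in a build that defines UNICODE.

```cpp
#include <cassert>
#include <string>
#include <vector>
#ifdef _WIN32
#include <windows.h>
#include <tlhelp32.h>
#endif

struct ModuleRecord
{
    unsigned long pid;          // owning process
    void*         baseAddress;  // module load address
    std::string   path;         // module filespec
};

bool CollectModules_ToolHelp32(std::vector<ModuleRecord>& out)
{
#ifdef _WIN32
    typedef tagPROCESSENTRY32 ProcEntryA;   // ANSI layout regardless of UNICODE
    typedef tagMODULEENTRY32  ModEntryA;
    typedef HANDLE (WINAPI *PFN_SNAPSHOT)(DWORD, DWORD);
    typedef BOOL   (WINAPI *PFN_PROCWALK)(HANDLE, ProcEntryA*);
    typedef BOOL   (WINAPI *PFN_MODWALK)(HANDLE, ModEntryA*);

    HMODULE hKernel32 = GetModuleHandleA("KERNEL32.DLL");
    if (!hKernel32)
        return false;

    // Look up the five ToolHelp32 APIs by name.
    PFN_SNAPSHOT pfnSnapshot  = (PFN_SNAPSHOT)GetProcAddress(hKernel32, "CreateToolhelp32Snapshot");
    PFN_PROCWALK pfnProcFirst = (PFN_PROCWALK)GetProcAddress(hKernel32, "Process32First");
    PFN_PROCWALK pfnProcNext  = (PFN_PROCWALK)GetProcAddress(hKernel32, "Process32Next");
    PFN_MODWALK  pfnModFirst  = (PFN_MODWALK)GetProcAddress(hKernel32, "Module32First");
    PFN_MODWALK  pfnModNext   = (PFN_MODWALK)GetProcAddress(hKernel32, "Module32Next");
    if (!pfnSnapshot || !pfnProcFirst || !pfnProcNext || !pfnModFirst || !pfnModNext)
        return false;

    HANDLE hProcSnap = pfnSnapshot(TH32CS_SNAPPROCESS, 0);
    if (hProcSnap == INVALID_HANDLE_VALUE)
        return false;

    ProcEntryA pe;
    pe.dwSize = sizeof(pe);     // must be set before the first call
    for (BOOL ok = pfnProcFirst(hProcSnap, &pe); ok; ok = pfnProcNext(hProcSnap, &pe))
    {
        // Process ID 0 means "the current process" to CreateToolhelp32Snapshot,
        // so skip any entry with that ID rather than misattribute modules.
        if (pe.th32ProcessID == 0)
            continue;

        HANDLE hModSnap = pfnSnapshot(TH32CS_SNAPMODULE, pe.th32ProcessID);
        if (hModSnap == INVALID_HANDLE_VALUE)
            continue;           // this process can't be examined; move on

        ModEntryA me;
        me.dwSize = sizeof(me);
        for (BOOL more = pfnModFirst(hModSnap, &me); more; more = pfnModNext(hModSnap, &me))
        {
            ModuleRecord rec = { pe.th32ProcessID, me.modBaseAddr, me.szExePath };
            out.push_back(rec);
        }
        CloseHandle(hModSnap);  // one module snapshot per process
    }
    CloseHandle(hProcSnap);
    return true;
#else
    (void)out;
    return false;               // ToolHelp32 exists only on Windows
#endif
}
```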
The
PopulateModuleList_PSAPI function looks up the addresses of the three
APIs in PSAPI.DLL that it needs. The code then calls EnumProcesses to
obtain an array of process IDs. Next, the code iterates through each of
the process IDs and calls OpenProcess to get a corresponding process
handle. If a process handle can be obtained (which isn't always the
case), the function uses EnumProcessModules to get an array of all
HMODULEs in the designated process. By itself, an HMODULE in another
process is almost useless. Luckily, PSAPI.DLL has the
GetModuleFileNameEx API (in both ANSI and Unicode) to retrieve a file
name from a process handle and HMODULE combination. I used the ANSI
version (GetModuleFileNameExA) since the rest of the program is
ANSI-centric.
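The PSAPI sequence just described can be sketched along these lines. Again, this is illustrative rather than the Figure 2 code: CollectModules_PSAPI and ModuleRecord are hypothetical names, the fixed 1024-entry buffers are an arbitrary simplification that could truncate very long lists, and the non-Windows branch exists only so the fragment compiles elsewhere.

```cpp
#include <cassert>
#include <string>
#include <vector>
#ifdef _WIN32
#include <windows.h>
#endif

struct ModuleRecord
{
    unsigned long pid;          // owning process
    void*         baseAddress;  // module load address (the HMODULE)
    std::string   path;         // module filespec
};

bool CollectModules_PSAPI(std::vector<ModuleRecord>& out)
{
#ifdef _WIN32
    typedef BOOL  (WINAPI *PFN_ENUMPROCESSES)(DWORD*, DWORD, DWORD*);
    typedef BOOL  (WINAPI *PFN_ENUMPROCESSMODULES)(HANDLE, HMODULE*, DWORD, DWORD*);
    typedef DWORD (WINAPI *PFN_GETMODULEFILENAMEEXA)(HANDLE, HMODULE, LPSTR, DWORD);

    // Load PSAPI.DLL explicitly so the executable has no import for it.
    HMODULE hPsapi = LoadLibraryA("PSAPI.DLL");
    if (!hPsapi)
        return false;

    PFN_ENUMPROCESSES pfnEnumProcesses =
        (PFN_ENUMPROCESSES)GetProcAddress(hPsapi, "EnumProcesses");
    PFN_ENUMPROCESSMODULES pfnEnumProcessModules =
        (PFN_ENUMPROCESSMODULES)GetProcAddress(hPsapi, "EnumProcessModules");
    PFN_GETMODULEFILENAMEEXA pfnGetModuleFileNameExA =
        (PFN_GETMODULEFILENAMEEXA)GetProcAddress(hPsapi, "GetModuleFileNameExA");

    bool ok = false;
    DWORD pids[1024];
    DWORD cbPids = 0;
    if (pfnEnumProcesses && pfnEnumProcessModules && pfnGetModuleFileNameExA &&
        pfnEnumProcesses(pids, sizeof(pids), &cbPids))
    {
        ok = true;
        for (DWORD i = 0; i < cbPids / sizeof(DWORD); ++i)
        {
            HANDLE hProcess = OpenProcess(
                PROCESS_QUERY_INFORMATION | PROCESS_VM_READ, FALSE, pids[i]);
            if (!hProcess)
                continue;       // not every process can be opened

            HMODULE mods[1024];
            DWORD cbMods = 0;
            if (pfnEnumProcessModules(hProcess, mods, sizeof(mods), &cbMods))
            {
                // cbMods reports the bytes needed, which may exceed the buffer.
                DWORD count = cbMods / sizeof(HMODULE);
                if (count > 1024)
                    count = 1024;

                for (DWORD m = 0; m < count; ++m)
                {
                    char path[MAX_PATH];
                    if (pfnGetModuleFileNameExA(hProcess, mods[m], path, sizeof(path)))
                    {
                        ModuleRecord rec = { pids[i], mods[m], path };
                        out.push_back(rec);
                    }
                }
            }
            CloseHandle(hProcess);
        }
    }
    FreeLibrary(hPsapi);
    return ok;
#else
    (void)out;
    return false;               // PSAPI.DLL exists only on Windows
#endif
}
```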
In
both of these functions, you might notice a minor flaw: not all of the
necessary information is collected at one time. With ToolHelp32,
multiple module list snapshots are taken during the process enumeration.
Likewise, the PSAPI-based function has to use a series of calls to
EnumProcessModules. The hangup in both cases is that, during the
enumeration, a DLL could load or unload. Even worse, a process could
start or terminate. Either way, the results wouldn't be entirely
consistent. Short of somehow suspending all other processes while the
enumeration occurs, this potential loophole can't be avoided.
A DLL Mystery
Take another look at Figure 1.
Notice that the highlighted line is spoolss.exe, which is the Windows
NT print spooler subsystem. The DLL that it references is MSDBI.DLL. The
description for MSDBI.DLL is "Microsoft® VC Program Database." In simpler terms, this is the Visual C++®
DLL that reads debug symbol tables. What in the heck would a print
spooler need to use a symbol table for? There isn't a reason. As a
result, tracking down why spoolss.exe loads MSDBI.DLL is an interesting
exercise.
If
you're lucky, the program's EXE file or some DLL will implicitly link
to the DLL in question. When this happens, you can use a module
dependency-listing program to ferret out the connections. One such
program is Depends.exe from my February 1997
column. An even better program is Microsoft's own Depends.exe, from the
Platform SDK, which I highly recommend. When I wrote my Depends
program, I was completely unaware of the Microsoft version. Unlike my
version, the Microsoft program has a nice GUI and displays much more
information than just module dependencies.
If
you run Depends on spoolss.exe, you won't find MSDBI.DLL. That means
that the MSDBI.DLL was loaded via LoadLibrary, either directly or
indirectly. When I say directly, I mean that somebody explicitly called
LoadLibrary on the DLL in question. An indirect load means that
LoadLibrary was called for some other DLL, which in turn had an implicit
reference to the DLL in question.
With
a little thought, you can create numerous scenarios involving a mixture
of LoadLibrary calls and implicit references. Figuring out the exact
circumstances for a DLL being loaded can be a real nightmare. While a
dependency program can help with implicit references, determining DLLs
that were loaded via LoadLibrary is trickier. I use a system-level
debugger to set a breakpoint on the LoadLibrary entry point in
KERNEL32.DLL. (Under Windows 9x I'd use LoadLibraryA, and under
Windows NT, I'd use LoadLibraryExW.) When the breakpoint is hit, you
can look at the stack to find the parameter that points to the DLL name
being loaded.
Returning
to the example at hand, how does MSDBI.DLL get loaded into the spoolss
process? When it starts, the initialization code in spoolss calls
LoadLibrary to load WIN32SPL.DLL. WIN32SPL.DLL has an implicit reference
to LocalSpl.DLL. LocalSpl.DLL uses a single IMAGEHLP.DLL function,
ImageNtHeader. This reference to an IMAGEHLP API is enough to bring in
IMAGEHLP.DLL, which in turn implicitly refers to MSDBI.DLL. Quite a
twisted path! If you experiment with ModuleList, you'll no doubt find
many other strange situations like this. Tracking down the dependencies
is a great way to bone up on the various system components and their
relationships.
Looking Forward
This issue marks my 60th consecutive monthly column for MSJ.
That's five straight years without missing a month (although I've come
close on a few occasions). When I first started out, this space was the
Windows Questions and Answers column. I covered 16-bit Windows-based
programming questions for nearly two years before switching the focus to
Win32 programming. In a recent column, I described issues for the
forthcoming 64-bit version of Windows NT. That's quite a leap in the
evolution of Windows that I've had the privilege to write about.
Under normal circumstances, 60 consecutive months would probably be an MSJ
record. However, that honor goes to Paul DiLascia, who started his
column before me and is still going strong. I've joked with Paul that
someday I'm going to catch up with his streak. However, that won't be
happening. For a variety of reasons (all positive), I'm going to cut my
column schedule down to once every three months.
Having
more time between columns should allow me to focus more of my time on
learning, and less on writing. However, in order to make the most of my
columns, I need your help. Keep sending me those column topic
suggestions. While I won't always be able to help every person with a
particular problem, I'm always trying to spot trends where an in-depth
Under the Hood column could help out. Thanks for reading!
Have a question about programming in Windows? Send it to Matt at mpietrek@tiac.com.
From the September 1998 issue of Microsoft Systems Journal.