Having
worked with Win32® for over eight years now, I've built up quite a list
of features (at the API level) that I'd like to see. These are mostly
features that would make my programming life easier, as well as making
it easier to write cool tools. When I installed the Windows XP beta
(formerly known by the codename "Whistler"), I wasn't expecting to see
many new APIs, so I was pleasantly surprised to find that I was wrong!
This month I'm going to describe one of these additions, something known
as vectored exception handling. I know, it's not some nifty Microsoft®
.NET gizmo, but honestly, exciting new features are still being added to
plain old Windows! I stumbled across vectored exception handling by running the PEDIFF program from my November 1997 MSJ
column. You tell PEDIFF the paths to two different copies of a DLL, and
PEDIFF returns a list of all the exported APIs that are different
between the two DLLs. In this case, I discovered vectored exception
handling by comparing KERNEL32.DLL from Windows 2000 with the Windows XP
version. There were more than a few new APIs added in KERNEL32 for
Windows XP, but AddVectoredExceptionHandler jumped out at me
immediately. As an added bonus, this API was documented in the latest
MSDN® Library, so I didn't have to hunt for info. Note
that due to an issue in the Beta 2 version of WINBASE.H, you'll need to
install the RC1 release of the Platform SDK to compile the code
described in this column.
A Quick Review of Structured Exception Handling
So what exactly is vectored exception handling, and why should you care?
For starters, it's helpful to quickly review regular exception handling,
so that you can see how vectored exception handling is different.
Assuming you work in a language like C++ that supports exceptions,
you're probably aware of Win32 structured exception handling (SEH).
Structured exception handling is done in C++ using try/catch statements,
or by the Microsoft C++ compiler's __try/__except extensions. For a
deep drilldown on how SEH works, see my article "A Crash Course in Structured Exception Handling" from the January 1997 issue of MSJ. In
brief, structured exception handling uses stack-based exception nodes.
When you use a try block, information about the exception handler is
stored in the current procedure's stack frame. On the x86 architecture,
Microsoft uses a pointer value stored at FS:[0] to point to the current
exception handler frame. The frame information includes a code address
to call when an exception occurs. If
you call another function inside of a try block, the new function may
set up its own exception handler. When this happens, a new exception
handler frame is created on the stack and a pointer to the previous
handler's frame is established, as shown in Figure 1. In
essence, the SEH frames form a linked list, with the head of the list
pointed to by FS:[0]. It's critical to note here that each successive
node must be higher on the thread's stack. The operating system enforces
this particular rule, meaning that you can't just arbitrarily make your
own handler frame and insert it into the list.
 Figure 1 Exception Handlers in the Stack
The
fact that the frames are kept in a linked list isn't just a minor
detail in the grand scheme of things, it's a vital part of how SEH
works. When an exception occurs, the system starts at the head of the
list, and invokes the exception handler with a code that says "This
exception occurred. Do you want to handle it?" The exception handler may
handle the exception by fixing the problem and returning
EXCEPTION_CONTINUE_EXECUTION. An
exception handler can also choose to decline this special, limited-time
offer by returning EXCEPTION_CONTINUE_SEARCH. When this happens, the
system moves to the next node in the linked list, and asks the same
question. This sequence continues until a handler chooses to handle the
exception, or the end of the list is reached. I've drastically
simplified the details of SEH here, but it's sufficient for our
purposes. What are the
ramifications of the SEH design? The important thing is that a given
handler can choose what to do with an exception without regard to what
any previously installed handlers (which come later in the list) might
want to do with it. Sometimes this can be a major pain. The following
example shows why. Let's say that
you've written the world's coolest exception handler. When something
bad happens, your handler diagnoses the problem, logs relevant details,
solves world hunger, and cancels the mind-numbing weekly staff meeting.
Furthermore, you put your handler inside your main (or WinMain)
function, so that your entire program is covered. Now,
at some point you call an external component, over which you have no
control. That component also installs an exception handler, and a wimpy
one at that. At the first sign of an exception, it turns tail and exits
the program. Your handler never gets the chance to execute because this
other handler appeared first in the linked list of exception handlers.
In short, the coolness of SEH is tempered by the fact that exception
handlers are only effective if somebody deeper in the call chain hasn't
installed one of their own. Allow
me to throw one more bit of SEH trivia at you before moving on to
vectored exception handling. When a program is being debugged and an
exception occurs, a few more steps transpire. First, the debugger is
given a first chance to handle the exception, or allow the child process
to see it. If the child process sees the exception, the steps outlined
previously are followed. If no handler in the child process steps
forward to handle the exception, the debugger receives a second chance
to handle the notification. (This is normally when a debugger pops up an
unhandled exception dialog.) At this point, the process is as good as
dead.
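The chain walk and the "wimpy handler" problem described in this section can be sketched with a small portable model. This is not the real SEH machinery: model_frame, push_frame, dispatch, and the three sample handlers are hypothetical names invented for illustration, and the global g_head merely stands in for FS:[0]. The real wimpy handler would exit the process. Here it simply claims the exception, which is enough to show that the outer handler never runs.

```c
#include <assert.h>
#include <stddef.h>

/* Portable model only. The two values mirror the Win32 dispositions,
   but nothing here touches the real FS:[0] chain. */
enum { MODEL_CONTINUE_EXECUTION = -1, MODEL_CONTINUE_SEARCH = 0 };

typedef int (*model_handler)(unsigned code);

/* One node per try block. 'prev' points at the frame installed before
   this one, just as each SEH frame points at the previous handler. */
struct model_frame {
    struct model_frame *prev;
    model_handler       handler;
};

struct model_frame *g_head = NULL;  /* stands in for FS:[0] */
int g_cool_calls = 0;               /* times the outer handler ran */

void push_frame(struct model_frame *frame, model_handler handler)
{
    assert(frame != NULL && handler != NULL);
    frame->handler = handler;
    frame->prev = g_head;
    g_head = frame;
}

void pop_frame(void)
{
    assert(g_head != NULL);
    g_head = g_head->prev;
}

/* Ask each handler in turn, newest first. Returns 1 if one of them
   handled the exception, 0 if the end of the list was reached. */
int dispatch(unsigned code)
{
    for (struct model_frame *f = g_head; f != NULL; f = f->prev) {
        if (f->handler(code) == MODEL_CONTINUE_EXECUTION)
            return 1;
    }
    return 0;
}

/* The handler installed in main: it would log, diagnose, and so on. */
int cool_handler(unsigned code)   { (void)code; g_cool_calls++; return MODEL_CONTINUE_EXECUTION; }
/* A component's handler that claims every exception it sees. */
int wimpy_handler(unsigned code)  { (void)code; return MODEL_CONTINUE_EXECUTION; }
/* A well-behaved handler that declines what it does not understand. */
int polite_handler(unsigned code) { (void)code; return MODEL_CONTINUE_SEARCH; }
```

Pushing cool_handler first and polite_handler second lets the exception reach the outer handler. Swapping polite_handler for wimpy_handler means g_cool_calls never changes, which is exactly the limitation that vectored exception handling addresses.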
Introducing Vectored Exception Handling
In a nutshell, vectored exception handling is similar to regular SEH, with
three key differences:
- Handlers aren't tied to a specific function, nor are they tied to a stack frame.
- The compiler doesn't have keywords (such as try or catch) to add a new handler to the list of handlers.
- Vectored exception handlers are explicitly added by your code, rather than as a byproduct of try/catch statements.
The
new AddVectoredExceptionHandler API takes a function pointer parameter
and adds the function's address to a linked list of registered handlers.
Because the system uses a linked list to store the vectored exception
handlers, a program can install as many vectored handlers as it wants. How
does vectored exception handling coexist with structured exception
handling? When an exception occurs in Windows XP, the vectored exception
handler list is processed before the normal SEH list. This works out
well for compatibility with existing code. If the vectored exception
list were to be processed after the SEH list, an SEH handler might
handle the exception, and the vectored exception handlers wouldn't get a
chance to see it. With regard to
debugging, vectored exception handling works like structured exception
handling. That is, when a program is being debugged, the debugger still
sees the first chance exception before the target process does. Only
when the debugger chooses to pass the exception on to the child process
(which is typically the case), do the vectored exception handlers get
invoked. The AddVectoredExceptionHandler API is declared in WINBASE.H:
WINBASEAPI PVOID WINAPI AddVectoredExceptionHandler(
    ULONG FirstHandler,
    PVECTORED_EXCEPTION_HANDLER VectoredHandler );
The first parameter of the function tells the system
whether the handler should be placed at the very head of the linked list
of handlers, or at the very end. The handler list is not tied to any
thread, and is global to the process. Thus, while you can request to be
put at the head of the list of handlers to be called, you're not
guaranteed to be the first one called. You won't be first if some other
piece of code called AddVectoredExceptionHandler after you, and also
requested to be the first handler. Whenever AddVectoredExceptionHandler
is called, the new handler goes to either the very head or the very tail
of the list as it exists at that moment. The
second parameter is the address of the exception handler function. It's
prototyped like this:
LONG NTAPI VectoredExceptionHandler(PEXCEPTION_POINTERS);
The PEXCEPTION_POINTERS parameter is a pointer that gives
the function everything it could want to know about the exception,
including the exception type, address, and register values. The function
is expected to return either EXCEPTION_CONTINUE_SEARCH or
EXCEPTION_CONTINUE_EXECUTION. When
EXCEPTION_CONTINUE_EXECUTION is returned, the system attempts to
restart execution of the process. Vectored exception handlers that
appear later in the list won't be called, nor will any of the structured
exception handlers. When the function returns
EXCEPTION_CONTINUE_SEARCH, the system moves on to the next vectored
exception handler. After all vectored exception handlers have been
called, the system starts with the structured exception handling list. In
addition to the AddVectoredExceptionHandler API, there's also a
RemoveVectoredExceptionHandler API, which removes a previously installed
handler from the list. It's not terribly interesting, but I am
mentioning it here for completeness. The
ability to preempt the normal SEH processing is something that various
system-level programmers have wanted for a long time. However, with this
flexibility comes the responsibility to use vectored exception handling
properly. A vectored exception handler has the ability to return
EXCEPTION_CONTINUE_EXECUTION, which causes subsequent handlers in the
list not to be called. Somebody else's code may be expecting to see
certain exceptions, and if you don't properly pass them along, you'll
introduce bugs. Microsoft has introduced a great new capability here, so
let's not mess it up for everybody else by carelessly assuming that
your vectored exception handler is the only one registered.
Showing Off Vectored Exception Handling
For people writing tracing and diagnostic tools, breakpoints are a textbook
way to get control when a desired section of code executes.
Unfortunately, using breakpoints means handling exceptions, in
particular, the breakpoint and single-step exceptions. It's not really
feasible to use structured exception handling to see these exceptions,
since you can never be sure that your handler will always see them. Some
tools (such as Mutek's BugTrapper) have circumvented this problem by
overwriting parts of the user mode exception handling code in NTDLL. One
place to do this would be the KiUserExceptionDispatcher function in
NTDLL.DLL, which I described in the aforementioned structured exception
handling article in MSJ. While overwriting
KiUserExceptionDispatcher works, it's a fragile solution, and prone to
breaking as new versions of NTDLL come out. With
vectored exception handling, there's no need to do these awful hacks.
Vectored exception handling is a clean, easily extensible way to see all
exceptions, assuming all handlers play nicely, as I described earlier.
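The ordering rules covered so far (vectored handlers first, then the SEH list, with EXCEPTION_CONTINUE_EXECUTION ending the walk) can be modeled in portable C. This is only a sketch: handler_list, append, dispatch, and the sample handlers are hypothetical names, and treating an SEH handler's return value the same way as a vectored handler's is a simplification of how __except blocks really behave.

```c
#include <assert.h>
#include <stddef.h>

enum { MODEL_CONTINUE_EXECUTION = -1, MODEL_CONTINUE_SEARCH = 0 };
enum stage { UNHANDLED = 0, BY_VECTORED = 1, BY_SEH = 2 };

#define MODEL_BREAKPOINT 0x80000003u  /* numeric value of STATUS_BREAKPOINT */
#define MAX_HANDLERS 8

typedef int (*model_handler)(unsigned code);

struct handler_list {
    model_handler items[MAX_HANDLERS];
    size_t        count;
};

struct handler_list g_vectored;  /* process-wide list, in call order */
struct handler_list g_seh;       /* per-thread chain, innermost first */
int g_calls;                     /* handlers invoked by the last dispatch */

void append(struct handler_list *list, model_handler handler)
{
    assert(list->count < MAX_HANDLERS);
    list->items[list->count++] = handler;
}

/* Returns 1 as soon as a handler asks to continue execution. */
static int walk(const struct handler_list *list, unsigned code)
{
    for (size_t i = 0; i < list->count; i++) {
        g_calls++;
        if (list->items[i](code) == MODEL_CONTINUE_EXECUTION)
            return 1;
    }
    return 0;
}

/* Vectored handlers always get the first look; SEH only runs if every
   vectored handler returned MODEL_CONTINUE_SEARCH. */
enum stage dispatch(unsigned code)
{
    g_calls = 0;
    if (walk(&g_vectored, code)) return BY_VECTORED;
    if (walk(&g_seh, code))      return BY_SEH;
    return UNHANDLED;
}

/* Plays nicely: handles its own exception and passes the rest along. */
int breakpoint_only(unsigned code)
{
    return code == MODEL_BREAKPOINT ? MODEL_CONTINUE_EXECUTION
                                    : MODEL_CONTINUE_SEARCH;
}

/* Does not play nicely: swallows everything it sees. */
int greedy(unsigned code)        { (void)code; return MODEL_CONTINUE_EXECUTION; }

/* Stands in for an ordinary structured exception handler. */
int seh_catch_all(unsigned code) { (void)code; return MODEL_CONTINUE_EXECUTION; }
```

With only breakpoint_only registered, an access violation still reaches seh_catch_all. Once greedy is appended to the vectored list, the SEH handler is never consulted, which is the kind of bug a careless vectored handler introduces.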
To demonstrate vectored exception handling, I created a small project
that uses breakpoints to monitor when a program calls LoadLibrary. In
this program, whenever LoadLibrary is called, my code prints out the
name of the DLL being loaded. Advanced
readers may be wondering about Import Address Table (IAT) patching, and
if it could do the same thing as my breakpoint-based approach. While
you certainly could use IAT patching for this particular scenario,
there's a lot more code involved. You'd be responsible for hooking the
IAT of all DLLs, including those that are loaded dynamically via
LoadLibrary. Trust me, this is harder than it might appear at first.
Using a breakpoint is a much simpler approach, all things considered. A
second problem with IAT patching is that it only works for exported
functions. The breakpoint technique will work for any code address, not
just exported functions. Thus, it would be useful for things like
hooking all calls to malloc when using the static runtime library (as
opposed to MSVCRT.DLL). Figure 2
contains the code for a DLL that uses vectored exception handling to
monitor LoadLibrary calls. Each time LoadLibrary is called,
VectoredExcBP writes the name of the DLL to stdout. The DLL is
self-contained, and doesn't require any special initialization calls.
Just call and link against its single exported function to experiment
with it. I also wrote TestVE (Figure 3)
as a demo program to call LoadLibrary on a couple of interesting DLLs.
TestVE links against a dummy function in VectoredExcBP.DLL, which forces
it to be loaded at program initialization time. When
VectoredExcBP loads, its DllMain function calls my
SetupLoadLibraryExWCallback function. This function uses the new
AddVectoredExceptionHandler API to register a handler. In addition, the
function locates the address of LoadLibraryExW in KERNEL32.DLL, and sets
a breakpoint at its first instruction. The
meat of the VectoredExcBP code is in the LoadLibraryBreakpointHandler
function. This is the handler address passed to
AddVectoredExceptionHandler. When an exception occurs, this function
gets control. The code is looking for two specific exceptions. For any
exception that it's not interested in, the function returns an
EXCEPTION_CONTINUE_SEARCH code to let other handlers have a crack at it. Without
getting too much into debugger theory, let me quickly describe the
sequence of events when a breakpoint is hit and the program resumes.
When the CPU executes the breakpoint instruction, the first thing that
happens is an exception of type STATUS_BREAKPOINT. When this occurs, no
code in the target function has executed yet. Now is a perfect time to
examine parameters and so on. Because
the breakpoint has overwritten the original instruction, the next step
is to restore things so that the original instruction can execute.
Ordinarily, this is not a big deal. However, there's a problem in this
case. If you just restore the original instruction and resume execution,
your breakpoint is no longer there, and you'll miss future passes
through the target function. The
solution (at least on x86 processors) is to have the CPU single-step
just the one instruction, and give control back to you so that you can
reinsert the breakpoint. Single-stepping on an x86 processor is a matter
of setting the trace flag (value 0x100) in the CPU's EFlags register.
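As a minimal sketch of that bit manipulation, the helpers below operate on a plain 32-bit value rather than a real CPU context. The function names are hypothetical. In an actual handler the value would come from the EFlags field of the context record supplied with the exception.

```c
#include <assert.h>
#include <stdint.h>

#define X86_TRACE_FLAG 0x100u  /* TF, bit 8 of EFLAGS */

/* Request a single-step exception after the next instruction. */
uint32_t set_trace_flag(uint32_t eflags)
{
    return eflags | X86_TRACE_FLAG;
}

/* Return to normal, full-speed execution. */
uint32_t clear_trace_flag(uint32_t eflags)
{
    return eflags & ~X86_TRACE_FLAG;
}

int trace_flag_is_set(uint32_t eflags)
{
    return (eflags & X86_TRACE_FLAG) != 0;
}

/* Sanity check: setting and then clearing TF must leave every other
   bit of the register image untouched. */
void check_round_trip(uint32_t eflags)
{
    assert(clear_trace_flag(set_trace_flag(eflags)) ==
           (eflags & ~X86_TRACE_FLAG));
}
```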
When the trace flag is set, the CPU executes just one instruction, then
generates a STATUS_SINGLE_STEP exception. After receiving the
STATUS_SINGLE_STEP exception, the trace flag can be turned off to resume
normal execution. A close
examination of the LoadLibraryBreakpointHandler shows that it implements
exactly the breakpoint stepping algorithm just described. The code is
extra-paranoid, and checks that the exception addresses are the ones
it's expecting. There's not much more to it than what I've described,
and the code is commented extensively. Inside
the STATUS_BREAKPOINT case code, LoadLibraryBreakpointHandler calls out
to a function I named BreakpointCallback. The BreakpointCallback
function uses the value of the stack pointer at the time of the
exception to locate the parameter values. In the case of LoadLibrary,
there's just a single parameter, a pointer to the name of the DLL to
load. The BreakpointCallback function retrieves this pointer value off
the stack and printf's it. (You might want to change the printf call to
something like an OutputDebugString if you want to use this DLL on a
non-console mode application.) You
may be wondering why I chose to monitor the LoadLibraryExW function.
There's a good reason! Because LoadLibrary takes a string parameter,
there are both ANSI and Unicode versions of it. The most commonly used
form of LoadLibrary is LoadLibraryA. It turns out that LoadLibraryA is
just a wrapper around LoadLibraryExA. In turn, LoadLibraryExA is just a
wrapper around LoadLibraryExW. Likewise, the LoadLibraryW API just wraps
a LoadLibraryExW call. All roads lead to LoadLibraryExW. With a single
breakpoint on this API, I'm actually seeing all calls to any of the
LoadLibrary variants. To try out
VectoredExcBP, make sure you're running Windows XP Beta 2 or later, and
run the TestVE program. TestVE itself only calls LoadLibrary on two DLLs
(MFC42.DLL and WININET.DLL). However, these DLLs call LoadLibrary
inside their DllMain code, so you should see additional calls to
LoadLibrary. If everything is working, you should see the following
output:
LoadLibrary called on: MFC42
LoadLibrary called on: MSVCRT.DLL
LoadLibrary called on: G:\WINDOWS\System32\MFC42LOC.DLL
LoadLibrary called on: WININET
LoadLibrary called on: kernel32.dll
LoadLibrary called on: advapi32.dll
LoadLibrary called on: kernel32.dll
Implementation of Vectored Exception Handling
The implementation of vectored exception handling in Windows XP Beta 2 is
remarkably straightforward. While the AddVectoredExceptionHandler API
ostensibly appears in KERNEL32.DLL, it's really just forwarded to the
RtlAddVectoredExceptionHandler function in NTDLL. Figure 4 shows pseudocode for the implementation of RtlAddVectoredExceptionHandler. The
vectored exception handler list is stored as a circular linked list.
Each registered exception handler is represented by a 12-byte node
allocated from the process heap. A critical section guards the code that
actually inserts the handler at the head or tail of the list. If the
FirstHandler parameter is nonzero, the new handler node is inserted at
the head of the list, otherwise the new node goes at the tail. Pretty
simple stuff! There's no code that checks to see if a previously
installed handler address is being registered again, so it's possible
for the same handler address to be registered (and called) more than
once. The other noteworthy part
of the vectored exception handling implementation is how the handlers
are invoked. As I described in my SEH article, KiUserExceptionDispatcher
(in NTDLL) calls RtlDispatchException. Figure 5
shows how vectored exception handling has been added to the
RtlDispatchException code in NTDLL. If you compare it to the original
code from my earlier article, you'll see that it's just the addition of a
single function call (RtlCallVectoredExceptionHandlers) at the
beginning of RtlDispatchException. This proves that vectored exception
handlers are called before structured exception handlers. The pseudocode for RtlCallVectoredExceptionHandlers can be found in Figure 6.
Again, the Figure 6 code is very straightforward. A critical section guards a
while loop. As it iterates through each registered handler, the loop
calls the handler function. If the handler function returns
EXCEPTION_CONTINUE_EXECUTION, the loop exits without calling subsequent
handlers. The function takes care to return a value indicating whether
RtlDispatchException should look for structured exception handlers. As
you can probably guess, I consider vectored exception handling to be a
very significant addition to Windows XP. I only wish this capability had
been in Win32 all along. I've demonstrated one big advantage of using
vectored exception handling, and hopefully there will be more innovative
uses for it in the coming years.
Send questions and comments for Matt to hood@microsoft.com.