Identifying an object whose underlying DLL has been unloaded

Date:April 25, 2007 / year-entry #144
Tags:other
Orig Link:https://blogs.msdn.microsoft.com/oldnewthing/20070425-00/?p=27123
Comments:    10

Okay, so I gave it away in the title, but follow along anyway.

Your program chugs along and then suddenly it crashes like this:

eax=06bad8e8 ebx=00000000 ecx=1e1cfdf0 edx=00000000 esi=06b9a680 edi=01812950
eip=1180ab57 esp=001178b4 ebp=001178c0 iopl=0         nv up ei pl nz na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00010206
ABC!FunctionX+0x1f:
1180ab57 ff5108          call    dword ptr [ecx+8]    ds:0023:1e1cfdf8=????????
0:000>

Instantly you recognize the following:

  • This is a virtual method call. (Call indirect through register plus offset.) — Very high confidence.

  • The vtable is in ecx. (That is the base register of the indirect call.) — Very high confidence.

  • The underlying DLL for this object has been unloaded. (The memory that contains the vtable is not valid and its address is consistent with once having been in valid code.) — High confidence.

  • This is an IUnknown::Release call. (Release is the third function of IUnknown and therefore resides at offset 8 on x86.) — High confidence.

Of course, all of the above "instant conclusions" are merely "highly-educated guesses", but life is full of highly-educated guesses. (Every morning, I guess that my plates are still in the cupboard.)
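The offset arithmetic behind that last guess can be sketched in a few lines of C++. This is only an illustration: `IUnknownSlot` and `SlotOffset` are made-up names, and the pointer size is passed in explicitly so that the x86 value of 4 bytes is visible rather than taken from whatever compiler you happen to use.

```cpp
#include <cassert>
#include <cstddef>

// IUnknown declares its methods in this order, so these are the vtable slot indices.
enum IUnknownSlot { kQueryInterface = 0, kAddRef = 1, kRelease = 2 };

// Hypothetical helper: byte offset of a vtable slot for a given pointer size.
inline std::size_t SlotOffset(IUnknownSlot slot, std::size_t pointerSize) {
    assert(pointerSize == 4 || pointerSize == 8);  // 32-bit or 64-bit code pointers
    return static_cast<std::size_t>(slot) * pointerSize;
}
```

With 4-byte pointers, `SlotOffset(kRelease, 4)` is 8, which matches the `[ecx+8]` in the faulting instruction.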

Let's run with our theory that the object was in an unloaded DLL and look for confirmation.

0:000> lm
start    end        module name
...
Unloaded modules:
10340000 10348000   DEF.DLL
1e1c0000 1e781000   GHI.DLL
25a90000 25a96000   JKL.DLL
0:000>

Aha, our presumed vtable address lies right inside the address space where GHI.DLL used to be loaded. Let's see what used to be loaded at that address. For this, I borrow a trick from Doron, namely loading a module as a dump file. This "virtually loads" the library so you can poke around inside it.

C:\Program Files\ABC> ntsd -z GHI.DLL

Microsoft (R) Windows Debugger
Copyright (c) Microsoft Corporation. All rights reserved.

Loading Dump File [C:\Program Files\ABC\GHI.DLL]
...
ModLoad: 15800000 15dc1000   C:\Program Files\ABC\GHI.DLL
eax=00000000 ebx=00000000 ecx=00000000 edx=00000000 esi=00000000 edi=00000000
eip=15807366 esp=00000000 ebp=00000000 iopl=0         nv up di pl nz na pe nc
cs=0000  ss=0000  ds=0000  es=0000  fs=0000  gs=0000             efl=00000000
GHI!_DllMainCRTStartup:
15807366 8bff             mov     edi,edi
0:000>

That module-load notification tells you where the DLL got virtually-loaded; in our case, it got loaded to 0x15800000. This isn't the same address as it was in our crashed process, so we'll have to do some mental arithmetic to account for the discrepancy.

Going back to the original register dump, we see that our putative vtable is at ecx=1e1cfdf0 relative to the load address 1e1c0000. Since our DLL-loaded-as-a-dump-file was loaded at 0x15800000, we need to adjust the address to be relative to the new location.

// working with the second copy of ntsd
0:000> ln 0x1580fdf0
(1580fdf0)   GHI!CAlphaStream::`vftable'

That magic number 0x1580fdf0 is just the result of some mental arithmetic. First:

 0x1e1cfdf0
-0x1e1c0000
-----------
 0x0000fdf0

This is the address of the vtable in the crashed process relative to the load address of the DLL in the crashed process. Next:

 0x15800000
+0x0000fdf0
-----------
 0x1580fdf0

This is the address of the vtable in the DLL-loaded-as-a-dump-file relative to the load address of the DLL in the DLL-loaded-as-a-dump-file. The math really isn't that hard, as you can see, since a lot of things cancel out. This happens a lot.
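If you would rather not do the subtraction in your head, the same translation can be written as a small C++ sketch. The names `ModuleRange`, `Contains`, and `Rebase` are hypothetical, and the sketch assumes that the end address reported by `lm` is one past the last byte of the module.

```cpp
#include <cassert>
#include <cstdint>

// A module's address range as listed by "lm", treated here as [start, end).
struct ModuleRange {
    std::uint32_t start;
    std::uint32_t end;
};

// True if the address falls inside the module's range.
inline bool Contains(const ModuleRange& m, std::uint32_t address) {
    return address >= m.start && address < m.end;
}

// Hypothetical helper: translate an address from the crashed process's copy
// of a module to a second copy of the same module loaded at newBase.
inline std::uint32_t Rebase(std::uint32_t address,
                            const ModuleRange& oldModule,
                            std::uint32_t newBase) {
    assert(Contains(oldModule, address));       // the address must belong to the module
    return newBase + (address - oldModule.start);  // keep the offset, swap the base
}
```

Feeding it the numbers from this crash, `Rebase(0x1e1cfdf0, {0x1e1c0000, 0x1e781000}, 0x15800000)` gives 0x1580fdf0, the address passed to `ln`.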

When we asked the debugger to tell us what symbol is nearest to that address, we hit the jackpot: It is exactly a vtable for the CAlphaStream object. This confirms our original theory. We can even confirm the IUnknown::Release theory by dumping the vtable.

0:000> dds 1580fdf0
1580fdf0  159234b3 GHI!CAlphaStream::QueryInterface
1580fdf4  15810539 GHI!CBetaState::AddRef
1580fdf8  15923cfc GHI!CAlphaStream::Release
1580fdfc  15923d30 GHI!CAlphaStream::Read
...

Yup, that's a CAlphaStream vtable all right.

Since I'm not familiar with the GHI.DLL file, let's ask the debugger where the source code is so we can take a closer look:

0:000> .lines
Line number information will be loaded
0:000> dds 1580fdf0
1580fdf0  159234b3 GHI!CAlphaStream::QueryInterface
                   [c:\dev\fabricam\synergy\proactive\winwin.cpp @ 2624]
1580fdf4  15810539 GHI!CBetaState::AddRef
                   [c:\dev\fabricam\leverage\paradigm\initiative.cpp @ 427]
1580fdf8  15923cfc GHI!CAlphaStream::Release
                   [c:\dev\fabricam\synergy\proactive\winwin.cpp @ 2638]
1580fdfc  15923d30 GHI!CAlphaStream::Read
                   [c:\dev\fabricam\synergy\proactive\winwin.cpp @ 2649]

Now that we know where the source code to CAlphaStream is, we can hop on over to take a quick peek and confirm that, oh look, the object doesn't increment the DLL object count when it is constructed (or decrement it when it is destructed). As a result, when COM calls DllCanUnloadNow, the GHI.DLL says, "Sure, go ahead!" The DLL is unloaded even though ABC still has a reference to it, and then when ABC goes to release that reference, we crash because GHI is already gone.
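The missing bookkeeping can be modeled with a portable C++ sketch. Everything here is a stand-in rather than the actual GHI code: `g_liveObjects`, `ModuleRef`, `TrackedStream`, `UntrackedStream`, and `CanUnloadNow` are hypothetical names, and the `bool` result takes the place of the S_OK or S_FALSE that a real DllCanUnloadNow returns.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical module-wide count of objects that are still alive.
static std::atomic<long> g_liveObjects{0};

// Embedding this member ties an object's lifetime to the module's count.
class ModuleRef {
public:
    ModuleRef() { ++g_liveObjects; }
    ModuleRef(const ModuleRef&) { ++g_liveObjects; }
    ModuleRef& operator=(const ModuleRef&) { return *this; }  // count is per object, not per value
    ~ModuleRef() {
        assert(g_liveObjects.load() > 0);  // every decrement must pair with an increment
        --g_liveObjects;
    }
};

// Counts itself, so the module stays loaded while one of these exists.
class TrackedStream {
    ModuleRef moduleRef_;
};

// Models the bug: the object never touches the count.
class UntrackedStream {};

// Stand-in for DllCanUnloadNow: unloading is safe only when nothing is alive.
inline bool CanUnloadNow() { return g_liveObjects.load() == 0; }
```

While an `UntrackedStream` is alive, `CanUnloadNow()` still says yes, which is the situation that let GHI.DLL disappear out from under ABC. A `TrackedStream` keeps the answer at no until it is destroyed.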

After I wrote this up, I discovered that Tony Schreiner went through pretty much the same exercise with a third-party Internet Explorer toolbar, except he had the extra bonus challenge of not having source code for the plug-in!


Comments (10)
  1. richard says:

    Are you running OS/2?

    My copy of ntsd says that the -z option is "reserved for OS/2 debugging".

  2. Rhomboid says:

    Rather than doing the address calculation manually, couldn’t you run rebase on a copy of GHI.DLL so that it loads at the same place it did in the app?

  3. James Schend says:

    Richard…

    I checked the version of NTSD that ships with Windows XP SP2. -z is documented as:

    -z <CrashDmpFile> specifies the name of a crash dump file to debug

    -zp <CrashPageFile> specifies the name of a page.dmp file

    Are you using the Windows 2000 version? If it was reserved for OS/2, they might have recycled it for XP.

  4. richard says:

    Yes, I am using Windows 2000. I realized after I posted that I had omitted some vital information (such as the version of the OS and tools I was running).

  5. Mark says:

    Often seen in C++ apps when you have a global COM smart pointer, so the destructor tries calling Release on the COM object after main has completed, and CoUninitialize has been called.

  6. Pavel Lebedinsky says:

    I think you can also make debugger believe that GHI.DLL is still loaded by doing this:

    0:000> ? 1e781000 - 1e1c0000

    Evaluate expression: 6033408 = 005c1000

    0:000> .reload GHI.DLL=1e1c0000,005c1000

    This is convenient when you need to translate multiple addresses (such as when there is a stack trace with several return addresses from the unloaded DLL on the stack).

    (Richard/James – ntsd from system32 is very old and doesn’t have any extensions. You should download the latest version from http://www.microsoft.com/whdc/DevTools/Debugging/default.mspx).

  7. JS says:

    Use:

    .reload /unl ghi.dll

    It does all the dirty work for you.

  8. Kujo says:

    In this crash, my first thought is usually that the “this” pointer was bad, so the vtable in ecx isn’t a valid one (using a dangling pointer?)  I probably would have barked up that tree before considering an unloaded dll.

    I’ve noticed that good windbg debugging involves a lot of pattern recognition (being able to spot a float or a string, for example.) Unlike me, it sounds like unloaded dll was your first guess, and with high confidence no less! The heuristic you listed was “its address is consistent with once having been in valid code.”  How did you discern that instantly?  

    I can see that 0x1e1cfdf8 is well-aligned, but that’s hardly telling on its own.  0x10000000 is a very common dll base address, but 0x1e1cfdf8 doesn’t feel close enough to that (and indeed, the lucky dll was based at 0x1e1c0000.) Is it just that I don’t work with COM very often, so dlls aren’t my first guess?

    [True, if I had thought harder, 0x1Exxxxxx does seem a bit too high, but sometimes you get the right answer for the wrong reason. -Raymond]
  9. Eric C Brown says:

    In recent versions of windbg, the register display will often say <unloaded ghi.dll>+blah.  Not always, but often.

  10. Kujo says:

    Thanks for the tip on -z, it’s definitely nicer than using dumpbin or something.  

    Inspired by Eric’s comment, I see there’s a brand new version of the debugging tools released today! Thanks :)


