Removing the Terminate­Thread from a DLL that needs to shut down a helper thread at unload

Date:September 7, 2018 / year-entry #203
Tags:code
Orig Link:https://blogs.msdn.microsoft.com/oldnewthing/20180907-00/?p=99675
Comments:    9
Summary:Designing it out again.

Once more we undertake the exercise of designing the Terminate­Thread out of some code.

The customer had an instrumentation toolchain. What you did was take your object code and sent it through an instrumentation tool, and that tool patched your object code so it could instrument things of interest. You then linked the modified object code with a special instrumentation library (statically-linked) to produce the final instrumented binary.

The static library created a worker thread, and they needed to shut down that worker thread. The object code was completely oblivious to the fact that somebody was trying to instrument it, so the static library had to somehow manage this worker thread without any help from the outside.

To clean up the worker thread, the instrumentation library used the atexit function to get a callback to run when the hosting DLL went through its DLL_PROCESS_DETACH. The developers didn't have way to get the worker thread to shut down cleanly, so they just called Terminate­Thread and crossed their fingers that it wouldn't come back to bite them.

Let's see if we can do better.

One idea is to put the worker thread in a helper DLL. The static library creates the thread on demand using the Free­Library­And­Exit­Thread technique to ensure that the worker thread maintains a reference to the host DLL. The atexit callback function calls a shutdown function in the helper DLL. Following the Free­Library­And­Exit­Thread technique, the shutdown function would signal the worker thread to exit and return immediately, allowing the worker thread to exit and free the library on its own.

There are some race conditions to be dealt with, such as the case where the host DLL is reloaded before the helper DLL's worker thread can exit. But these issues can be worked out.

The customer was reluctant to introduce a new DLL into the picture, however. For example, it means that the host's installer would have to carry the helper DLL when installing an instrumented version.

To avoid the helper DLL, the code could create a worker task in the thread pool with Create­Threadpool­Work, with an environment marked as Set­Threadpool­Callback­Runs­Long. Make that task do whatever the original thread was doing.

When it's time to shut down the worker thread, signal the worker task to exit using an event or some other private mechanism, and then call Wait­For­Threadpool­Work­Callbacks to wait for the exit to occur. Of course, you want to skip this if the entire process is shutting down.

This trick does assume that the worker task does not require any locks that might be held by the code running DLL_PROCESS_DETACH (most notably the loader lock).

The customer replied that they had found an even better third solution: They got rid of the worker thread entirely!

The purpose of the worker thread was to respond to requests for information from the instrumentation tool, and the customer realized that they could extract that information with careful use of Read­Process­Memory, so there was no need to have a thread dedicated to handing out that information.

(Normally, I wouldn't be a fan of using Read­Process­Memory as a mechanism for interprocess communication because it requires that the other process have PROCESS_VM_READ access to the process, which is a pretty large farm to be giving away, and it doesn't give you very useful granularity. But since this is an instrumentation tool, it's not unreasonable to require that the tool run in a security context that has full access to the process being instrumented.)


Comments (9)
  1. laonianren says:

    It wasn’t initially clear (to me anyway) that the code being instrumented is in a DLL.

    So maybe I’m missing the point. But adding a clean exit mechanism to the worker thread seems like less effort than adding a clean exit mechanism and moving to the thread pool. Or is there some problem with waiting for the worker thread to terminate in atexit?

    1. And how do you add that clean exit mechanism to the worker thread? The usual solution is FreeLibraryAndExitThread, but that also prevents the DLL from unloading.

      1. laonianren says:

        Sorry, I wasn’t being that sophisticated! I was thinking you could just signal the worker thread and wait for it to exit, but I was forgetting this would cause a loader lock deadlock.

        The TerminateThread solution is unfortunately recommended here (section headed “Thread Synchronization in DllMain for DLL_THREAD_DETACH during DLL Unload”):
        https://docs.microsoft.com/en-us/windows/desktop/Dlls/dynamic-link-library-best-practices

        It seems unhelpful to label this “best practice” when it’s actually “best practice for getting out of a hole you can usually avoid with better API design”. And if that’s not an option, your thread pool solution is better.

  2. Joshua says:

    I like shared memory for IPC in the same security context. To my mind ReadProcessMemory and WriteProcessMemory provide an implementation of shared memory.

  3. A finer-grained alternative to ReadProcessMemory may have been a named shared memory block – say, the instrumentation code at startup does something like CreateFileMapping(INVALID_HANDLE_VALUE, ..., ("ContosoInstrumentationTool" + std::to_string(GetCurrentProcessId())).c_str()), creating the memory area to use to store the information for the instrumentation tool. The instrumentation tool, in turn, given the process ID can open the corresponding memory block and read it as it pleases.

    As a bonus point, as long as the instrumentation tool keeps the file mapping open the data is kept alive for it to examine.

    The only drawback that comes to mind is that you may have problems if the shared memory is still alive and a new process with the same PID is started, but this can be worked around by adding an increasing suffix to the mapping name if the base name fails.

    1. Tanveer Badar says:

      Isn’t there a temporary involved whose c_str() you are calling?

    2. Maia Everett says:

      Nitpick, but in the code you provided, wouldn’t the anonymous std::string be immediately destroyed before the CreateFileMapping is called, thus invalidating the c_str pointer?

      1. The temporary is destroyed at the end of the full expression, which means after CreateFileMapping returns.

Comments are closed.


*DISCLAIMER: I DO NOT OWN THIS CONTENT. If you are the owner and would like it removed, please contact me. The content herein is an archived reproduction of entries from Raymond Chen's "Old New Thing" Blog (most recent link is here). It may have slight formatting modifications for consistency and to improve readability.

WHY DID I DUPLICATE THIS CONTENT HERE? Let me first say this site has never had anything to sell and has never shown ads of any kind. I have nothing monetarily to gain by duplicating content here. Because I had made my own local copy of this content throughout the years, for ease of using tools like grep, I decided to put it online after I discovered some of the original content previously and publicly available, had disappeared approximately early to mid 2019. At the same time, I present the content in an easily accessible theme-agnostic way.

The information provided by Raymond's blog is, for all practical purposes, more authoritative on Windows Development than Microsoft's own MSDN documentation and should be considered supplemental reading to that documentation. The wealth of missing details provided by this blog that Microsoft could not or did not document about Windows over the years is vital enough, many would agree an online "backup" of these details is a necessary endeavor. Specifics include:

<-- Back to Old New Thing Archive Index