Removing the TerminateThread from a DLL that needs to shut down a helper thread at unload

Once more we undertake the exercise of designing the TerminateThread out of some code.

The customer had an instrumentation toolchain. What you did was take your object code and sent it through an instrumentation tool, and that tool patched your object code so it could instrument things of interest. You then linked the modified object code with a special instrumentation library (statically-linked) to produce the final instrumented binary.

The static library created a worker thread, and they needed to shut down that worker thread. The object code was completely oblivious to the fact that somebody was trying to instrument it, so the static library had to somehow manage this worker thread without any help from the outside.

To clean up the worker thread, the instrumentation library used the atexit function to get a callback to run when the hosting DLL went through its DLL_PROCESS_DETACH. The developers didn't have way to get the worker thread to shut down cleanly, so they just called TerminateThread and crossed their fingers that it wouldn't come back to bite them.

Let's see if we can do better.

One idea is to put the worker thread in a helper DLL. The static library creates the thread on demand using the FreeLibraryAndExitThread technique to ensure that the worker thread maintains a reference to the host DLL. The atexit callback function calls a shutdown function in the helper DLL. Following the FreeLibraryAndExitThread technique, the shutdown function would signal the worker thread to exit and return immediately, allowing the worker thread to exit and free the library on its own.

There are some race conditions to be dealt with, such as the case where the host DLL is reloaded before the helper DLL's worker thread can exit. But these issues can be worked out.

The customer was reluctant to introduce a new DLL into the picture, however. For example, it means that the host's installer would have to carry the helper DLL when installing an instrumented version.

To avoid the helper DLL, the code could create a worker task in the thread pool with CreateThreadpoolWork, with an environment marked as SetThreadpoolCallbackRunsLong. Make that task do whatever the original thread was doing.

When it's time to shut down the worker thread, signal the worker task to exit using an event or some other private mechanism, and then call WaitForThreadpoolWorkCallbacks to wait for the exit to occur. Of course, you want to skip this if the entire process is shutting down.

This trick does assume that the worker task does not require any locks that might be held by the code running DLL_PROCESS_DETACH (most notably the loader lock).

The customer replied that they had found an even better third solution: They got rid of the worker thread entirely!

The purpose of the worker thread was to respond to requests for information from the instrumentation tool, and the customer realized that they could extract that information with careful use of ReadProcessMemory, so there was no need to have a thread dedicated to handing out that information.

(Normally, I wouldn't be a fan of using ReadProcessMemory as a mechanism for interprocess communication because it requires that the other process have PROCESS_VM_READ access to the process, which is a pretty large farm to be giving away, and it doesn't give you very useful granularity. But since this is an instrumentation tool, it's not unreasonable to require that the tool run in a security context that has full access to the process being instrumented.)

laonianren says:

September 7, 2018 at 7:37 am

It wasn’t initially clear (to me anyway) that the code being instrumented is in a DLL.

So maybe I’m missing the point. But adding a clean exit mechanism to the worker thread seems like less effort than adding a clean exit mechanism and moving to the thread pool. Or is there some problem with waiting for the worker thread to terminate in atexit?

Raymond Chen - MSFT says:

September 9, 2018 at 10:12 am

And how do you add that clean exit mechanism to the worker thread? The usual solution is FreeLibraryAndExitThread, but that also prevents the DLL from unloading.
1. laonianren says:
  
  September 10, 2018 at 4:15 pm
  
  Sorry, I wasn’t being that sophisticated! I was thinking you could just signal the worker thread and wait for it to exit, but I was forgetting this would cause a loader lock deadlock.
  
  The TerminateThread solution is unfortunately recommended here (section headed “Thread Synchronization in DllMain for DLL_THREAD_DETACH during DLL Unload”):
  https://docs.microsoft.com/en-us/windows/desktop/Dlls/dynamic-link-library-best-practices
  
  It seems unhelpful to label this “best practice” when it’s actually “best practice for getting out of a hole you can usually avoid with better API design”. And if that’s not an option, your thread pool solution is better.
  1. Raymond Chen - MSFT says:
    
    September 10, 2018 at 4:52 pm
    
    Yes, I’ve complained about that document before.

Joshua says:

September 7, 2018 at 9:10 am

I like shared memory for IPC in the same security context. To my mind ReadProcessMemory and WriteProcessMemory provide an implementation of shared memory.

Matteo Italia says:

September 7, 2018 at 2:03 pm

A finer-grained alternative to ReadProcessMemory may have been a named shared memory block – say, the instrumentation code at startup does something like CreateFileMapping(INVALID_HANDLE_VALUE, ..., ("ContosoInstrumentationTool" + std::to_string(GetCurrentProcessId())).c_str()), creating the memory area to use to store the information for the instrumentation tool. The instrumentation tool, in turn, given the process ID can open the corresponding memory block and read it as it pleases.

As a bonus point, as long as the instrumentation tool keeps the file mapping open the data is kept alive for it to examine.

The only drawback that comes to mind is that you may have problems if the shared memory is still alive and a new process with the same PID is started, but this can be worked around by adding an increasing suffix to the mapping name if the base name fails.

Tanveer Badar says:

September 7, 2018 at 9:51 pm

Isn’t there a temporary involved whose c_str() you are calling?
Maia Everett says:

September 8, 2018 at 2:43 am

Nitpick, but in the code you provided, wouldn’t the anonymous std::string be immediately destroyed before the CreateFileMapping is called, thus invalidating the c_str pointer?
1. Raymond Chen - MSFT says:
  
  September 9, 2018 at 2:18 pm
  
  The temporary is destroyed at the end of the full expression, which means after CreateFileMapping returns.

Date:	September 7, 2018 / year-entry #203
Tags:	code
Orig Link:	https://blogs.msdn.microsoft.com/oldnewthing/20180907-00/?p=99675
Comments:	9
Summary:	Designing it out again.

Removing the Terminate­Thread from a DLL that needs to shut down a helper thread at unload

Removing the `TerminateThread` from a DLL that needs to shut down a helper thread at unload