Date: | March 29, 2012 / year-entry #78 |
Tags: | code |
Orig Link: | https://blogs.msdn.microsoft.com/oldnewthing/20120329-00/?p=7973 |
Comments: | 26 |
Summary: | A customer discovered a bug where terminating a process did not close the handles that the process had open, resulting in their emergency cleanup code not working: TerminateProcess(processHandle, EXITCODE_TERMINATED); DeleteFile(someFile); Their workaround was to insert a call to WaitForSingleObject(processHandle, 500) before deleting the file. The customer wanted to know whether they discovered a bug in... |
A customer discovered a bug where terminating a process did not close the handles that the process had open, resulting in their emergency cleanup code not working:

TerminateProcess(processHandle, EXITCODE_TERMINATED);
DeleteFile(someFile);
Their workaround was to insert a call to WaitForSingleObject(processHandle, 500) before deleting the file. The customer wanted to know whether they had discovered a bug in TerminateProcess.
As MSDN notes, TerminateProcess *initiates* termination and returns immediately.
(Emphasis mine.) Termination is begun, but the function does not wait for termination to complete. Sometimes a thread gets stuck because a device driver has gotten wedged (or the driver doesn't support cancellation).

To know when the handles are closed, wait on the process handle, because the process handle is not signaled until process termination is complete. If you are concerned that this can take too long, you can do like the customer suggested and wait with a timeout.

Of course, if the timeout expires, then you have to decide what to do next. You can't delete the file, since it's still open, but maybe you can log an error diagnostic and let the user know why things are taking so long to clean up, and maybe add the file to a list of files to clean up the next time the program starts up.
Comments (26)
Comments are closed.
This is amazingly relevant to something I'm working on right now. Thanks!
Terminating the process that has a file open might require disk or network I/O, which takes a bit of time to run even if it's not stuck. Waiting on the process handle sounds like the right thing to do.
WaitForSingleObject(…, INFINITE) rather than 500 would be more appropriate as well.
There's another way. It's actually possible to write a safe version of the Unlocker program that Raymond rightly condemned, so long as the target process is not stuck in the loader lock or in the kernel.
Using CreateRemoteThread, inject a thread that does this:
1. Suspend all threads but itself, keeping a list of previous thread priorities.
2. Close the target handle.
3. Open a file handle to NUL.
4. While the handle just created is not the handle that was closed, push the handle and open another.
5. Close all pushed handles.
6. Unsuspend all threads (use the list saved in step 1).
@Joshua: No need for CreateRemoteThread in this case – you can do all of this just with DuplicateHandle, from the remote process to yours with DUPLICATE_CLOSE_SOURCE (effectively a remote CloseHandle) and back. BUT it's not safe – application logic may critically rely on the state of the opened file (size, content; it can also be used as IPC with some other process, etc.).
And don't forget about file mappings…
Joshua: In this case, it can be much simpler:
Use CreateRemoteThread to execute a routine that:
calls CloseHandle on all open handles and then
calls TerminateProcess.
Of course, if they control the source code to the target process, they could simply mark the temporary files as "delete on close" in the first place. Or at least open the file with FILE_SHARE_DELETE so that the cleanup process can delete it.
And don't forget about file mappings…
I did not. Those can crash the process if you free them.
@Joshua: Actually, you could copy their contents to unnamed mappings and map the new mappings at exactly the same addresses as the old ones. Of course this is not safe either, but it carries approximately the same degree of risk as replacing the file's handle.
In what way is your solution (closing the handle explicitly with a remote thread) faster than using TerminateProcess()? The CloseHandle operation will take the kernel the same time, no matter how it is invoked.
It's not faster at all. I'd rather not terminate a process that won't let go of a file I need it to, especially if it's a critical system process. Let's just say that killing the user's AntiVirus (most common offender for me) is not polite.
@Killer{R}: All cases I've seen of that were processes that had loaded DLLs that needed replacing. At that point, it's time to blast.
I think it'd be nice if the Windows file APIs added variants with timeout parameters. Have the timeouts be related to the handle being available. That way, in this case, they wouldn't need to wait for the process to exit; they'd only need to wait for the file handle to become available. I would find it very useful.
[Not sure what you mean by "wait for the file handle to become available." -Raymond]
He wants the API to automatically wait a few seconds when encountering a locked file, like a semaphore.
There will always be people who want to specify a timeout and other options for such an implicit wait, which would effectively force Microsoft to reinvent all the WaitFor* functions you can use now.
Also don't forget the virus scanners and similar tag-alongs that like to keep files open after they have been closed by the original process. The usual solution is to try several times in a loop with time delays.
In one of my test scripts I do that once, and on failure rename the files and continue trying to delete them in the background. Fortunately, you can rename in-use files.
Roger: If you can rename a file, can't you just as easily mark it for deletion?
@Gabe But if you rename a file you can immediately start creating a new file with the same name, which is sometimes useful. (e.g. a buggy program crashes without cleaning up its log file properly, and you need to immediately restart a new instance, using the same log filename.)
Maybe to some degree related to the above question, but how about deleting a file and subsequently creating a file with the exact name?
In our organization we're developing in .NET, but we had to look up the Win32 documentation to find out that DeleteFile is not a "blocking" operation, but merely a request that will be handled later on. Is there a way to create a DeleteFileAndWaitForItToFinish function? Currently we're using a workaround that renames the file and then deletes it.
Background info: We stumbled upon this issue while running unit tests. A lot of tests set up a test environment. When a test is run it removes its environment (deleting files and folders); then the next test might set up a similar environment (creating files).
I never understood what EXITCODE_TERMINATED should be.
Are there any rules or best practices, negative/positive, small/big?
And even if everything described above is done, it could still be that the file cannot be deleted!
Chances are the virus scanner just locked the file again (not so unlikely right after it was closed, since the scanner will look into the new file to do its work). Sigh. Yes, it's not easy nowadays.
So an additional loop is necessary, one that ends either at an "I give up" timeout or when the file has really disappeared.
@Freek: DeleteFile will finish before it returns unless some other handle is open on the file.
Joshua, can you please never, ever write any code again? It's just this sort of crap that makes software unreliable.
[Not sure what you mean by "wait for the file handle to become available." -Raymond]
There's an exclusive lock on the file handle. So just like waiting for any other kind of lock, I'd like that built into the file system APIs. When I call CreateFile, I'd like it to have a timeout parameter for waiting for the file handle to become available. Since I don't have this, I currently have plenty of code which spins in loops for a period of time, trying to get the file handle, and bailing out eventually. I'd like that logic built into the OS so I can stop writing polling code.
@Joshua: I never thought I'd see the words "safe" and "CreateRemoteThread" in the same post. I'll take TerminateProcess over CreateRemoteThread any time — less chance of something going wrong. TerminateProcess will at least *consistently* hose things…
@Freek: consider the approach that the Visual Studio tests themselves use: create a new directory for every unit test run. This way, they don't overlap. Obviously this can use a considerable amount of disk space, but cleaning up old tests in the background is fairly easy (and doesn't require blocking waits). Also, you actually have the test results of previous runs for comparison, which can come in handy. As long as you've got the space, don't skimp on it. I'm assuming you can set up your code to use a particular directory — if not, you should either rewrite your code so it can (preferably) or use a fixed drive letter and SUBST.
Many people will tell you that tests that create files are a bad idea to begin with, because they're too slow and not as easily run as tests that don't do I/O. I don't think this needs to be a religious issue, but it is true that tests that don't need explicit I/O are preferable. Of course, if the I/O itself is what you're testing, it's a necessity.
@Joshua: "Suspending all threads except itself" is safe? Really? If "itself" hangs for any reason, or crashes, all threads are suspended. Yuck.
"to find out that DeleteFile is not a "blocking" operation, but merely a request that will be handled later on"
Yes, it does not work like "unlink" in Unix (which never fails and is always synchronous if you can modify the directory at all), but if all your test cases have closed their file handles, the delete *will* happen synchronously. If you have a problem here then check your code for files that are still open.
This gets into one of my major pet peeves: developers who use sleeps instead of waiting on handles. Process and thread handles become signaled when the process or thread exits. Having threads poll a bool to indicate exit is another one that drives me crazy. (Then there are those who check for WAIT_ABANDONED on events or semaphores.)
If you can open a file in read-only mode, another trick is to open it with the FILE_FLAG_DELETE_ON_CLOSE flag.