Date: | April 20, 2004 / year-entry #150 |
Tags: | history |
Orig Link: | https://blogs.msdn.microsoft.com/oldnewthing/20040420-00/?p=39723 |
Comments: | 48 |
Windows lets you hibernate the entire machine, but why can't it hibernate just one process? Record the state of the process and then resume it later.

Because there is state in the system that is not part of the process.

For example, suppose your program has taken a mutex, and then it gets process-hibernated. Oops, now that mutex is abandoned and is now up for grabs. If that mutex was protecting some state, then when the process is resumed from hibernation, it thinks it still owns the mutex and the state should therefore be safe from tampering, only to find that it doesn't own the mutex any more and its state is corrupted.

Imagine all the code that does something like this:

    // assume hmtx is a mutex handle that
    // protects some shared object G
    WaitForSingleObject(hmtx, INFINITE);
    // do stuff with G
    ...
    // do more stuff with G on the assumption that
    // G hasn't changed.
    ReleaseMutex(hmtx);

Nobody expects that the mutex could secretly get released during the "..." (which is what would happen if the process got hibernated). That goes against everything mutexes stand for!
Consider, as another example, the case where you have a file that was opened for exclusive access. The program will happily run on the assumption that nobody can modify the file except that program. But if you process-hibernate it, then some other process can now open the file (the exclusive owner is no longer around), tamper with it, then resume the original program. The original program on resumption will see a tampered-with file and may crash or (worse) be tricked into a security vulnerability.

One alternative would be to keep all objects that belong to a process-hibernated program still open. Then you would have the problem of a file that can't be deleted because it is being held open by a program that isn't even running! (And indeed, for the resumption to be successful across a reboot, the file would have to be re-opened upon reboot. So now you have a file that can't be deleted even after a reboot because it's being held open by a program that isn't running. Think of the amazing denial-of-service you could launch against somebody: Create and hold open a 20GB file, then hibernate the process and then delete the hibernation file. Ha-ha, you just created a permanently undeletable 20GB file.)

Now what if the hibernated program had created windows. Should the window handles still be valid while the program is hibernated? What happens if you send it a message? If the window handles should not remain valid, then what happens to broadcast messages? Are they "saved somewhere" to be replayed when the program is resumed? (And what if the broadcast message was something like "I am about to remove this USB hard drive, here is your last chance to flush your data"? The hibernated program wouldn't get a chance to flush its data. Result: Corrupted USB hard drive.)

And imagine the havoc if you could take the hibernated process and copy it to another machine, and then attempt to restore it there.
If you want some sort of "checkpoint / fast restore" functionality in your program, you'll have to write it yourself. Then you will have to deal explicitly with issues like the above. ("I want to open this file, but somebody deleted it in the meantime. What should I do?" Or "Okay, I'm about to create a checkpoint, I'd better purge all my buffers and mark all my cached data as invalid because the thing I'm caching might change while I'm in suspended animation.")
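As a rough illustration of what "write it yourself" can look like (this is not code from the original post), here is a minimal C sketch that checkpoints a long computation by saving an explicit state struct to a file and loading it back later. The `JobState` layout, the function names, and the toy computation (summing integers) are all assumptions made for the example. The state deliberately contains no handles, pointers, or locks, which is what makes it safe to write out. Writing the raw struct is only reasonable when the same build on the same machine reads it back.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical state of a long computation: the sum of 0 .. limit-1.
   It holds plain values only, no handles or pointers. */
typedef struct {
    uint64_t next;   /* next value to add */
    uint64_t limit;  /* finished when next == limit */
    uint64_t sum;    /* running total so far */
} JobState;

/* Write the state to `path`. Returns 0 on success, -1 on failure. */
int save_checkpoint(const char *path, const JobState *s)
{
    assert(path != NULL && s != NULL);
    FILE *f = fopen(path, "wb");
    if (!f) return -1;
    size_t n = fwrite(s, sizeof *s, 1, f);
    if (fclose(f) != 0 || n != 1) return -1;
    return 0;
}

/* Read the state back. Returns 0 on success, or -1 if the file is missing
   or truncated. The caller has to decide what to do in that case
   (for example, start over). */
int load_checkpoint(const char *path, JobState *s)
{
    assert(path != NULL && s != NULL);
    FILE *f = fopen(path, "rb");
    if (!f) return -1;
    size_t n = fread(s, sizeof *s, 1, f);
    fclose(f);
    return n == 1 ? 0 : -1;
}

/* Run at most `steps` iterations. Returns 1 once the job is finished,
   0 if there is still work left. */
int run_steps(JobState *s, uint64_t steps)
{
    assert(s != NULL && s->next <= s->limit);
    while (steps-- > 0 && s->next < s->limit) {
        s->sum += s->next;
        s->next++;
    }
    return s->next == s->limit;
}
```

A caller would run `run_steps` for a while, call `save_checkpoint` before exiting, and on the next launch try `load_checkpoint` first, falling back to a fresh `JobState` if it fails. The failure path is exactly the "somebody deleted it in the meantime" case that the program has to handle for itself.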
Comments (48)
Let’s take a crack at this:
First, let me say that I have no idea why you would want to do this.
> Nobody expects that the mutex could secretly get released during the "…" (which is what would happen if the process got hibernated). That goes against everything mutexes stand for!
We could introduce the concept of hibernation eligibility. Holding a lock excludes you from this class, as you’re obviously doing something, right?
> Consider, as another example, the case where you have a file that was opened for exclusive access.
Straightforward generalization of the previous point. Either you’re doing something, or you shouldn’t have the file locked. One could argue that the app is ill-written, or at least not hibernatable.
> One alternative would be to keep all objects that belong to a process-hibernated program still open. … Think of the amazing denial-of-service you could launch against somebody: Create and hold open a 20GB file, then hibernate the process and then delete the hibernation file. Ha-ha, you just created a permanently undeletable 20GB file.)
Even allowing all of this, wouldn’t you be able to fix this fairly easily? File f is locked by process p, but when the OS resolves process p, it finds a missing hib file, so the process no longer exists, therefore file f is not locked.
> And imagine the havoc if you could take the hibernated process and copy it to another machine, and then attempt to restore it there.
Now this could work, though I’d prefer some sort of managed environment like Java or .Net. Still not sure what the point of this is.
Note that semaphores and events don’t have "owners" so it’s impossible to tell whether any particular process "owns" an event or semaphore (and therefore should not be hibernated).
Believe it or not, people periodically ask me how to hibernate a process.
Raymond: any interesting anecdotes from the implementation of system hibernate? Any particular problems that showed up, or badly written apps?
The reason is, of course, that no memory manager can know which program you will be using for the next hour.
When I have used computers with a small amount of memory I have often been in situations where I’ve known that "This thing I’m doing in Photoshop will require tons of memory. I could close InDesign to free up some memory, but then when I restarted it I wouldn’t get it back to exactly the state I was in. I wish I could swap it out to the hard drive and then just restore it when I was done."
The ability to minimize an application that you’re not actively using currently to the tray is a great feature. I would like to see applications that take it to the next level: when the application minimizes to the tray everything that doesn’t need to stay in memory gets stored on disc instead.
Jojjo: That’s already what happens when you minimize a program. All the unnecessary memory gets written to disk and only the bare minimum needed to keep the program alive stays around.
Some people don’t like this feature, though.
http://bugzilla.mozilla.org/show_bug.cgi?id=76831
Why do people want to hibernate their process? Here’s an example. "I have a program that does a long computation. Sometimes I want to pause the computation and resume it later. How can I hibernate my process?" These are people who don’t want to write a "Save/Resume" function; they are hoping there’s some magic function that will do it for them.
Interestingly enough there is a similar concept in the UNIX world – stopping a process.
Also, note that Unix allows you to delete a file without umm, deleting it ;)
Obviously this doesn’t persist across a reboot, and sockets can wind up timed out. I think that a lock will not be dropped, nor will any mutexes held (couldn’t be sure on this). But this, combined with swapping, means you can sort of hibernate a process on *nix, assuming no rebooting.
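For readers who have not seen the UNIX mechanism mentioned here, the following is a small C sketch (POSIX, not Windows) of stopping and continuing a process with `SIGSTOP` and `SIGCONT`. The function name `stop_and_resume_child`, the pipe used to tell the child when to finish, and the exit code 42 are assumptions made for the example. Note that the stopped child keeps its open pipe descriptor the whole time, which matches the point that a stopped process stays in the system and does not give up its resources.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child, stop it with SIGSTOP, resume it with SIGCONT, then let it
   finish. Returns the child's exit code (42 in this sketch) on success,
   or -1 if any step fails. */
int stop_and_resume_child(void)
{
    int fds[2];
    if (pipe(fds) != 0) return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: block until the parent writes one byte, then exit. */
        char c;
        ssize_t n;
        close(fds[1]);
        do {
            n = read(fds[0], &c, 1);
        } while (n < 0 && errno == EINTR);
        _exit(n == 1 ? 42 : 1);
    }

    /* Parent. */
    assert(pid > 0);
    close(fds[0]);
    int status = 0;
    int result = -1;

    if (kill(pid, SIGSTOP) == 0 &&
        /* WUNTRACED makes waitpid report the stop, not only termination. */
        waitpid(pid, &status, WUNTRACED) == pid && WIFSTOPPED(status) &&
        kill(pid, SIGCONT) == 0 &&
        write(fds[1], "x", 1) == 1 &&
        waitpid(pid, &status, 0) == pid && WIFEXITED(status)) {
        result = WEXITSTATUS(status);
    } else {
        /* Best-effort cleanup so a stopped child is not left behind. */
        kill(pid, SIGKILL);
        waitpid(pid, &status, 0);
    }

    close(fds[1]);
    return result;
}
```

Calling `stop_and_resume_child()` from a test program should return 42 when the stop and resume both work. Nothing here survives a reboot, as the comment above says.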
Excuse my ignorance, but is there a way to set a processes priority to very low so that it, in effect is hibernated?
How’bout simply "suspending" the process… sysinternals.com have a "pssuspend" utility that allows one to suspend/resume a process. That might not release all possible memory, but it does free up the CPU (as in the long calculation example above).
Personally, when I want to free CPU from some job I simply change its priority to "Low" and let it soak up whatever idle is left over from my more important foreground process.
MilesArcher: hit Ctrl+Alt+Del, right-click the process in the processes list of the task manager, and select a priority using Set Priority.
I’d recommend only doing it on applications. Not on Explorer. Not on taskmgr. Not on anything you don’t recognize.
Actually, for this very reason it’s a shame that taskmgr doesn’t let you adjust priorities in the Applications tab as well – at least then you have a better idea of which app you’re messing with – because you can see the window title.
Cooney:
> We could introduce the concept of
> hibernation eligibility. Holding a lock
> excludes you from this class, as you’re
> obviously doing something, right?
But then the user would be wondering why they can only hibernate some apps some of the time, and not all apps all of the time. And without any outward display, you’d never know if you could hibernate your app or not.
You can certainly suspend all threads of a process, but the next time somebody broadcasts a message, they will hang (since the system is waiting for the suspended process to respond to the message).
"…is there a way to set a processes priority to very low so that it, in effect is hibernated?"
There’s generally no reason to. Any reasonably well-written Windows application uses no CPU cycles if it has nothing to do. It is blocked in a GetMessage() call, and it won’t get any CPU time until an input message causes GetMessage() to return.
When booting up after hibernation, how does the system restore the state of hardware devices (e.g. such as serial ports baudrate) ?
Don’t you have all the same problems (e.g. locked files etc.) when hibernating a computer that has network connections? It might have an exclusive lock on a network drive, it might be waiting on a named pipe,…
I can see the appeal of a program hibernate. I hibernate my work laptop every night simply because it takes so long to startup, due to a slow hard drive, low memory, and tons of programs I run. A program hibernate and restore would have the same effect and let you bypass slow initialization and startup.
But for network resources, applications are ready to handle errors like, "Sorry, the connection to the server was lost." This is what happens if you unplug the network cable while you had an exclusive lock open.
Programs are not prepared for "Sorry, the mutex you acquired was lost."
Simon Cooke:
>recommend only doing it on applications. Not on Explorer. Not on taskmgr. Not on anything you don’t recognize.
>
>Actually, for this very reason it’s a shame that taskmgr doesn’t let you adjust priorities in the Applications tab as well – at least then you have a better idea of which app you’re messing with – because you can see the window title.
I think there is a good reason you cannot adjust priorities in the Applications tab: that tab lists only some apps on your system, or better, only some windows. Why is Task Manager not listed? If I have two IE windows, there are two lines in the Applications tab, but they correspond to only one process.
However, you can use this trick: if you right click on an app in the application tab, and choose "Go To Process", the corresponding process is highlighted in the Process Tab (I think I discovered this trick during a seminar from David Solomon and Mark Russinovich…).
Has anyone ever come up with a way of transferring a process between machines?
I’ve got used to dragging windows between multiple monitors and sometimes I think it would be good to be able to drag a running program over to my notebook so I can keep where I am at.
Is this something that is theoretically impossible on Windows? How about other OSes?
OneNote seems to have the state-preserving thing done pretty well. It just pops up exactly where you left off.
Raymond: When you hit Win+D or Win+M to minimize all windows, do those processes get paged to disk as well? I would expect the drive to start thrashing if several processes started getting paged out at the same time.
The memory is put on standby but is not actually paged out until it’s needed for something else. (Win+D and Win+M are not exactly the same thing; I need to write an entry about the difference.)
Raymond: Please do write an entry. Your post in the newsgroup was really helpful on that.
Gianluca wrote:
>I think there is a good reason you cannot
>adjust priorities in the Application tab:
>that tab lists only some apps on your system,
>or better, only some windows: why TaskManager
>is not listed? if I have two IE windows,
>there are two lines in the application tab,
>but they correspond to one process, only.
That’s a reason why you might want to switch to the process tab to perform the operation… I’m not certain it’s a good enough reason to prohibit users from being able to perform the same operation on both tabs.
>However, you can use this trick: if you right
>click on an app in the application tab, and
>choose "Go To Process", the corresponding
>process is highlighted in the Process Tab (I
>think I discovered this trick during a
>seminar from David Solomon and Mark
>Russinovich…).
Thanks – that’s useful :)
Raymond:
> These are people who don’t want to write
> a "Save/Resume" function; they are hoping
> there’s some magic function that will do it
> for them.
There is such a magic function — it is called "Virtual PC." Sure your app runs a bit slower than if it was native, but you can instantly stop the VM and then resume it later on. You could probably even move the image across host machines, too, since it is all virtualised to the same hardware IIRC.
The reason why you can delete a file on UNIX that is in use is due to how things are structured differently. Directory entries and the actual files are separate entities (with the latter known as inodes). A directory entry just points to a particular inode. This means that multiple directory entries can point to the same inode. The inode keeps a reference count of how many point to it, and the actual underlying file is freed when the count reaches zero. Opening a file increases the count, and closing decreases it.
Consequently you can open a file, delete the last remaining directory entry, and the file will still remain even though no directory entries point to it. On closing the file handle, the underlying storage is released. This is an easy way of getting temporary files, and ensuring they automatically go away if your process exits. Also, no other process can mess with it, except for the short duration when the name did exist before you managed to remove it. (Not surprisingly there have been a few vulnerabilities in that area with various programs.)
The ability to have multiple directory entries pointing to the same inode is used to make the . and .. entries in each directory. They aren’t actually special to the filesystem code. If you really want to mess with someone in weird ways and have root, go ahead and point those at other places on their filesystem randomly.
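The open-then-delete behavior described in this comment can be sketched in a few lines of POSIX C. This is only an illustration: the function name `unlink_while_open` and the message text are made up for the example, and it assumes a local UNIX filesystem that behaves as the comment describes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a file, remove its only directory entry while it is still open,
   and check that the data stays reachable through the descriptor.
   Returns 0 if that is what was observed, -1 otherwise. */
int unlink_while_open(const char *path)
{
    assert(path != NULL);
    const char msg[] = "still here";
    char buf[sizeof msg] = {0};
    struct stat st;

    /* O_EXCL: fail rather than reuse a file somebody else created. */
    int fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd < 0) return -1;

    int ok = 0;
    if (unlink(path) == 0 &&          /* delete the last directory entry */
        stat(path, &st) != 0 &&       /* the name is gone ...            */
        fstat(fd, &st) == 0 &&        /* ... but the open file is not    */
        write(fd, msg, sizeof msg) == (ssize_t)sizeof msg &&
        lseek(fd, 0, SEEK_SET) == 0 &&
        read(fd, buf, sizeof buf) == (ssize_t)sizeof buf &&
        memcmp(buf, msg, sizeof msg) == 0) {
        ok = 1;
    }

    /* Closing the last descriptor lets the system release the storage. */
    close(fd);
    return ok ? 0 : -1;
}
```

The gap between `open` and `unlink` is the short window mentioned above in which the name is still visible to other processes.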
Windows (and DOS) keep the directory entry and the file as the same thing. Consequently you can’t delete a file that is in use. You can however rename one that is in use. I frequently wonder why so many programs on installation don’t rename old files out of the way, rather than insisting on a reboot so they can put the new version in place then.
Anon: Actually, NTFS has file indexes and directory entries separate and has a reference count. I just don’t think that opening a file increments that count and binds itself to the index like that. Basically, at the filesystem level it’s like unix, but the interface to that system is 1:1 directory entry -> file.
NTFS has the inode concept – look up CreateHardLink and see if it reminds you of ln ;)
Lots of setup programs do use the rename-replace method for installing files, but it is not a panacea.
After all, somebody is still holding the old file in use, and presumably it’s because they’re still using it. So in order to safely perform a rename-replace, the thing being renamed has to be capable of running OK with a different version of itself loaded in a different process, and capable of (at minimum) detecting when some of its dependencies were updated OK (because they weren’t in use) but some of the others weren’t.
It is a lot of work, and not all applications want to go through the effort.
Raymond,
Has anyone done research on what it would take to snapshot a .NET process so that it might be eligible for hibernation-relocation.
It seems to me that, with regard to managed, barrier-based, and distributed computing, a generic snapshot-enabled client application template in VS, one that delegates to the .NET Framework the tasks of rebinding, relocating, and reactivating, might be useful. It could eliminate a lot of common distributed computing implementation problems that occur because already solved distributed computing idioms keep getting reinvented.
I ask because I’m interested in the idea of a mobile agent/OS that tags along with me wherever I go. Kind of like a personal preferences store, but much more. More than just a roaming profile.
When an app goes crazy and starts using 100% CPU, and I don’t want to kill it immediately (for one of the reasons Norman mentioned), I usually suspend it with ‘ntsd -pv -p <pid>’.
Then I can examine the call stacks and if there’s some hope that it will eventually finish what it’s doing I might let it continue.
As for trimming on minimize, I suspect that in most cases it doesn’t have much impact on performance. Certainly not enough to explain 45 second lag on restore.
I once used kernel debugger to NOP out the code in win32k.sys that trims the working sets on minimize. I ran like this for several days and I didn’t notice much difference in performance. Things were painfully slow as usual :)
Indeed, the mixed case is the scary one, because you also have to worry about mixed versions in the *same* process. Suppose you have two DLLs, A.dll and B.dll. Process X has loaded A.dll but not B.dll. Now you want to upgrade them. What do you do? If you rename A.dll to A.bak and install a new A.dll and B.dll, then process X will get the old A.dll (now named A.bak) and the new B.dll. Gosh, I hope the new B.dll and old A.dll (now named A.bak) interoperate!
4/20/2004 10:54 AM Michael Geary:
> "…is there a way to set a processes
> priority to very low so that it, in effect
> is hibernated?"
>
> There’s generally no reason to. Any
> reasonably well-written Windows application
> uses no CPU cycles if it has nothing to do.
There sure is generally a reason to. A well-written Windows application with lots of stuff to do will use lots of CPU cycles. The less well-written user might temporarily (or permanently) decide that the user wishes to dedicate those CPU cycles to some other application. The user might not wish to kill the well-written Windows application but might wish to keep it alive and let it get some CPU cycles when the user goes to sleep.
There’s another reason too. A not-well-written Windows application, such as Internet Explorer or Word 2000 or sometimes Windows Explorer, will take 99% of the CPU cycles when no one can guess what it’s doing. The user might not want to kill the window because it might contain something the user has been working on. Or the user might not want to kill the window because even opening Word 2000 again on the same document will peg the CPU again. But the user might just want to refer to its contents occasionally, while dedicating CPU cycles to other applications that the user is also working on. More precise example: one Word 2000 window displaying the original text of a document and trying to take 100% of the CPU doing nothing, and a different Word 2000 window where the user is trying to edit a translation of the original. Less precise examples: Internet Explorer and occasionally Windows Explorer for no known reproducible reasons.
4/20/2004 8:20 AM Raymond Chen:
> Jojjo: That’s already what happens when you
> minimize a program. All the unnecessary
> memory gets written to disk and only the
> bare minimum needed to keep the program
> alive stays around.
> Some people don’t like this feature, though.
> http://bugzilla.mozilla.org/show_bug.cgi?id=76831
You’re right, but did you notice this reason why some people don’t like it — quoting from the cited page:
Roope Lehmuslehto 2002-04-05 08:01 PDT
< This is *very* annoying bug, because
< bringing Moz from traybar takes longer than
< launching Moz 100% from death ;).
Though it seems that Mozilla itself is partly responsible for that.
"Has anyone done research on what it would take to snapshot a .NET process so that it might be eligible for hibernation-relocation."
Managed or not, you have to deal with globally-exposed state. Like window handles. When the program is restored, how do you restore its window handles? What if that numeric window handle is already being used by somebody else?
In the Setup case, you can decide "I will replace all the files at once. If any are in use, I will replace none."
This works as long as there are no hibernated processes, since those processes are using a file without holding it open! It becomes impossible to detect that you’re about to create a mixed-DLL scenario.
(Even worse: While you’re hibernated, what happens if somebody deletes a DLL you were using?)
Well one thing about the program hibernation is if you don’t let it hibernate across a restart, it makes problems like mutexes and window handles a lot easier to deal with.
I think rather than a solution happening from, say, the task manager, an API level call could be done. The programs could have a "Tools/Hibernate." Of course, that would only make sense for MDI type programs. Programs like Word, Outlook, et cetera would just confuse users with such an option.
So basically, Adobe needs to write a way to save their state, cause really they’re the only programs that eat up SO much memory that this would be a real problem.
Window handles are a problem even without a restart. I discussed this in the main entry. Suppose there’s a program with a window. What happens to the window when you hibernate it? Is it destroyed? What if the program was relying on that window to receive messages from other programs? It can’t respond to those messages while it’s hibernated. Some of those messages may have been important. (E.g., "System setting X changed, please refresh.") And when you restore the program, what if the numerical value of its window handle was re-used for some other window?
Well for window handles, you keep the values reserved while the application is hibernated.
I guess I’m making the assumption that since there’s no restart, the system would be able to remember what’s hibernated, and can keep the resources open but not enumerable.
In the case of global messages, it can be the responsibility of the application to remember to check things, the system can keep messages it knows about in a queue to resend, or whatever other hacky thing to think of. Since the process would be calling an API, rather than the Process having such a thing imposed on it, it can know what it needs to check. Hell, even a proxy fake window that records system messages it knows about, plus any additional ones the application specifies and then resends the last one of those messages across. I haven’t thought it through completely (obviously :) ) but I could see it happen.
Of course, like I said, Adobe programs just saving their state would be enough for 99% of people wanting hibernation, because I think the logistical problems of distinguishing "Process" vs "Document View" are just too much for an end user to worry about.
You can try shifting the responsibility to the program (e.g., "all window messages sent to your window while you are hibernated will be lost"), but that doesn’t help programs which support third-party plugins (e.g., Internet Explorer). If a plugin is not "hibernation-aware" then you’re toast.
And it’s pretty easy to structure a program around a hibernation type of activity. Just making sure that things like writing settings aren’t married to the registry, but can be written out in any tree-type storage medium (XML <-> registry should be a piece of piss really). Then you just need to make sure that your objects can be streamed in and out at any time, and really you can shut down your program and restore state, and it’s cleaner than just writing out process memory.
I guess (being in games) I always hated games that saved games by writing out all of the allocated memory, and restored them by reading it and fixing up the pointers (often missing a few of the pointers in the process). Serialize everything, and you’ve solved both saving and reproducing any kind of state you need for your application.
"Indeed, the mixed case is the scary one, because you also have to worry about mixed versions in the *same* process. Suppose you have two DLLs, A.dll and B.dll. Process X has loaded A.dll but not B.dll. Now you want to upgrade them. What do you do? If you rename A.dll to A.bak and install a new A.dll and B.dll, then process X will get the old A.dll (now named A.bak) and the new B.dll. Gosh, I hope the new B.dll and old A.dll (now named A.bak) interoperate!"
Umm, isn’t this a problem anyway for current applications? Run an update while a program keeps A.dll locked. Then, before restarting, run a program that depends on A.dll and B.dll. So if Microsoft are willing to accept that case, the hibernation case falls under the same category of "you’re on your own if you don’t restart."
Although NT does the inode thing under the hood, that isn’t particularly exposed in user space. I find it quite interesting how far apart win32 user space and the NT kernel are.
The versioning issue isn’t that big a deal. It is exactly how UNIX packages have been upgraded for decades. If it was a problem then you would also have to reboot your UNIX box on every update (like upgrading your web browser :-)
Raymond, if you are looking for another article to write, how about how damned difficult it is to write a new filesystem for NT. There is some good stuff at http://www.acc.umu.se/~bosse/ and the rant in "What ext2ifs can’t do" on http://uranus.it.swin.edu.au/~jn/linux/ext2ifs.htm
I want this feature. Always have. I sometimes have 50 IE and Acrobat (well, only one for this MDI piece of crap) windows open. It takes a long time to download 25 MB PDFs and 5 MB HTML files on my very expensive ($400) 33.6K modem (1 MB per 6 minutes at best). So, as I don’t know if I’m even interested in it, I preload lots of documents.
At this moment I’m reviewing all USAF AU Journals in the 70s and 80s. So I only have 4 IE windows open as the pages are small and download in under a sec (lovely clean html code). I also have Rise Of Nations open and minimised and it’s been minimised for 5 days.
I’m looking through NGs and through this site. I want RON and the IE windows to go away from my taskbar for a while. And to survive a reboot (as 5 days ago I rebooted with another RON minimised).
I don’t care about memory (it can use as much as it wants – just goes to swap). I care about the UI and managing different tasks (reading this page, reading NG, reading Air University Review (for three days now), playing (or not playing) RON for 5 days). I can close OE in NGs and it remembers state well enough. The paper is now published so I’m about to open lots of web pages at smh.com.au. This will force me into the horrid scrolling buggy taskbar mode (because the taskbar remembers Z orders of the programs so you can’t click displayed buttons because the topmost button is scrolled out of sight).
Fix the taskbar but also allow programs to save state.
That’s exactly my point. The OS can’t do it; the program has to be involved. So if you want this feature, ask the company that makes each program to implement it. (I thought most games had a "save" feature anyway.)
Ok, I’d like to ask the makers of Internet Explorer to save state.
The RON programmers insist on using their DirectX UI. So right-clicking the task button and choosing Close merely restores the program and displays "Are you sure?", which takes time to page in enough to display the stupid warning (but if one reboots, it just closes). Still, this isn’t saving state, but it gets rid of it.
I don’t specify how the implementation should be done, but I imagine it being done through an API with the program’s cooperation. In my scenario IE shouldn’t have any files open (Notepad doesn’t hold a file open that it is editing).
I think things like this should be defined at the UI level. It requires no programming to handle most mouse movement in most apps because Windows does it if the app doesn’t. What would be worse is multiple ways of doing this.
I remember your hot debate about file copy dialogs. Despite your sensible opposition to a 500-button dialog, the No To All feature snuck in (but not on the folder warning dialog, only the file overwrite dialog). I’ve not used it for real.
Since IE hosts third-party code, what should it do if there is an ActiveX control on the page that doesn’t know how to save its state?
For most web pages (that don’t have lots of script), revisiting the page is usually good enough to restore the state.