On occasion, you might notice that every window on the desktop flickers and repaints itself. One of the causes for this is a simple null handle bug.
The InvalidateRect function is one you're probably well-familiar with. It is used to indicate to the window manager that the pixels of a particular window are no longer current and should be repainted. (You can optionally pass a rectangle that specifies a subset of the window's client area that you wish to mark invalid.) This is typically done when the state of the data underlying the window has changed and you want the window to repaint with the new data.
If, however, you end up passing NULL as the window handle to the InvalidateRect function, this is treated as a special case for compatibility with early versions of Windows: It invalidates all the windows on the desktop and repaints them. Consequently, if you, say, try to invalidate a window but get your error checking or timing wrong and end up passing NULL by mistake, the result will be that the entire screen flickers.
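One way to catch this mistake in your own code is to route invalidations through a small wrapper that refuses a NULL handle. The sketch below is only an illustration and is not from the article: SafeInvalidateRect is a hypothetical name, and the non-Windows branch is a stand-in (not the real API) so that the snippet compiles anywhere.

```c
#include <stddef.h>
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
#else
/* Stand-ins so the sketch compiles off Windows. These are NOT the real API. */
typedef void *HWND;
typedef int BOOL;
typedef struct { long left, top, right, bottom; } RECT;
#define TRUE 1
#define FALSE 0
static int g_stub_calls = 0; /* counts calls that reached the stand-in */
static BOOL InvalidateRect(HWND hwnd, const RECT *rc, BOOL erase)
{
    (void)hwnd; (void)rc; (void)erase;
    ++g_stub_calls;
    return TRUE;
}
#endif

/* Hypothetical helper: never forward a NULL window handle, because
   InvalidateRect(NULL, ...) invalidates every window on the desktop. */
static BOOL SafeInvalidateRect(HWND hwnd, const RECT *rc, BOOL erase)
{
    if (hwnd == NULL) {
        fprintf(stderr, "SafeInvalidateRect: NULL window handle, skipping\n");
        return FALSE;
    }
    return InvalidateRect(hwnd, rc, erase);
}
```

In a debug build you might prefer an assertion over a silent skip, so that the bad call site is found rather than hidden.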
Even more strangely, passing NULL as the first parameter to ValidateRect has the same behavior of invalidating all the windows. (Yes, it's the "Validate" function, yet it invalidates.) This wacko behavior exists for the same compatibility reason. It is yet another example of how programs rely on bugs or undocumented behavior, in this case, the peculiar way a NULL parameter was treated by very early versions of Windows due to lax parameter validation. Changing nearly anything in the window manager raises a strong probability that there will be many programs that were relying on the old behavior, perhaps entirely by accident, and breaking those programs means an angry phone call from a major corporation because their factory control software stopped working.
Is this behavior also present in DWM?
Today, I learned there is a ValidateRect() API!
Is there a way to test what version of Windows an executable was built/linked on? If so, it could be possible to slowly migrate these crufty compatibility "hacks" out of Windows by following the documented behaviour to the letter if the application was built on Vista.
What would be really great would be the ability to turn this off globally, then turn it back on on an app-by-app, or dll-by-dll basis… Find the programs where it happens and complain. <grin>
When I use DesktopSidebar (www.desktopsidebar.com) with the analog clock on, there’s some pretty bad flicker, or with the analog clock off, but with SQL Server 2000 Query Analyzer open… It would be nice to be able to isolate what programs are causing this, and either stop using them or get them fixed.
Does that hold true even today, say in Windows Server 2003?
Do you still have to do this because there are still programs that have those problems?
Raymond,
I read your blog daily, and one thing that you often bring up are issues, like this, that used to make sense, say back in the old Win16 days, but today should largely be considered ‘deprecated’ (as the Java camp would say). To help developers, wouldn’t it make a lot of sense for MS to ship some kind of ‘lint’ version of the system libraries that a developer could link against (either at link- or run-time), and be provided with some kind of logging output that would emit warnings saying ‘This is bad, please don’t do this’. Say, for calling Sleep, or other ‘API smells’?
Raymond,
Your last comment brings up an interesting point. You’ve blogged a lot about the extreme steps that MSFT has taken to ensure that old commercial software works with new versions of Windows. It seems like in-house software might be more likely to be hacked together and violate rules (though maybe that’s not true, if it tends to be less sophisticated). Granted, it’s at least presumptively easier for in-house software to be fixed.
Do you have any good stories about compatibility problems with in-house software?
A lot of these articles were inspired by problems with in-house software. I just don’t call them out as such. Disabling the desktop window, using the Shell Folders key, windows with the wrong owner (very common)…
And in reality, in-house software is usually HARDER to fix than commercial software, since the people who write in-house software tend not to be full-time software engineers. It’s "Hey, Bob, you know some programming, right? Could you write us an app that does X?" And then Bob leaves the company – maybe if you’re lucky Bob leaves the source code behind, but it’s badly commented and nobody can figure out what it does much less how to fix it…
So why not invalidate just the windows in the calling process instead of all the windows?
The biggest problem with Windows right now, security aside, is that the experience degrades over time. You guys need to address that because right now I know nobody that loves Windows.
I change laptops every two years, and the last six months of each laptop using Windows are usually unpleasant.
Dejan
In modern Windows, is there any situation that would warrant invalidating all desktop windows? If so, what’s the ‘correct’ way to do that? If not, why support the old and irrelevant Win16 behavior?
I suppose it’s not too hard to imagine some app with multiple processes. If each process has windows that need to be refreshed, the app might simply invalidate the null window rather than each window individually (which it probably doesn’t keep track of).
Of course I wouldn’t expect a commercial app to behave like this, but I’ve seen some embedded systems that might do such a thing.
Jordan: see the Application Verifier, which was formerly part of the Application Compatibility Toolkit. Or, consider BoundsChecker from Compuware.
This is unrelated, but I found a fun thing in MSDN: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dngenlib/html/msdn_manamemo.asp
"Using the above example of a large flat-file database, consider a database file housing 1,000,000 records of 125 bytes each. The file size necessary to store this database would be 1,000,000 * 125 = 125,000,000 bytes. To read a file that large would require an extremely large amount of memory."
How time flies..
640k ram assumed?
Why was there not an API function created, like ThisProgramBehavesItself() or IAlwaysReadTONT() or even ILoveRaymondChen() that sets a flag in Windows so it knows it doesn’t need to use this BC compatibility workaround hack? Perhaps with an argument that indicates the level of strict API conformance, so it can be used for other BC’s as well.
I think the consequences of this would be easier to understand if you rename the function TrustMeIKnowWhatImDoing() or ImOnYourSideNoReally() – app developers will only use it if they get some benefit from it, and once you can obtain a benefit from it, developers will lie in order to get that benefit. Look at WHQL, where this has already happened in spades.
If you really wanted to do this, you’d need to perform some sort of lazy evaluation where programs get black marks against them whenever they do something bad (this doesn’t have the overhead it would seem, you need to do the parameter-check anyway, and the heavyweight black-book op only occurs with misbehaving apps). Once you rack up enough penalty points, you get dinged for it.
Now, do you want to be the person to implement this? Do you want to be the manager to sign off development time and money to it?
As an (optionally deletable if it’s inappropriate) addendum, I should add that my first idea for a descriptive function name was IWontComeInYourMouth(), that sort of says it all :-).
"If you really wanted to do this, you’d need to perform some sort of lazy evaluation where programs get black marks against them whenever they do something bad."
Why? I think that just introduces even more bugs.
"Once you rack up enough penalty points, you get dinged for it."
That could make things look like undefined behaviour.
"Now, do you want to be the person to implement this?"
Sure!
"Do you want to be the manager to sign off development time and money to it?"
I don’t want to be a product manager at all, so no.
The version of Remote Desktop on my laptop (that I am not currently using so I can’t look it up) seems to refresh all local and remote windows when it establishes a connection (it has other faults too, but I can only find the web connection download for 5.2.3790.1830).
Mike: I have my doubts about the Application Verifier. I spent some time testing it on one of my apps, and to my surprise, it found one bug, deep within ExitProcess(). After pulling my hair out for a while, I tried running it on notepad.exe and on calc.exe and it barfed in ExitProcess() there, too. Near as I can tell, the bug seems to have something to do with the Windows input method editor (msctfime.dll) trying to access TLS data that doesn’t exist.
So given that Application Verifier claims that there are bugs even in the standard Windows DLLs and EXEs, I’d take pretty much everything it says with a grain of salt (or two). Code-reading and careful debugging are still the best paths to quality applications.
Sean W: Or perhaps Application Verifier found a bug in MSCTFIME? (Personally, I think that’s more likely.)
Yes, Raymond, based on what I saw, it looks like the bug is in msctfime.ime. If you have any contact with that group, you might want to mention that their code doesn’t pass the Application Verifier’s checks.
But MSCTFIME is still a standard Windows component, so my statement still stands: Take AV’s results with a grain of salt, as it can find bugs in your apps where there are none and find bugs in Windows code that should have already been verified; and make sure you do your own code-reading and debugging as well, as good code-reading and good debugging are likely to catch more bugs than automated tools will.
Control Panel -> Accessibility Options -> Check "Use Filter Keys" and apply it. Uncheck it and apply.
That causes an everything-flicker at each change (here, anyway), should you wish to see the effect.
Not every repaint is the result of InvalidateRect(NULL). If you change a global setting, then most windows will refresh so they can get back in sync with the new setting.
Wednesday, March 08, 2006 12:20 PM by Sean W.
> so my statement still stands
It does not.
> Take AV’s results with a grain of salt, as
> it can find bugs in your apps where there
> are none
It can, but your statement about the reason and example does not stand.
> and find bugs in Windows code that should
> have already been verified;
Indeed, it finds bugs in Windows code that should have already been verified. Blue screens also find bugs in Windows code that should have already been verified. Corrupted disk partitions also find bugs in Windows code that should have already been verified.
Application Verifier doesn’t find all bugs, but it helps, and its help is good.
Checked builds don’t find all bugs, but they used to help, and their help was good. Let’s hope Application Verifier doesn’t disappear in some dungeon the way checked builds have.
The app verifier doesn’t work for me at all…
"An error has occurred in this application please exit the application and start gain." :’-(
Random, I was just thinking about the "strict mode" idea, and it occurred to me that I forgot about the scenario where you link a DLL which isn’t exactly conformant to the same level.
PrintDlg() flickers desktop. Why?
PingBack from http://methylblue.com/blog/?p=13
That pingback on Linux is worth reading!
It just won’t die.