Date: | September 7, 2007 / year-entry #334 |
Tags: | other |
Orig Link: | https://blogs.msdn.microsoft.com/oldnewthing/20070907-00/?p=25223 |
Comments: | 28 |
Some time ago, the application compatibility folks found a program that was corrupting the heap, and they applied a fix that worked around the specific type of corruption that the program performed. And then a bug came in on that same program. It was a heap corruption failure during the program's processing of global destructors. The authors of that program were so clever, they found a way around the compatibility fix and managed to corrupt the heap anyway!

Update: To clarify, there was no updated version of the program. (That's why I wrote "that same program" and not "an updated version of that program".) There was a bug in XYZ Version 2.1. We added a compatibility fix for it. And then later, another bug came in, also for XYZ Version 2.1, showing that the compatibility fix wasn't good enough. We tried to fix their heap corruption, but they were too clever and corrupted it in another way in a different part of the program.
Comments (28)
Impressive, but was this a new version of said program that ended up even worse? If not, it would appear that the initial workaround was simply insufficient to work around the stupidity of the original developers.
So basically, they had a bug, you created a shim to "fix" it. They updated their program (presumably the presence of the shim masked the bug so they saw no need to fix it) in such a way that the shim’s heuristics failed and the bug re-emerged.
Yes, the bug was their fault, but maybe if the shim had not been put in place they could have spotted the bug and released a fix.
Why was the shim put there anyway? Your standard excuse of "What if they aren’t around to fix the bug?" doesn’t seem to apply here, seeing as they were around enough to produce a new version.
It seems to me that this is a case where trying to fix somebody else’s bugs just causes further problems.
Fred, one possibility would be that the bug existed but wasn’t noticeable until changes were made in an updated version of Windows (it’s actually pretty probable because Raymond wrote "Application Compatibility"), so the users of the program would have said "Windows update broke my program!". Which would not be good.
Since obviously the company’s still doing active development on the product, I wonder how they (would|have) respond[ed] to the notification that their program is very buggy.
Fred: that’s why developers should (IMO) be using Application Verifier — it’ll tell you when you’re doing stuff that shims have been written for in the past. It’ll also tell you when you’re doing lots of other problematic things.
http://www.microsoft.com/technet/prodtechnol/windows/appcompatibility/appverifier.mspx
Unfortunately it’s not part of Visual Studio, so most people probably don’t even know it exists…
Heap corruption is a valid attack vector. Maybe not quite as exploitable as a buffer overflow, but it is exploitable. (For a while, I didn’t think so, but I was wrong. I was given a few high-priority bugs found by an external security researcher in a product I worked on that involved heap corruption.) That might be the right hammer to hit the party at fault with.
Sometimes I wish I worked on an important/popular application. I would go out of my way to do bad things (such as corrupting the heap) just to get some kind of sick pleasure out of knowing somebody at Microsoft was working hard on fixing my "mistakes".
… so… um… doesn’t this mean that the compatibility fix was just inadequately tested?
"… so… um… doesn’t this mean that the compatibility fix was just inadequately tested?"
No, it means the program was a piece of junk.
Maurits,
You fail programming 101. Just as you can’t prove that software of any significant size has no bugs, you can’t prove that a shim for software of any significant size will fix the problem.
I tried to use Application Verifier, but it was too hard to install. If you (as an official representative of microsoft) want people to use it, please make it easier to install & use.
Yes. Raymond, could you personally make app verifier easier to use? Also, maybe you could buy me some coffee. And wash my car. :)
Thanks!
@coming soon:
How hard is it? I’ve used it a lot recently. It’s nice and GUI-enabled, and the defaults are pretty sensible. Just select your EXE(s), turn on all the checks and let ‘er rip. Is there something specific that’s troubling you?
—
"… so… um… doesn’t this mean that the compatibility fix was just inadequately tested?"
No, it means the program was a piece of junk.
—
Seems to mean both to me. True, the program was obviously a piece of junk, but the compatibility fix also can’t have been adequately tested, since it didn’t actually fix (all) the heap corruption that occurred.
While there is a chance that MS didn’t test the shim well enough, I think it is far more likely that it is just a simple question of scalable testing.
I would be quite shocked if MS has the resources to do extensive testing on every application that requires shims.
Let’s say that the second issue would only happen in 1% of the usage patterns for the given application. Most applications of any size have millions of different usage patterns. (If you think I am off my rocker, take a look at all the bug reports that detail complex reproduction steps.) If MS created 100 test cases, there is still roughly a 37% chance that the 1% of the usage patterns that show the failure never happen during testing. (Assuming purely random selection.)
Now the OS goes into beta testing. You now have real people who use the application for their work. Even with just an order-of-magnitude increase in the number of usage patterns (i.e. 1000), the chance of seeing the bug increases dramatically to about 99.996%. Another order-of-magnitude increase to 10000 drops the chance of NOT seeing the problem to about 2.25e-44.
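For anyone who wants to check the arithmetic: if each test independently has a 1% chance of hitting the bug, the chance that n tests all miss it is 0.99 to the power n. Here is a minimal Python sketch under that assumption. The function name `miss_probability` and the 1% failure rate are illustrative only, not anything Microsoft actually uses.

```python
def miss_probability(failure_rate: float, num_tests: int) -> float:
    """Chance that none of num_tests independent random tests hits the bug.

    Assumes each test hits the buggy usage pattern with probability
    failure_rate, independently of every other test.
    """
    return (1.0 - failure_rate) ** num_tests


if __name__ == "__main__":
    # 100 in-house test cases, then 1000 and 10000 beta-tester usage patterns.
    for n in (100, 1000, 10000):
        miss = miss_probability(0.01, n)
        print(f"{n:>6} runs: miss = {miss:.3e}, hit = {1.0 - miss:.4%}")
```

For 100 runs this gives a miss probability of roughly 0.366, and the number falls off exponentially from there. Real usage patterns are not independent or uniformly random, so treat it as a back-of-the-envelope estimate.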
Surprised the "official representative of microsoft" comment was allowed to go past uncommented on. Though I think it was directed at BrianK, not Raymond.
It is such a shame that this world is not one in which we can name and shame without repercussions – I’d love for MS to put up a big list of "guilty" programs. It might well shame people into coding stuff right. If they included guilty MS programs in the list, nobody could complain they were biased… but I still think it’s an idea which would never fly.
Dewi Morgan: What evidence would microsoft use for backing up the "bad program" claims? "We reverse-engineered your program and found these bugs and these code paths that use implementation details" ?
In the long run, wouldn’t it be better just to make the application crash? This would force the vendor to fix the bug.
It makes one wonder how many of these "fixes" there are in Windows.
How much time is the OS spending checking for all these exceptions and trying to figure out whether a ‘fix’ needs to be applied?
And how the hell do you maintain code with so many exceptions?
"In the long run, wouldn’t it be better just to make the application crash? This would force the vendor to fix the bug."
Sure, but Microsoft isn’t trying to create the best OS out there, they are trying to create the most profitable OS (they are, after all, a for-profit company), and if application XY doesn’t work, users of application XY aren’t going to upgrade to the latest Windows version.
European: the trouble with that is that when an application works fine on Windows 2000 but crashes on Windows XP, users will conclude it’s Windows’s fault rather than the application’s: Windows changed something in the new version, that change is causing problems ==> the new version of Windows is causing problems.
It’s certainly understandable – and if upgrading Windows escalates a bug from ‘doesn’t actually cause any problems’ to ‘crashes the app every time’, without any change at all on the application’s part, I’d say it is at least partly right: Windows has indeed made the situation much, much worse for users. There’s a reason MS puts so much effort into making sure this doesn’t happen!
One caveat about this: Application Verifier makes changes to your system config that are not reversible without a lot of effort and that can render the system unusable for further debugging/development work. Don’t install it on your main work machine; use a VM or a scratch machine instead.
(This issue has been covered here in the past, although I wish the A-V download page and FAQ warned about it. Reinstalling the OS in order to recover was a right pain.)
@James & @aargh!
Point taken. But, as someone pointed out, these "shims" must have an impact on the overall performance of the system. This price in performance is being paid by all users of Windows, you and me too, regardless of whether we actually profit from the "shim" or not.
Wow, I’m an official representative of Microsoft, just because I’ve followed this type of post in the past and saw Raymond’s link to App Verifier? (And then copied it here.) Yikes. Here I figured I was just some random guy who happened to actually read the past discussions.
Shows what I know…
(To be fair, perhaps "2.2 coming soon" was thrown off by the fact that my name has a link behind it, since most of the other MS bloggers also have links behind their names. For reference, all that the link means is that the comment was not posted anonymously: I’ve created an actual account here, and logged into it before posting. Most of the other MS bloggers also have links, but that’s because they also have their own account.)
I’m surprised by the assumption that Microsoft can test every code path in all software.
I’m responsible for a rather popular application, or rather utility – it supports a HW device. While I assume you can have an iPod in Redmond to test AppCompat for iTunes, I cannot expect Microsoft to buy a full range of our devices. Yet there are bits in our code that are only accessible if you have a certain device with certain firmware versions.
Maybe that was not a bug, but a feature. So your "fix" disabled their feature, so it is normal that they tried to go around that :-D
Triangle:
What evidence would microsoft use for backing up the "bad program" claims? "We reverse-engineered your program and found these bugs and these code paths that use implementation details" ?
Basically. You could phrase it differently – we determined that program X does these bad things on XP and blows up in this way. We have a shim in place, etc, etc.
So the fix didn’t really fix it. So both sides are to blame: the original developers who introduced a bug and the compat folks who didn’t really find a fix.