Sometimes it feels like the effort isn’t even appreciated

Date: September 7, 2007 / year-entry #334
Tags: other
Orig Link: https://blogs.msdn.microsoft.com/oldnewthing/20070907-00/?p=25223
Comments: 28

Some time ago, the application compatibility folks found a program that was corrupting the heap, and they applied a fix that worked around the specific type of corruption that the program performed.

And then a bug came in on that same program. It was a heap corruption failure during the program's processing of global destructors.

The authors of that program were so clever, they found a way around the compatibility fix and managed to corrupt the heap anyway!

Update: To clarify, there was no updated version of the program. (That's why I wrote "that same program" and not "an updated version of that program".) There was a bug in XYZ Version 2.1. We added a compatibility fix for it. And then later, another bug came in, also for XYZ Version 2.1 showing that the compatibility fix wasn't good enough. We tried to fix their heap corruption, but they were too clever and corrupted it in another way in a different part of the program.


Comments (28)
  1. Steven says:

    Impressive, but was this a new version of said program that ended up even worse? If not, it would appear that the initial workaround was simply insufficient to work around the stupidity of the original developers.

  2. Fred says:

    So basically, they had a bug, you created a shim to "fix" it. They updated their program (presumably the presence of the shim masked the bug so they saw no need to fix it) in such a way that the shim’s heuristics failed and the bug re-emerged.

    Yes, the bug was their fault, but maybe if the shim had not been put in place they could have spotted the bug and released a fix.

    Why was the shim put there anyway? Your standard excuse of "What if they aren’t around to fix the bug?" doesn’t seem to apply here, seeing as they were around enough to produce a new version.

    It seems to me that this is a case where trying to fix somebody else’s bugs just causes further problems.

  3. pavolmarko says:

    Fred, one possibility is that the bug existed but wasn’t noticeable until changes were made in an updated version of Windows (that’s actually quite likely, since Raymond wrote "Application Compatibility"), so the users of the program would have said "Windows update broke my program!". Which would not be good.

  4. Cody says:

    Since obviously the company’s still doing active development on the product, I wonder how they (would|have) respond[ed] to the notification that their program is very buggy.

  5. BryanK says:

    Fred: that’s why developers should (IMO) be using Application Verifier — it’ll tell you when you’re doing stuff that shims have been written for in the past.  It’ll also tell you when you’re doing lots of other problematic things.

    http://www.microsoft.com/technet/prodtechnol/windows/appcompatibility/appverifier.mspx

    Unfortunately it’s not part of Visual Studio, so most people probably don’t even know it exists…

  6. Nathan says:

    Heap corruption is a valid attack vector. Maybe not quite as exploitable as a buffer overflow, but it is exploitable. (For a while, I didn’t think so, but I was wrong. I was given a few high-priority bugs found by an external security researcher in a product I worked on that involved heap corruption.) That might be the right hammer to hit the party at fault with.

  7. John says:

    Sometimes I wish I worked on an important/popular application.  I would go out of my way to do bad things (such as corrupting the heap) just to get some kind of sick pleasure out of knowing somebody at Microsoft was working hard on fixing my "mistakes".

  8. Maurits says:

    … so… um… doesn’t this mean that the compatibility fix was just inadequately tested?

  9. Triangle says:

    "… so… um… doesn’t this mean that the compatibility fix was just inadequately tested?"

    No, it means the program was a piece of junk.

  10. Tim Smith says:

    Maurits,

    You fail programming 101.  Just as you can’t prove that software of any significant size has no bugs, you can’t prove that a shim for software of any significant size will fix the problem.

  11. 2.2 coming soon says:

    I tried to use Application Verifier, but it was too hard to install. If you (as an official representative of Microsoft) want people to use it, please make it easier to install & use.

  12. Shredded Peas says:

    Yes. Raymond, could you personally make app verifier easier to use?  Also, maybe you could buy me some coffee.  And wash my car. :)

    Thanks!

  13. nksingh says:

    @coming soon:

    How hard is it?  I’ve used it a lot recently.  It’s nice and GUI-enabled, and the defaults are pretty sensible.  Just select your EXE(s), turn on all the checks and let ‘er rip.  Is there something specific that’s troubling you?

  14. Jalf says:

    "… so… um… doesn’t this mean that the compatibility fix was just inadequately tested?"

    No, it means the program was a piece of junk.

    Seems to mean both to me. True, the program was obviously a piece of junk, but the compatibility fix also can’t have been adequately tested, since it didn’t actually fix (all) the heap corruption that occurred.

  15. Tim Smith says:

    While there is a chance that MS didn’t test the shim well enough, I think it is far more likely that it is just a simple question of scalable testing.

    I would be quite shocked if MS has the resources to do extensive testing on every application that requires shims.

    Let's say that the second issue would only happen in 1% of the usage patterns for the given application.  Most applications of any size have millions of different usage patterns.  (If you think I am off my rocker, take a look at all the bug reports that detail complex reproduction steps.)  If MS created 100 test cases, there is still about a 37% chance (0.99^100 ≈ 0.366) that the 1% of the usage patterns that show the failure never happen during testing.  (Assuming pure random)

    Now the OS goes into beta testing.  You now have real people who use the application for their work.  Even with just an order-of-magnitude increase in the number of usage patterns exercised (i.e. 1000), the chance of seeing the bug increases dramatically to about 99.996%.  Another order of magnitude, to 10000, drops the chance of NOT seeing the problem to about 2.25e-44.
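
The arithmetic behind this estimate can be checked with a short sketch. It assumes, as the comment does, that every run independently has a 1% chance of exercising the failing usage pattern, so the chance of never seeing the failure in n runs is 0.99^n. The function name `miss_probability` is made up for illustration.

```python
def miss_probability(hit_rate: float, trials: int) -> float:
    """Chance that none of `trials` independent runs hits the failing pattern."""
    return (1.0 - hit_rate) ** trials


# Compare a 100-case test pass with two successively larger beta populations.
for trials in (100, 1_000, 10_000):
    miss = miss_probability(0.01, trials)
    print(f"{trials:>6} runs: P(never seen) = {miss:.3e}, P(seen) = {1 - miss:.5%}")
```

The independence assumption is generous to the testers: real test cases tend to cluster around common usage patterns, so rare code paths are likely to be missed more often than this model suggests.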

  16. Dewi Morgan says:

    Surprised the "official representative of microsoft" comment was allowed to go past uncommented on. Though I think it was directed at BryanK, not Raymond.

    It is such a shame that this world is not one in which we can name and shame without repercussions – I’d love for MS to put up a big list of "guilty" programs. It might well shame people into coding stuff right. If they included guilty MS programs in the list, nobody could complain they were biased… but I still think it’s an idea which would never fly.

  17. Triangle says:

    Dewi Morgan: What evidence would Microsoft use for backing up the "bad program" claims? "We reverse-engineered your program and found these bugs and these code paths that use implementation details"?

  18. European says:

    In the long run, wouldn’t it be better just to make the application crash? This would force the vendor to fix the bug.

    [If you don’t have a short run, you don’t have a long run either. -Raymond]

  19. Aaargh! says:

    It makes one wonder how many of these "fixes" there are in Windows.

    How much of the time is the OS spending checking for all these exceptions and trying to figure out if a ‘fix’ needs to be applied?

    And how the hell do you maintain code with so many exceptions?

  20. Aaargh! says:

    "In the long run, wouldn’t it be better just to make the application crash? This would force the vendor to fix the bug."

    Sure, but Microsoft isn’t trying to create the best OS out there, they are trying to create the most profitable OS (they are, after all, a for-profit company), and if application XY doesn’t work, users of application XY aren’t going to upgrade to the latest Windows version.

  21. James says:

    European: the trouble with that is that when an application works fine on Windows 2000 but crashes on Windows XP, users will conclude it’s Windows’s fault rather than the application’s: Windows changed something in the new version, that change is causing problems ==> the new version of Windows is causing problems.

    It’s certainly understandable – and if upgrading Windows escalates a bug from ‘doesn’t actually cause any problems’ to ‘crashes the app every time’, without any change at all on the application’s part, I’d say it is at least partly right: Windows has indeed made the situation much, much worse for users. There’s a reason MS puts so much effort into making sure this doesn’t happen!

  22. Dave says:

    that’s why developers should (IMO) be using Application Verifier — it’ll tell you when you’re doing stuff that shims have been written for in the past.  It’ll also tell you when you’re doing lots of other problematic things.

    One caveat about this: Application Verifier makes changes to your system config that are non-reversible (without a lot of effort) and that can render the system unusable for further debugging/development work.  Don’t install it on your main work machine; use a VM or a scratch machine instead.

    (This issue has been covered here in the past, although I wish the A-V download page and FAQ warned about it. Reinstalling the OS in order to recover was a right pain.)

  23. European says:

    @James & @aargh!

    Point taken. But, as someone pointed out, these "shims" must have an impact on the overall performance of the system. This price in performance is being paid by all users of Windows, you and me too, regardless of whether we actually profit from the "shim" or not.

  24. BryanK says:

    Wow, I’m an official representative of Microsoft, just because I’ve followed this type of post in the past and saw Raymond’s link to App Verifier?  (And then copied it here.)  Yikes.  Here I figured I was just some random guy who happened to actually read the past discussions.

    Shows what I know…

    (To be fair, perhaps "2.2 coming soon" was thrown off by the fact that my name has a link behind it, since most of the other MS bloggers also have links behind their names.  For reference, all that the link means is that the comment was not posted anonymously: I’ve created an actual account here, and logged into it before posting.  Most of the other MS bloggers also have links, but that’s because they also have their own account.)

  25. Michiel says:

    I’m surprised by the assumption that Microsoft can test every code path in all software.

    I’m responsible for a rather popular application, or rather utility – it supports a HW device. While I assume you can have an iPod in Redmond to test AppCompat for iTunes, I cannot expect Microsoft to buy a full range of our devices. Yet there are bits in our code that are only accessible if you have a certain device with certain firmware versions.

  26. Mihai says:

    Maybe that was not a bug, but a feature. So your "fix" disabled their feature, so it is normal that they tried to go around that :-D

  27. Cooney says:

    Triangle:

    What evidence would Microsoft use for backing up the "bad program" claims? "We reverse-engineered your program and found these bugs and these code paths that use implementation details"?

    Basically. You could phrase it differently – we determined that program X does these bad things on XP and blows up in this way. We have a shim in place, etc, etc.

  28. microbe says:

    So the fix didn’t really fix it. So both sides are to blame: the original developers who introduced a bug and the compat folks who didn’t really find a fix.
