Date: | December 22, 2006 / year-entry #423 |
Tags: | history |
Orig Link: | https://blogs.msdn.microsoft.com/oldnewthing/20061222-00/?p=28623 |
Comments: | 11 |
Summary: | Last time we looked at the format of 32-bit version resources, but I ended with the remark that what you saw purported to be the resources of shell32.dll but actually weren't. What's going on here? The resources I presented last time were what the resources of shell32.dll should have been, but in fact they aren't.... |
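Last time we looked at the format of 32-bit version resources,
but I ended with the remark that what you saw purported to be
the resources of shell32.dll but actually weren't.
What's going on here?
The resources I presented last time were what the resources
of shell32.dll should have been, but in fact they aren't.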
A common mistake in generating 32-bit resources is to mistreat
the cbData field of a string node as a count of characters
rather than a count of bytes.
What I presented last time was this:

0098 4C 00 // cbNode (node ends at 0x0098 + 0x004C = 0x00E4)
009A 2C 00 // cbData
009C 01 00 // wType = 1 (string data)
009E 43 00 6F 00 6D 00 70 00 61 00 6E 00 79 00 4E 00 61 00 6D 00 65 00 00 00 // L"CompanyName" + null terminator
00B6 00 00 // padding to restore alignment
00B8 4D 00 69 00 63 00 72 00 6F 00 73 00 6F 00 66 00 74 00 20 00 43 00 6F 00 72 00 70 00 6F 00 72 00 61 00 74 00 69 00 6F 00 6E 00 00 00 // L"Microsoft Corporation" + null terminator
00E4 // no padding needed

In real life, the data take the following form:

0098 4C 00 // cbNode (node ends at 0x0098 + 0x004C = 0x00E4)
009A 16 00 // cchData (!)
009C 01 00 // wType = 1 (string data)
...
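The two forms differ in a single 16-bit field, which is easy to see by building the node both ways. Here is a minimal Python sketch. The helper names `build_string_node` and `parse_header` are made up for illustration, and the layout is assumed to be exactly what the dump shows: three 16-bit header words, a null-terminated UTF-16 key, padding to a 4-byte boundary, then the null-terminated value.

```python
import struct

HEADER_SIZE = 6  # cbNode, cbData, wType: three 16-bit words


def build_string_node(key, value, count_in_chars):
    """Build a string node laid out like the CompanyName dump.

    count_in_chars=False writes a byte count (the expected cbData).
    count_in_chars=True writes a character count (the mistaken form).
    Assumes the node itself starts on a 4-byte boundary, as 0x0098 does.
    """
    key_bytes = (key + "\0").encode("utf-16-le")
    value_bytes = (value + "\0").encode("utf-16-le")
    # Pad after the key so that the value starts on a 4-byte boundary.
    padding = b"\0" * (-(HEADER_SIZE + len(key_bytes)) % 4)
    data_len = len(value_bytes) // 2 if count_in_chars else len(value_bytes)
    cb_node = HEADER_SIZE + len(key_bytes) + len(padding) + len(value_bytes)
    header = struct.pack("<HHH", cb_node, data_len, 1)  # wType = 1 (string)
    return header + key_bytes + padding + value_bytes


def parse_header(node):
    """Return (cbNode, data length field, wType) from the start of a node."""
    return struct.unpack_from("<HHH", node, 0)


good = build_string_node("CompanyName", "Microsoft Corporation", False)
bad = build_string_node("CompanyName", "Microsoft Corporation", True)

print([hex(field) for field in parse_header(good)])  # ['0x4c', '0x2c', '0x1']
print([hex(field) for field in parse_header(bad)])   # ['0x4c', '0x16', '0x1']
```

Everything after the header is byte-for-byte identical in the two nodes. Only the second word changes, from 0x2C (44 bytes) to 0x16 (22 characters, counting the null terminator).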
These malformed version resources manage to get away without
crashing too horribly because the standard format of version resources
uses string data only in leaf nodes.
Therefore, the incorrect
Until somebody tries to read, say,
They're just lucky that nobody actually asks for that.
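But wait, there's more.
Somebody who calls the VerQueryValueA function expects to have
the version string returned as ANSI, so VerQueryValueA needs to
know how many characters to convert from Unicode to ANSI.
If VerQueryValue trusted the erroneous cbData value, then ANSI
callers would get only half the data they were expecting.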
As a result of this mess, the |
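The "only half the data" effect for ANSI callers can be sketched in a few lines of Python. This is an illustration of the arithmetic, not a description of how VerQueryValueA is implemented. The function name `convert_to_ansi` is hypothetical, and code page 1252 is assumed as a stand-in for "ANSI".

```python
def convert_to_ansi(value_utf16, stored_count, treat_as_bytes, codepage="cp1252"):
    """Convert the leading part of a UTF-16 value to an ANSI byte string.

    stored_count is whatever the resource wrote in the data length field.
    treat_as_bytes says how the reader chooses to interpret that number.
    """
    byte_len = stored_count if treat_as_bytes else stored_count * 2
    return value_utf16[:byte_len].decode("utf-16-le").encode(codepage)


value = "Microsoft Corporation\0".encode("utf-16-le")  # 44 bytes, 22 characters
cch_data = len(value) // 2  # 22: what the malformed resource stores

# Trusting the field as a byte count converts only the first 11 characters.
print(convert_to_ansi(value, cch_data, treat_as_bytes=True))   # b'Microsoft C'

# Reading it as a character count recovers the whole string.
print(convert_to_ansi(value, cch_data, treat_as_bytes=False))  # b'Microsoft Corporation\x00'
```

A reader that takes the field at face value hands the caller "Microsoft C" and nothing more.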
Comments (11)
Are you saying that rc.exe has had this bug for 15 years?
Now that’s bad-memory lane, I tripped over this problem several times. In the Win98 era before VerQueryValue was fixed, there used to be a KB article about the problem. I never knew what the core problem was, though. At the time, there must not have been a lot of Unicode resources on files.
No resource compiler today could get away with writing a byte count because VerQueryValueW returns this count directly in its puLen parameter. puLen is — you guessed it — documented as *character* count, and has been for at least a decade.
If you try to call GetFileVersionInfo for a UPX-compressed executable, it will cause a crash in krnl386.exe under Windows 98 (the bug is corrected in Windows XP). So, if you want your application to work under all versions of Windows, you may want to parse the resources yourself without relying on (those buggy) Win32 API functions.
Sorry you’re right, it’s the program that has to do the copying. VerQueryValueW tells the program how many characters to copy.
Does VerQueryValueW figure out the correct number of characters even when cbData isn’t a byte count? (OK, I should experiment instead of asking. So far I’ve only needed this on Windows CE where it works well enough. VerQueryValueW reports the correct number of characters there (after the .rc file has been hand edited). I didn’t look at the cbData field in the binary.)
> But wait, there’s more. Somebody who calls the
> VerQueryValueA function expects to have the
> version string returned as ANSI, so
> VerQueryValueA needs to know how many
> characters to convert from Unicode to ANSI.
> If VerQueryValue trusted the erroneous cbData
> value, then ANSI callers would get only half
> the data they were expecting.
I think there’s more.
(1) Somebody who calls the VerQueryValueW function expects to have the version string returned as Unicode, so VerQueryValueW needs to know how many characters to copy. If VerQueryValue trusted the erroneous cbData value, then Unicode callers would get only half the data they were expecting.[*]
(2) Somebody who calls the VerQueryValueA function expects to have the version string returned as ANSI, so VerQueryValueA needs to know how many characters to convert from Unicode to ANSI. If VerQueryValue trusted the erroneous cbData value, then ANSI callers would get some random fraction of the data they were expecting. When a Unicode character converts to a two-byte ANSI character, the caller might get both bytes. Though this is just hypothetical because we can’t really test it — VerQueryValue knows not to trust cbData so a test would only find out what VerQueryValue actually does.
[* If the data include surrogate pairs then the fraction might be random.]
Why is the length embedded at all? Redundant information.
Are arbitrary binary data allowed in string fields? How can a null
terminated string have embedded nulls? A terminating NIL char could
have been used to terminate the string instead of a byte count integer.
field. You can see embedded NULs in the 16-bit version resources a few
days ago. (More evidence that people don’t actually read my entries.)
-Raymond]
> You can see embedded NULs in the 16-bit
> version resources a few days ago.
The ones I noticed were intended to be terminators. I didn’t notice any that weren’t intended to be terminators.
Since ordinary string resources don’t automatically get NUL terminators appended, programmers have to code the terminators themselves[*], and some programmers didn’t notice that version string resources are different. Some of those programmers produced some versions of Visual C++, so a lot of executables have redundant NUL terminators. I never complained about this very minor bug, a very slight waste of memory with no other consequences. Had I been involved, I would have given priority to fixing more serious bugs than to this one. Though I don’t have any complaint about its having been fixed either.
Anyway, do you know of cases where NULs were intended to be embedded rather than intended to be terminators?
Putting binary data in fields that are labelled as non-binary sometimes causes bugs. For example BSTRs sometimes get converted to ANSI without the programmer noticing because the programmer’s code page is different from the customer’s code page.
Yes and no. I coded stuff using string syntax with embedded nulls in RCDATA resources, i.e. binary resources. I did not do so in STRING resources.
In a setting having nothing to do with resources, my discovery that sometimes BSTRs get converted to ANSI did come the hard way, but I luckily discovered it before the product shipped and I’ve never repeated that mistake. I learned belatedly that the MSDN section that I had read included an invisible restriction (visible in some other pages that I belatedly discovered) so it didn’t apply to the particular code I had written.