Date: | August 24, 2004 / year-entry #317 |
Tags: | code |
Orig Link: | https://blogs.msdn.microsoft.com/oldnewthing/20040824-00/?p=38063 |
Comments: | 25 |
Summary: | Even though a value is stored in the registry as REG_SZ, this doesn't mean that the value actually ends with a proper null terminator. At the bottom, the registry is just a hierarchically-organized name/value database. And you can lie and get away with it. Lots of people lie about their registry data. You'll find lots... |
Even though a value is stored in the registry as REG_SZ, this doesn't mean that the value actually ends with a proper null terminator. At the bottom, the registry is just a hierarchically-organized name/value database. And you can lie and get away with it. Lots of people lie about their registry data. You'll find lots of things that should be REG_DWORD stored as a four-byte REG_BINARY. (This is in part a holdover from Windows 95's registry, which didn't support REG_DWORD.) One of the most insidious lies is to lie about the length of a string you're writing to the registry. Consider the following program: #include <windows.h> #include <stdio.h> int __cdecl main(int argc, char **argv) { RegSetValueExW(HKEY_CURRENT_USER, L"Scratch", 0, REG_SZ, (BYTE*)L"12", 2); DWORD cb = 0; RegQueryValueExW(HKEY_CURRENT_USER, L"Scratch", NULL, NULL, NULL, &cb); printf("Size is %d bytes\n", cb); WCHAR sz[2]; sz[0] = 0xFFFF; sz[1] = 0xFFFF; cb = sizeof(sz[0]); DWORD dwRc = RegQueryValueExW(HKEY_CURRENT_USER, L"Scratch", NULL, NULL, (BYTE*)sz, &cb); printf("RegQueryValueExW requesting %d bytes => %d\n", sizeof(sz), dwRc); printf("%d bytes required\n", cb); if (dwRc == ERROR_SUCCESS) { printf("sz[0] = %d\n", sz[0]); printf("sz[1] = %d\n", sz[1]); } RegDeleteValueW(HKEY_CURRENT_USER, L"Scratch"); return 0; } If you run this program, you get this: Size is 2 bytes RegQueryValueExW requesting 4 bytes => 0 2 bytes required sz[0] = 49 sz[1] = 65535 What happened? First, observe that the call to RegSetValueExW lies about the length of the string, claiming that it is two bytes long when in fact it is six! (Two WCHARs plus a terminator.) The registry dutifully records this lie and reports it back to subsequent callers. The first call to RegQueryValueExW asks how big the string is, and the registry reports the value 2, since that's the value it was given when the value was originally stored. To show that there really is no null terminator, we ask the registry to read those two bytes of data into our buffer, pre-filling the buffer with sentinel values so we can see what got updated and what didn't. Lo and behold, the values were read from the registry and only two bytes were read. sz[0] contains the character '1', and sz[1] remains uninitialized. This has security implications.
If your program assumes that strings in the registry are always
null-terminated, then you can be tricked into a buffer overflow
if you happen across a non-null-terminated string.
(For example, if you use (Note: I'm not going to get into whether it should have been possible to get into this state in the first place. I didn't design the registry. Arguing about the past isn't going to change the present, and the present is that this is how it works so you'd better be ready for it.) Exercise: Change the last parameter of RegSetValueExW to 3 and run the program again. Explain the results and discuss its consequences. |
Comments (25)
Comments are closed. |
Raymond: You’d better put some newlines in that program, so as to not put it all on one line.
I remember a registry, or more precistely a registry editor, from back in the Windows 3.1 days. What was it used for?
Windows should have used counted strings throughout its API. Internally (in kernel mode), it does. It should do externally too.
Win 95 doesn’t support REG_DWORD? I’ve used it in some of my programs without any problems…
"I remember a registry, or more precistely a registry editor, from back in the Windows 3.1 days. What was it used for? "
IIRC, the Windows 3.1 registry editor was used almost exclusively to store OLE 2.0 bindings and configuration information.
DrPizza: Please read the paragraph where I said that I’m not going to discuss whether the current design is good or not.
Ever since Win95 introduced the registry and the first time my registry got corrupted, I’ve hated it…
We moved from INI files [similar to Mac app settings going into the System folder or something like that] to a monolithic storage structure. NOTE That WinNT introduced ACLs for the keys, which in a sense made it cool, but still monolithic. [And I’m always curious why IIS used a METABASE when the registry was available… speed? probably… although the METABASE got out of sync too damn easy too]
So now that .NET apps are pushing for more XCOPY deployment, and IIS6 able to I wonder if the registry begins to take a back seat to Config files in Longhorn, relegating the Registry to just OS storage and not for the Apps?
Somewhat OT: Did you know that the hive guy is back?
Win9x registries didn’t get corrupted on systems where the hardware worked. Never happened to me in four years before I moved to NT 4.0 and has never happened on any of the NT-based systems either.
NT-based systems protect the registry data by, unsurprisingly, logging all changes before they’re made – it’s transactional, on both FAT32 and NTFS volumes. The log files can be found in the same directory as the hives themselves, in %SystemRoot%System32config, except your user profile’s registry settings which live in %USERPROFILE%NtUser.dat (for HKEY_CURRENT_USER) and %USERPROFILE%Local SettingsApplication DataMicrosoftWindowsUsrClass.dat (the user portion of HKEY_CLASSES_ROOT).
Frankly I feel a lot safer with the registry than arbitrary configuration files (not transactional), and it’s a lot more efficient. Unfortunately it’s harder to add new configuration data – you have to make API calls rather than dropping a file.
Nice article.
I understand you don’t want to discuss the history of this behavior or "whether the current design is good or not", but isn’t this something that could be fixed once and for all at the OS level in a future service pack / OS?
For example, couldn’t RegQueryValueEx be made to always append a null terminator on SZ-type data, and in cases where there isn’t enough room in the supplied buffer for one (as in your example), return ERROR_MORE_DATA?
That would break programs that intentionally store non-null-terminated strings in the registry, since they would write 4 non-null-terminated characters then try to read it back into a 4-character buffer and instead of succeeding they get ERROR_MORE_DATA back.
Programs that do this are far more common than you think. (You may have written one such inadvertently! It happens if you pass REG_SZ instead of REG_BINARY by mistake.)
// this copy assumes the presence of
// a null terminator; otherwise you have
// a buffer overflow.
TCHAR szOtherBuffer[256];
lstrcpy(szOtherBuffer, szBuffer);
note also dwBufferLength should be the size in bytes, not the size in characters.
The NT registry doesn’t really enforce the value sizes for any of the types AFAIK; for instance, you can have REG_DWORDs that aren’t actually 4 bytes.
So this problem isn’t necessarily limited to strings, just strings are more of an issue since Win32 apps assume them null terminated mostly.
I thought dword vs 4 byte binary was a inf limitation rather than a registry limitation.
Even if your program assumes that all registry strings are null terminated how can this cause a buffer overflow? RegQueryValueEx 6th parameter is a DWORD* to the size of your data. So if I’m reading in a string I have:
TCHAR szBuffer[256];
DWORD dwBufferLength = sizeof(szBuffer) / sizeof(TCHAR);
RegQueryValueEx(…, (BYTE*)szBuffer, &dwBufferLength);
This usually isn’t an issue when both the writer and reader of the value are the same app. However, it becomes a problem when the registry becomes an interface between app and OS. Such as, I’ve seen a lot of ‘hacks’ in the print spooler that take care of the strings to which a print driver forgot to append a null terminator. It’s just common.
Apart from the size issue, I’ve also seen a number of apps changing the data type of public registry values at will. Sometimes a REG_EXPAND_SZ is changed to REG_SZ, and something strange may happen (or maybe not).
I’m told, and I can’t remember where I saw this, that there are microsoft apps which embed a null into a registry value in order to hide some information. Regedit treats strings as being null terminated and so only shows the first part, but the entire string is there and hidden.
> Lo and behold, the values were read from the
> registry and only two bytes were read. sz[0]
> contains the character ‘1’, and sz[1]
> remains uninitialized.
Seriously, this is a relief. Unless I’m misreading your APL program[*], the second call to RegQueryValueExW specified that the buffer length is 2 bytes. If RegQueryValueExW changed the valye of sz[1] then RegQueryValueExW would be causing buffer overflows.
Your printf printed sizeof(sz), but your assignment only set cb = sizeof(sz[0]).
[* How many APL programmers does it take to change a light bulb? 35, and they all have to be on one line.]
"Such as, I’ve seen a lot of ‘hacks’ in the print spooler that take care of the strings to which a print driver forgot to append a null terminator."
Why doesn’t the print spooler read the size of the string first, and then the number of bytes up to the first null terminator? Looks safe and not like a hack to me.
This does lead to me to ask a sorta-OT question: what do you think of the design of the Windows API in general Raymond? I know you wrote some of it, but if you could design the API again with no backwards compatibility concerns, what would you do different?
Perhaps a less vague way of phrasing the same thing: If tomorrow you were hired by Red Hat to work on the desktop Linux APIs, what advice would you have w.r.t API design and implementation?
"DrPizza: Please read the paragraph where I said that I’m not going to discuss whether the current design is good or not. "
But I’m not talking about the design of the registry. I’m talking about the design of the rest of Win32, for it is the rest of Win32 whose design is poor. The kernel API doesn’t make that mistake, and on occasion that leaks out into the Win32 world. This is merely one of those occasions.
Why don’t you punch out a blog entry on what you think is wrong with Win32. It would make an interesting read.
"Why doesn’t the print spooler read the size of the string first, and then the number of bytes up to the first null terminator? Looks safe and not like a hack to me."
It just checks whether or not there is a null terminator at the end of the string, and if not adds one. This is extra code written for bad developers. Allowing such non-null-terminated strings leads to crashes in other places that do not have the check code. I have chased down some crashes due to this, and it’s usually difficult to chase because the crashes often happen after a strcpy that corrupts memory.
IMO, it should just not accept such strings at the beginning to force the driver developers fix their code before shipping the driver.
DrPizza: that’s unfair (and I’m a OS X bigot). Some parts are well designed, some parts are so-so designed, some parts are poorly designed. WinNT is big, lots of cooks in the kitchen, lots of fingers in the code.