Beware of non-null-terminated registry strings

Date:August 24, 2004 / year-entry #317
Tags:code
Orig Link:https://blogs.msdn.microsoft.com/oldnewthing/20040824-00/?p=38063
Comments:    25
Summary:Even though a value is stored in the registry as REG_SZ, this doesn't mean that the value actually ends with a proper null terminator. At the bottom, the registry is just a hierarchically-organized name/value database. And you can lie and get away with it. Lots of people lie about their registry data. You'll find lots...

Even though a value is stored in the registry as REG_SZ, this doesn't mean that the value actually ends with a proper null terminator. At the bottom, the registry is just a hierarchically-organized name/value database.

And you can lie and get away with it.

Lots of people lie about their registry data. You'll find lots of things that should be REG_DWORD stored as a four-byte REG_BINARY. (This is in part a holdover from Windows 95's registry, which didn't support REG_DWORD.)

One of the most insidious lies is to lie about the length of a string you're writing to the registry. Consider the following program:

#include <windows.h>
#include <stdio.h>

int __cdecl main(int argc, char **argv)
{
    RegSetValueExW(HKEY_CURRENT_USER, L"Scratch",
                   0, REG_SZ, (BYTE*)L"12", 2);

    DWORD cb = 0;
    RegQueryValueExW(HKEY_CURRENT_USER, L"Scratch",
                     NULL, NULL, NULL, &cb);
    printf("Size is %d bytes\n", cb);

    WCHAR sz[2];
    sz[0] = 0xFFFF;
    sz[1] = 0xFFFF;
    cb = sizeof(sz[0]);
    DWORD dwRc = RegQueryValueExW(HKEY_CURRENT_USER, L"Scratch",
                                  NULL, NULL, (BYTE*)sz, &cb);
    printf("RegQueryValueExW requesting %d bytes => %d\n",
           sizeof(sz), dwRc);
    printf("%d bytes required\n", cb);
    if (dwRc == ERROR_SUCCESS) {
        printf("sz[0] = %d\n", sz[0]);
        printf("sz[1] = %d\n", sz[1]);
    }

    RegDeleteValueW(HKEY_CURRENT_USER, L"Scratch");

    return 0;
}

If you run this program, you get this:

Size is 2 bytes
RegQueryValueExW requesting 4 bytes => 0
2 bytes required
sz[0] = 49
sz[1] = 65535

What happened?

First, observe that the call to RegSetValueExW lies about the length of the string, claiming that it is two bytes long when in fact it is six! (Two WCHARs plus a terminator.)

The registry dutifully records this lie and reports it back to subsequent callers.

The first call to RegQueryValueExW asks how big the string is, and the registry reports the value 2, since that's the value it was given when the value was originally stored.

To show that there really is no null terminator, we ask the registry to read those two bytes of data into our buffer, pre-filling the buffer with sentinel values so we can see what got updated and what didn't.

Lo and behold, the values were read from the registry and only two bytes were read. sz[0] contains the character '1', and sz[1] remains uninitialized.

This has security implications.

If your program assumes that strings in the registry are always null-terminated, then you can be tricked into a buffer overflow if you happen across a non-null-terminated string. (For example, if you use strcpy to copy it around.)

(Note: I'm not going to get into whether it should have been possible to get into this state in the first place. I didn't design the registry. Arguing about the past isn't going to change the present, and the present is that this is how it works so you'd better be ready for it.)

Exercise: Change the last parameter of RegSetValueExW to 3 and run the program again. Explain the results and discuss its consequences.


Comments (25)
  1. Raymond: You’d better put some newlines in that program, so as to not put it all on one line.

  2. I remember a registry, or more precistely a registry editor, from back in the Windows 3.1 days. What was it used for?

  3. DrPizza says:

    Windows should have used counted strings throughout its API. Internally (in kernel mode), it does. It should do externally too.

  4. Jonathan says:

    Win 95 doesn’t support REG_DWORD? I’ve used it in some of my programs without any problems…

  5. mschaef says:

    "I remember a registry, or more precistely a registry editor, from back in the Windows 3.1 days. What was it used for? "

    IIRC, the Windows 3.1 registry editor was used almost exclusively to store OLE 2.0 bindings and configuration information.

  6. Raymond Chen says:

    DrPizza: Please read the paragraph where I said that I’m not going to discuss whether the current design is good or not.

  7. Eric Newton says:

    Ever since Win95 introduced the registry and the first time my registry got corrupted, I’ve hated it…

    We moved from INI files [similar to Mac app settings going into the System folder or something like that] to a monolithic storage structure. NOTE That WinNT introduced ACLs for the keys, which in a sense made it cool, but still monolithic. [And I’m always curious why IIS used a METABASE when the registry was available… speed? probably… although the METABASE got out of sync too damn easy too]

    So now that .NET apps are pushing for more XCOPY deployment, and IIS6 able to I wonder if the registry begins to take a back seat to Config files in Longhorn, relegating the Registry to just OS storage and not for the Apps?

  8. Somewhat OT: Did you know that the hive guy is back?

  9. Carlos says:

    Change the last parameter of RegSetValueExW to 3

    You always ask for two bytes (cb = sizeof(sz[0]);), but that’s not enough so RegQueryValueExW will return ERROR_MORE_DATA.

    If you had used a big enough buffer of bytes and you attempted to terminate the string by appending two zero bytes to the buffer, then you’d fail because the zeroes would straddle a wide character boundary. I’m slightly worried that I’ve written code that does this…

  10. Mike Dimmick says:

    Win9x registries didn’t get corrupted on systems where the hardware worked. Never happened to me in four years before I moved to NT 4.0 and has never happened on any of the NT-based systems either.

    NT-based systems protect the registry data by, unsurprisingly, logging all changes before they’re made – it’s transactional, on both FAT32 and NTFS volumes. The log files can be found in the same directory as the hives themselves, in %SystemRoot%System32config, except your user profile’s registry settings which live in %USERPROFILE%NtUser.dat (for HKEY_CURRENT_USER) and %USERPROFILE%Local SettingsApplication DataMicrosoftWindowsUsrClass.dat (the user portion of HKEY_CLASSES_ROOT).

    Frankly I feel a lot safer with the registry than arbitrary configuration files (not transactional), and it’s a lot more efficient. Unfortunately it’s harder to add new configuration data – you have to make API calls rather than dropping a file.

  11. Jordan Russell says:

    Nice article.

    I understand you don’t want to discuss the history of this behavior or "whether the current design is good or not", but isn’t this something that could be fixed once and for all at the OS level in a future service pack / OS?

    For example, couldn’t RegQueryValueEx be made to always append a null terminator on SZ-type data, and in cases where there isn’t enough room in the supplied buffer for one (as in your example), return ERROR_MORE_DATA?

  12. Raymond Chen says:

    That would break programs that intentionally store non-null-terminated strings in the registry, since they would write 4 non-null-terminated characters then try to read it back into a 4-character buffer and instead of succeeding they get ERROR_MORE_DATA back.

    Programs that do this are far more common than you think. (You may have written one such inadvertently! It happens if you pass REG_SZ instead of REG_BINARY by mistake.)

  13. Raymond Chen says:

    // this copy assumes the presence of

    // a null terminator; otherwise you have

    // a buffer overflow.

    TCHAR szOtherBuffer[256];

    lstrcpy(szOtherBuffer, szBuffer);

    note also dwBufferLength should be the size in bytes, not the size in characters.

  14. Skywing says:

    The NT registry doesn’t really enforce the value sizes for any of the types AFAIK; for instance, you can have REG_DWORDs that aren’t actually 4 bytes.

    So this problem isn’t necessarily limited to strings, just strings are more of an issue since Win32 apps assume them null terminated mostly.

  15. David Candy says:

    I thought dword vs 4 byte binary was a inf limitation rather than a registry limitation.

  16. Gary says:

    Even if your program assumes that all registry strings are null terminated how can this cause a buffer overflow? RegQueryValueEx 6th parameter is a DWORD* to the size of your data. So if I’m reading in a string I have:

    TCHAR szBuffer[256];

    DWORD dwBufferLength = sizeof(szBuffer) / sizeof(TCHAR);

    RegQueryValueEx(…, (BYTE*)szBuffer, &dwBufferLength);

  17. timchen says:

    This usually isn’t an issue when both the writer and reader of the value are the same app. However, it becomes a problem when the registry becomes an interface between app and OS. Such as, I’ve seen a lot of ‘hacks’ in the print spooler that take care of the strings to which a print driver forgot to append a null terminator. It’s just common.

    Apart from the size issue, I’ve also seen a number of apps changing the data type of public registry values at will. Sometimes a REG_EXPAND_SZ is changed to REG_SZ, and something strange may happen (or maybe not).

  18. Tom M says:

    I’m told, and I can’t remember where I saw this, that there are microsoft apps which embed a null into a registry value in order to hide some information. Regedit treats strings as being null terminated and so only shows the first part, but the entire string is there and hidden.

  19. Norman Diamond says:

    > Lo and behold, the values were read from the

    > registry and only two bytes were read. sz[0]

    > contains the character ‘1’, and sz[1]

    > remains uninitialized.

    Seriously, this is a relief. Unless I’m misreading your APL program[*], the second call to RegQueryValueExW specified that the buffer length is 2 bytes. If RegQueryValueExW changed the valye of sz[1] then RegQueryValueExW would be causing buffer overflows.

    Your printf printed sizeof(sz), but your assignment only set cb = sizeof(sz[0]).

    [* How many APL programmers does it take to change a light bulb? 35, and they all have to be on one line.]

  20. Frederik Slijkerman says:

    "Such as, I’ve seen a lot of ‘hacks’ in the print spooler that take care of the strings to which a print driver forgot to append a null terminator."

    Why doesn’t the print spooler read the size of the string first, and then the number of bytes up to the first null terminator? Looks safe and not like a hack to me.

  21. Mike says:

    This does lead to me to ask a sorta-OT question: what do you think of the design of the Windows API in general Raymond? I know you wrote some of it, but if you could design the API again with no backwards compatibility concerns, what would you do different?

    Perhaps a less vague way of phrasing the same thing: If tomorrow you were hired by Red Hat to work on the desktop Linux APIs, what advice would you have w.r.t API design and implementation?

  22. DrPizza says:

    "DrPizza: Please read the paragraph where I said that I’m not going to discuss whether the current design is good or not. "

    But I’m not talking about the design of the registry. I’m talking about the design of the rest of Win32, for it is the rest of Win32 whose design is poor. The kernel API doesn’t make that mistake, and on occasion that leaks out into the Win32 world. This is merely one of those occasions.

  23. Manip says:

    Why don’t you punch out a blog entry on what you think is wrong with Win32. It would make an interesting read.

  24. timchen says:

    "Why doesn’t the print spooler read the size of the string first, and then the number of bytes up to the first null terminator? Looks safe and not like a hack to me."

    It just checks whether or not there is a null terminator at the end of the string, and if not adds one. This is extra code written for bad developers. Allowing such non-null-terminated strings leads to crashes in other places that do not have the check code. I have chased down some crashes due to this, and it’s usually difficult to chase because the crashes often happen after a strcpy that corrupts memory.

    IMO, it should just not accept such strings at the beginning to force the driver developers fix their code before shipping the driver.

  25. Chickenman says:

    DrPizza: that’s unfair (and I’m a OS X bigot). Some parts are well designed, some parts are so-so designed, some parts are poorly designed. WinNT is big, lots of cooks in the kitchen, lots of fingers in the code.

Comments are closed.


*DISCLAIMER: I DO NOT OWN THIS CONTENT. If you are the owner and would like it removed, please contact me. The content herein is an archived reproduction of entries from Raymond Chen's "Old New Thing" Blog (most recent link is here). It may have slight formatting modifications for consistency and to improve readability.

WHY DID I DUPLICATE THIS CONTENT HERE? Let me first say this site has never had anything to sell and has never shown ads of any kind. I have nothing monetarily to gain by duplicating content here. Because I had made my own local copy of this content throughout the years, for ease of using tools like grep, I decided to put it online after I discovered some of the original content previously and publicly available, had disappeared approximately early to mid 2019. At the same time, I present the content in an easily accessible theme-agnostic way.

The information provided by Raymond's blog is, for all practical purposes, more authoritative on Windows Development than Microsoft's own MSDN documentation and should be considered supplemental reading to that documentation. The wealth of missing details provided by this blog that Microsoft could not or did not document about Windows over the years is vital enough, many would agree an online "backup" of these details is a necessary endeavor. Specifics include:

<-- Back to Old New Thing Archive Index