Date: | September 19, 2007 / year-entry #350 |
Tags: | code |
Orig Link: | https://blogs.msdn.microsoft.com/oldnewthing/20070919-00/?p=25063 |
Comments: | 26 |
Summary: | Many functions accept a source string that consists of both a pointer and a length. And if you pass a length that is greater than the length of the string, the result depends on the function itself. Some of those functions, when given a string and a length, will stop either when the length is... |
Many functions accept a source string that consists of both a pointer and a length. And if you pass a length that is greater than the length of the string, the result depends on the function itself.
Some of those functions, when given a string and a length, will stop
either when the length is exhausted or a null terminator is reached,
whichever comes first.
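As a small portable illustration of the "whichever comes first" behavior, here is a sketch using the standard C function `strncmp`, which stops comparing at a null terminator even when the length you pass is larger. The helper name `equal_up_to_null` is made up for this example and does not come from any library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: reports whether two buffers compare equal under
   strncmp, which stops at n characters or at a null terminator,
   whichever comes first. Returns 1 if equal, 0 otherwise. */
int equal_up_to_null(const char *a, const char *b, size_t n)
{
    assert(a != NULL && b != NULL);
    return strncmp(a, b, n) == 0;
}
```

Given two 8-byte buffers that both start with "abc" and a null but hold different bytes afterward, calling this helper with a length of 8 reports them equal, because the comparison never looks past the null.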
For example, if you pass a
On the other hand, many other functions (particularly those in the
NLS family)
will cheerfully operate past a null character if you
ask them to.
The idea here is that since you passed an explicit size,
you're consciously operating on a buffer which might
contain embedded null characters.
Because, after all, if you passed an explicit source size,
you really meant it, right?
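To make the "you really meant it" behavior concrete, here is a portable stand-in, not a Windows function. The name `upper_buff_exact` is hypothetical, and it handles only what `toupper` handles in the current C locale. It is meant to mimic the general shape of a function that honors the explicit length and does not stop at a null.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for a length-honoring function: converts
   exactly n bytes to uppercase and does NOT stop at a null character.
   The caller must guarantee that buf really has n writable bytes. */
void upper_buff_exact(char *buf, size_t n)
{
    assert(buf != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        buf[i] = (char)toupper((unsigned char)buf[i]);
    }
}
```

If you hand it the five bytes 'a', 'b', null, 'c', 'd' with a length of 5, the bytes after the null are converted too. Hand it a length larger than the buffer and it will happily scribble on whatever comes next.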
(Maybe you're operating on a
I've seen programs crash because they thought that functions
like
// buggy code - see discussion
void someFunction(char *pszFile)
{
 CharUpperBuff(pszFile, MAX_PATH);
 ... do something with pszFile ...
}
void Caller()
{
 char buffer[80];
 sprintf(buffer, "file%d", get_fileNumber());
 someFunction(buffer);
}
The intent here was for
Critique, then, this replacement function:
// buggy code - do not use
int invariant_strnicmp(char *s1, char *s2, size_t n)
{
 // [Update: 9:30am - typo fixed]
 return CompareStringA(LOCALE_INVARIANT, NORM_IGNORECASE,
                       s1, n, s2, n) - CSTR_EQUAL;
}
(Michael Kaplan has one answer different from the one I was looking for.)
Comments (26)
The problem is that the length of one or both of the strings may not actually be ‘n’.
If the strings are equal but shorter than n, the result will depend on the junk past the end of the strings.
Interestingly, the documentation for CompareString does not say explicitly what will happen for nulls in the strings if you pass a non-negative length value. I assume it will treat the string as a buffer that can contain nulls. Then, the code should be:
return CompareStringA(LOCALE_INVARIANT, NORM_IGNORECASE,
s1,
min(n, strlen(s1)),
s2,
min(n, strlen(s2))) - CSTR_EQUAL;
with
int min(int a, int b)
{
return (a < b) ? a : b;
}
Don’t go blindly looking for the NUL character, that will cause crashes too. (i.e. the strlen)
Often, when you use methods that have a length parameter, you are dealing with strings that don’t have NUL termination (instead of using the length for substring extraction). For example, if you have ever used EXPAT, many of the strings coming from the notification routines are pointing directly to the parse buffer and don’t contain the terminating NUL, thus the supplied length parameter. I dealt with many a crashing program that assumed the strings were NUL terminated.
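One way to reconcile the two concerns above (don't read past the caller's length, and don't assume a terminator exists) is a bounded scan. The sketch below uses the standard `memchr`; the name `bounded_len` is made up for illustration, and it does roughly what `strnlen` does on systems that provide it. It assumes the caller guarantees that the first n bytes are readable unless a null appears earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of s, looking at no more than n bytes.
   Returns the offset of the first null if one is found within the
   first n bytes, otherwise n. Never reads beyond s[n - 1]. */
size_t bounded_len(const char *s, size_t n)
{
    assert(s != NULL || n == 0);
    const char *nul = n ? memchr(s, '\0', n) : NULL;
    return nul ? (size_t)(nul - s) : n;
}
```

The result for each string could then serve as the length passed to a comparison function in place of the raw n. Note that CompareStringA takes its lengths as int, so a real caller would also need a range check before converting from size_t.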
Just curious, what does an uppercase NUL look like? :-)
In ASCII, an uppercase NUL is represented as (NUL & ~0x20).
@Ray
No that is the upper case.
Lower case looks like this: nul
Upper case looks like this: NUL
Don’t blame yourself for being confused. Failure to distinguish NUL from nul is the third most common programming fault after the fencepost error.
-Wang-Lo;)
Is it cheating if I answer? I did resist for just about the whole morning. :-)
One could always shove an Æ or an æ in one of the strings and watch the fireworks….
[quote]
// buggy code - see discussion
void someFunction(char *pszFile)
{
CharUpperBuff(pszFile, MAX_PATH);
... do something with pszFile ...
}
[/quote]
You’re kidding. No one who writes C code would ever write that.
"You’re kidding. No one who writes C code would ever write that."
Unfortunately, the world is oversupplied with programmers who regularly write code that no intelligent person in their right mind would write, and C has more than its share.
Well, the content of the rest of the article seems to be a hint. And a (very brief) look at an MSDN article on CompareString* turned up the following interesting phrase.
"Normally, for case-insensitive comparison, this function maps the lowercase "i" to the uppercase "I", "
So my first guess would be that CompareStringA, when handed the NORM_IGNORECASE flag, calls CharUpperBuff or a similar function.**
So the broken example function, if 1) asked to compare strings of non-equal length, or 2) given a length in excess of the actual string size, would probably happily perform the data corruption described in this article.
*I keep looking to find the one on CompareStringA; so this might all be wrong
**And there is probably someplace I should know about that explicitly states exactly how this flag is handled.
"You’re kidding. No one who writes C code would ever write that."
If it can compile then it’s been done.
NUL is an old shorthand for the ASCII character ‘