The SetFilePointer function reports an error in two different ways, depending on whether you passed NULL as the lpDistanceToMoveHigh parameter. The documentation in MSDN is correct, but I've discovered that people prefer when I restate the same facts in a different way, so here comes the tabular version of the documentation.
| | If lpDistanceToMoveHigh == NULL | If lpDistanceToMoveHigh != NULL |
|---|---|---|
| If success | retVal != INVALID_SET_FILE_POINTER | retVal != INVALID_SET_FILE_POINTER || GetLastError() == ERROR_SUCCESS |
| If failed | retVal == INVALID_SET_FILE_POINTER | retVal == INVALID_SET_FILE_POINTER && GetLastError() != ERROR_SUCCESS |
I'd show some sample code, but the documentation in MSDN already contains sample code for both the lpDistanceToMoveHigh == NULL case and the lpDistanceToMoveHigh != NULL case.
A common mistake is calling GetLastError even if the return value is not INVALID_SET_FILE_POINTER. In other words, people ignore the whole retVal == INVALID_SET_FILE_POINTER part of the "did the function succeed or fail?" test. Just because GetLastError() returned an error code doesn't mean that the SetFilePointer function failed. The return value must also have been INVALID_SET_FILE_POINTER. I will admit that the documentation in MSDN could be clearer on this point, but the sample code hopefully resolves any lingering ambiguity.
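As a rough illustration of the test described above, here is a minimal sketch of the decision logic on its own, written as portable C so that it compiles without the Windows headers. The names seek_failed, SKETCH_INVALID_SET_FILE_POINTER, and SKETCH_ERROR_SUCCESS are hypothetical stand-ins invented for this sketch; real code would use the definitions from windows.h and pass in the actual results of SetFilePointer and GetLastError.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the Win32 constants, so the sketch builds anywhere.
   Real code gets INVALID_SET_FILE_POINTER and ERROR_SUCCESS from <windows.h>. */
#define SKETCH_INVALID_SET_FILE_POINTER UINT32_C(0xFFFFFFFF)
#define SKETCH_ERROR_SUCCESS            UINT32_C(0)

/* retVal:    the value SetFilePointer returned
   lastError: what GetLastError() reported right after the call
   haveHigh:  true if lpDistanceToMoveHigh was non-NULL
   Returns true if the call should be treated as failed, per the table. */
bool seek_failed(uint32_t retVal, uint32_t lastError, bool haveHigh)
{
    /* Any other return value means success; the last-error value is stale
       and must not be consulted. */
    if (retVal != SKETCH_INVALID_SET_FILE_POINTER) {
        return false;
    }
    /* Without a high part, the sentinel alone signals failure. */
    if (!haveHigh) {
        return true;
    }
    /* With a high part, the sentinel might be a legitimate low-order half;
       only a non-success error code confirms the failure. */
    return lastError != SKETCH_ERROR_SUCCESS;
}
```

Note that the last-error value is examined only after the return value has matched the sentinel, which is exactly the step that the common mistake skips.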
But why does SetFilePointer use such a wacky way of reporting errors when lpDistanceToMoveHigh is non-NULL? The MSDN documentation also explains this detail: If the file size is greater than 4GB, then INVALID_SET_FILE_POINTER is a valid value for the low-order 32 bits of the file position. For example, if you moved the pointer to position 0x00000001`FFFFFFFF, then *lpDistanceToMoveHigh will be set to the high-order 32 bits of the result (1), and the return value is the low-order 32 bits of the result (0xFFFFFFFF, which happens to be the numerical value of INVALID_SET_FILE_POINTER). In that case (and only in that case) does the system need to use SetLastError(ERROR_SUCCESS) to tell you, "No, that value is perfectly fine. It's just a coincidence that it happens to be equal to INVALID_SET_FILE_POINTER".
Why not call SetLastError(ERROR_SUCCESS) on all success paths, and not just the ones where the low-order 32 bits of the result happen to be 0xFFFFFFFF? That's just a general convention of Win32: If a function succeeds, it is not required to call SetLastError(ERROR_SUCCESS). The success return value tells you that the function succeeded. The exception to this convention is if the return value is ambiguous, as we have here when the low-order 32 bits of the result happen to be 0xFFFFFFFF.
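The ambiguous case can be reproduced with plain integer arithmetic. The following sketch splits a 64-bit file position into the two 32-bit halves that SetFilePointer reports and checks whether the low half collides with the error sentinel. The helper names and SKETCH_INVALID_SET_FILE_POINTER are hypothetical, introduced only for this illustration; nothing here calls into Win32.

```c
#include <stdint.h>

/* Hypothetical stand-in for the Win32 constant INVALID_SET_FILE_POINTER. */
#define SKETCH_INVALID_SET_FILE_POINTER UINT32_C(0xFFFFFFFF)

/* Low-order 32 bits: the part SetFilePointer hands back as its return value. */
uint32_t position_low(uint64_t position)
{
    return (uint32_t)(position & UINT64_C(0xFFFFFFFF));
}

/* High-order 32 bits: the part stored through lpDistanceToMoveHigh. */
uint32_t position_high(uint64_t position)
{
    return (uint32_t)(position >> 32);
}

/* Nonzero when a perfectly valid position has a low half that is
   indistinguishable from the error sentinel. */
int low_half_is_ambiguous(uint64_t position)
{
    return position_low(position) == SKETCH_INVALID_SET_FILE_POINTER;
}
```

For the position 0x00000001`FFFFFFFF used in the example above, position_high yields 1 and position_low yields 0xFFFFFFFF, so low_half_is_ambiguous reports a collision. Only positions whose low half is 0xFFFFFFFF need the extra last-error check.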
You might argue that this was a stupid convention, but what's done is done, and until time travel has been perfected, you just have to live with the past. (Mind you, UNIX uses the same convention with the errno variable. Only if the previous function call failed is the value of errno defined.)
Looking back on it, the designers of SetFilePointer were being a bit too clever. They tried to merge 32-bit and 64-bit file management into a single function. "It's generic!" The problem with this is that you have to check for errors in two different ways depending on whether you were using the 32-bit variation or the 64-bit variation. Fortunately, the kernel folks realized that their cleverness backfired and they came up with a new function, SetFilePointerEx. That function produces a 64-bit value directly, and the return value is a simple BOOL, which makes checking for success or failure a snap.
Exercise: What's the deal with the GetFileSize function?
GetFileSize does the same thing, for the same reason. It just returns the high and low values rather than taking in the high and low values.
On Unix — it’s true that most (perhaps all?) Unix system calls will keep errno set to whatever it was before if they succeed. But all the system calls I’ve ever used have an unambiguous way of signaling success or failure, too. ;-)
(It helped that when the large-file spec was added to POSIX (or whatever standard), it simply typedef’ed off_t to off64_t, and added new functions where needed. Most of the seek functions take and return an off_t (and the top bit of an off_t can never be set, so if the seek functions return a value less than zero, they failed). So if your program defines _FILE_OFFSET_BITS=64, then you get a 64-bit off_t (and the docs say you have to make sure all your libraries support it too), and the other file functions are redefined to be ones that handle the correct offset size. If you don’t set that define, then you get the old 32-bit off_t.)
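For comparison, a minimal sketch of the POSIX pattern described in this comment might look like the following. The wrapper name seek_to is hypothetical; lseek, off_t, and the _FILE_OFFSET_BITS macro are the standard pieces, and the sketch assumes a platform that honors _FILE_OFFSET_BITS=64.

```c
/* Request a 64-bit off_t; this must appear before any system header. */
#define _FILE_OFFSET_BITS 64

#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper: seek to an absolute offset and report the result.
   Returns true on success and stores the new position in *new_pos. */
bool seek_to(int fd, off_t offset, off_t *new_pos)
{
    off_t result = lseek(fd, offset, SEEK_SET);
    if (result == (off_t)-1) {
        /* Failure is unambiguous: a valid position is never negative.
           errno holds the reason and is meaningful only on this path. */
        return false;
    }
    *new_pos = result;
    return true;
}
```

Because the return type is signed and valid positions are non-negative, the failure value can never collide with a real file position, so no second channel is needed to disambiguate it.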
That logic is a bug waiting to happen.
If you did a SetLastError(ERROR_SUCCESS) when an error did not occur, would that potentially mask an important error that you wanted to find out about?
I think your table is wrong.
If lpDistanceToMoveHigh == NULL and retVal == INVALID_SET_FILE_POINTER then the function hasn’t necessarily failed. You need to call GetLastError.
For example, suppose the file offset is 0xfffffffe and you move forward 1 byte. The return value is 0xffffffff but the function didn’t fail.
In this context, how does it matter how any Unix does it? Is that way "the right way", and does it give us the ability to time-travel to "fix" Win32?
Thinking about it a little more, shouldn’t the test be:
BOOL succeeded = (retVal != INVALID_SET_FILE_POINTER) || (GetLastError() == NO_ERROR);
It doesn’t depend on lpDistanceToMoveHigh.
RE: BryanK
Alas, Unix isn’t always consistent: there are some functions which have a `void` return value, which requires checking `errno` to determine if an error occurred. http://www.jprl.com/Blog/archive/development/mono/2007/Jun-29.html
Granted, these aren’t system calls, but they’re still C library calls defined by POSIX. It’s unfortunate that some functions have this error reporting mechanism.
"So is that deemed a success case?"
Yeah, what’s wrong with that? I’m really not seeing anything confusing or contradictory about what you’ve pasted for GetWindowsDirectory. A function can define success or failure however it wants, and in this case success isn’t defined by whether it was able to put the path into your buffer. I would suppose that this approach simplifies the function for the caller and minimizes the number of parameters, in addition to reserving failure cases for real failures like permission problems or something.
"If the function succeeds [in finding the path to the Windows directory], the return value is the length of the string copied to the buffer, in TCHARs, not including the terminating null character [, but only if the buffer was large enough to hold the entire string]. If the length is greater than the size of the buffer, the return value is the size of the buffer required to hold the path. If the function fails [to find the path to the Windows directory], the return value is zero."
"Just because GetLastError() returned an error code doesn’t mean that the SetFilePointer function failed."
But if GetLastError() returned NO_ERROR, then it does mean the SetFilePointer function succeeded, right? Or does this function not SetLastError() if it fails when lpDistanceToMoveHigh == NULL?
The atoi function from the C standard library has to be one of the worst offenders for overloading valid results with error codes.
It takes a string and returns an integer. If it cannot convert the string to an integer it returns zero.
There is no good way to check whether the input string represents zero or couldn’t be converted without having to re-create some of the logic of atoi (skipping blanks, handling the optional sign indicator).
SetFilePointer and GetFileSize are bad and unfortunate designs but at least they let you work out errors from successes in an easy and reliable way.
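As an aside, the standard library's strtol is one commonly used way around the atoi ambiguity, since it reports how much of the input it consumed. The sketch below is only an illustration; the wrapper name parse_int is hypothetical, and like atoi it tolerates trailing characters after the number.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical wrapper around strtol.
   Returns true and stores the result in *value if text starts with a
   decimal integer that fits in an int; returns false otherwise. */
bool parse_int(const char *text, int *value)
{
    char *end = NULL;
    errno = 0;
    long parsed = strtol(text, &end, 10);

    /* No characters consumed: the string did not contain a number.
       This is the case atoi cannot tell apart from a genuine "0". */
    if (end == text) {
        return false;
    }
    /* Out of range for long (ERANGE) or for int. */
    if (errno == ERANGE || parsed < INT_MIN || parsed > INT_MAX) {
        return false;
    }
    *value = (int)parsed;
    return true;
}
```

With this wrapper, "0" parses successfully to zero while "abc" is rejected, whereas atoi returns 0 for both.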
Jonathan Pryor: Ouch, that is a PITA. I guess I just haven’t run into any of those functions yet, then. :-) (Certainly I’ve never used setgrent().)
Yikes.
@JS:
The problem is that if GetLastError() returns some failure code, you can’t be sure that SetFilePointer() is what set the failure code unless the returned value is INVALID_SET_FILE_POINTER.
However, taking your point, the API sets the error code if it fails regardless of whether or not lpDistanceToMoveHigh is NULL. So you can test for failure using:
BOOL failed = (result == INVALID_SET_FILE_POINTER) && (GetLastError() != ERROR_SUCCESS);
*Note that this is a corollary to Psa’s comment that success can be determined in either case using:
BOOL succeeded = (retVal != INVALID_SET_FILE_POINTER) || (GetLastError() == NO_ERROR);
I’d suggest using one of these tests for any use of SetFilePointer() – using 32 or 64 bit offsets – so you don’t have to worry about coding a nightmare logic puzzle.
Or just move on over to SetFilePointerEx().
“But then again, if you’re using only 32-bit offsets and you seek to position 0xFFFFFFFF you’re on the verge of losing anyway.”
Only if you care about the value of the new file position, which you often don’t if you’re processing a file sequentially.
For example, suppose you’re processing a file that contains a sequence of sections, each section containing a header and variable length data. Like some compression formats or media files. In each section the size of the variable length data is stored using 32 bits in the header. You write the obvious code to skip through the file looking for particular records, and you use the error test recommended in this post. The program fails when a section happens to start at 0xFFFFFFFF. Get the error test right and it will reliably process files of any size.
"SetFilePointer will not let you move the file pointer past 0xFFFFFFFF when lpDistanceToMoveHigh is NULL"
This isn’t an obvious restriction. It really ought to be documented.
“Well, actually you still can’t process files >= 4GB since SetFilePointer will not let you move the file pointer past 0xFFFFFFFF when lpDistanceToMoveHigh is NULL.”
It will if you pass FILE_CURRENT or FILE_END in the dwMoveMethod parameter.
e.g. if you do:
SetFilePointer(hFile, 0, NULL, FILE_END);
and the file is currently 0xFFFFFFFF bytes in size, you’ll get back 0xFFFFFFFF even though the function hasn’t failed.
Tangentially, I was recently emailed by a guy for whom Windows Installer completely stopped working. Every time he tried to install a product he got an error 2932.
Looking that up in the documentation revealed that the corresponding message is "Could not create file [2] from script data. Error: [3]". I guessed that the last parameter is a Windows error code and asked him to get a verbose log so we could see what it was.
It turned out to be error 131, ERROR_NEGATIVE_SEEK. The only APIs documented to return this value are SetFilePointer and family: it means that the file pointer that would result from this call would be before the beginning of the file, which isn’t allowed. (Setting the file pointer beyond the end of the file isn’t a problem: if you do this, the next write you make is written to that location and the ‘hole’ between the start of the file and the file pointer is filled with zeros.)
I couldn’t believe that Microsoft would make such a coding error that a) always failed on his machine, regardless of the package being installed, and yet b) worked fine on everyone else’s system. I therefore guessed that it was probably a rogue file system filter driver (e.g. an on-access anti-virus scanner). I don’t know if my guess was right, because he hasn’t responded.
Return value scenarios like this are bugs waiting to happen, no matter how much you explain them. That’s *because* of how much you *have* to explain them.
It reminds me of GetWindowsDirectory: "If the function succeeds, the return value is the length of the string copied to the buffer, in TCHARs, not including the terminating null character." Then in the VERY NEXT SENTENCE it backtracks with "If the length is greater than the size of the buffer, the return value is the size of the buffer required to hold the path." So is that deemed a success case? It must be, because the *next* sentence says "If the function fails, the return value is zero."
>
It’s quite obvious: SetFilePointer returns the new file pointer value. If lpDistanceToMoveHigh is NULL, then it can’t return the new file pointer if it’s > 0xFFFFFFFF.
>
That’s answered by Mike’s comment: negative file pointers are not allowed. So 0xFFFFFFFF is not a valid file pointer if lpDistanceToMoveHigh is NULL.
>
According to the documentation of SetFilePointer:
===================
If the SetFilePointer function succeeds and lpDistanceToMoveHigh is NULL, the return value is the low-order DWORD of the new file pointer.
===================
This implies that SetFilePointer() will move the file pointer past 0xFFFFFFFF even when lpDistanceToMoveHigh is NULL – you simply lose the high order half of the 64-bit file pointer in the result.
It doesn’t imply that to me… it implies that if SetFilePointer will move the file pointer past 0xffffffff when lpDistanceToMoveHigh is NULL, then all you get is the low-order DWORD. (Actually, it states that.) That doesn’t imply that SetFilePointer is going to be willing to do that.
Or, boiled down a lot, (a => b) !=> a
mikeb: I assume that this is an extra restriction (not fully documented) that reduces the incidence of wraparound bugs (since for both 0xFFFFFFFF + 1 and position 0 the return value of SetFilePointer is 0). Programs that don’t pass a valid lpDistanceToMoveHigh cannot cope with files over 4GB.
Dean Harding: the return value is unsigned. 0xFFFFFFFF is a valid return value for SetFilePointer (it represents an offset of 4GB minus one byte). What you cannot do is call SetFilePointer with FILE_CURRENT and a negative value larger than the current file pointer, with FILE_BEGIN and any negative value, or FILE_END and a negative value larger than the size of the file.
You cannot seek more than 2GB forwards (in one call) without specifying lpDistanceToMoveHigh – the lDistanceToMove value *is* treated as signed in this case.
>> It doesn’t imply that to me… it implies that _if_ setFilePointer _will_ move the file pointer past 0xffffffff when lpDistanceToMoveHigh is NULL, _then_ all you get is the low order word. (actually, it states that.) That doesn’t imply that SetFilePointer is going to be willing to do that. <<
That’s 100% correct, but you can say the same for file offsets below 0xffffffff. SetFilePointer’s documentation makes no promise to work even for offsets below 0xffffffff (it could be written to always return a failure regardless of the parameters and still be strictly following its documentation; it wouldn’t be very useful, but it wouldn’t be breaking its documented claims).
However, empirical testing shows that SetFilePointer() behaves as Raymond and you describe (not that it would have to, though). But, the fact that it will *not* move the file pointer past 0xffffffff if lpDistanceToMoveHigh is NULL just happens to be undocumented behavior.
So, I think that Psa’s statement that "This isn’t an obvious restriction. It really ought to be documented." has some merit. Not that I really expect or need to see a doc change, I just think that Psa’s point should not have been dismissed as something that was ‘quite obvious’.
Does SetFilePointerEx have the same problems with 2^64-1 byte files?
Can SetFilePointerEx only seek 2^63 bytes forward?
They’re everywhere now.