Why does the documentation for ReadFile say that the lpNumberOfBytesRead parameter is optional when it is sometimes mandatory?

Date:October 26, 2016 / year-entry #225
Tags:code
Orig Link:https://blogs.msdn.microsoft.com/oldnewthing/20161026-00/?p=94585
Comments:    13
Summary:In the fine print, due to incomplete expressiveness.

The documentation for the Read­File function says

BOOL WINAPI ReadFile(
  _In_ HANDLE hFile,
  _Out_ LPVOID lpBuffer,
  _In_ DWORD nNumberOfBytesToRead,
  _Out_opt_ LPDWORD lpNumberOfBytesRead,
  _Inout_opt_ LPOVERLAPPED lpOverlapped
);

The lpNumber­Of­Bytes­Read parameter is declared as _Out_opt_. The _Out_ part means that the function writes to the pointed-to value. The _opt_ part means that the parameter is optional (may be null). And yet, if you call Read­File and pass nullptr for the fourth parameter, it crashes. What's going on here?

What's going on is in the fine print:

lpNumberOfBytesRead [out, optional]

A pointer to the variable that receives the number of bytes read when using a synchronous hFile parameter. Read­File sets this value to zero before doing any work or error checking. Use NULL for this parameter if this is an asynchronous operation to avoid potentially erroneous results.

This parameter can be NULL only when the lpOverlapped parameter is not NULL.

For more information, see the Remarks section.

Note that second paragraph.

The deal is that the fourth parameter is sometimes optional and sometimes mandatory. The simplified annotation language used by this function prototype cannot express this sort of conditional mandatory state, so it takes the most generous position of declaring the parameter as optional, so as to avoid raising false positives in static analysis tools.

A more precise annotation for that parameter would be something like _When_(lpOverlapped == NULL, _Out_opt_) _When_(lpOverlapped != NULL, _Out_).


Comments (13)
  1. Damien says:

    Except your condition and outcomes are the wrong way around (it should only be _opt_ when lpOverlapped != NULL)

  2. Lexx says:

    Shouldnt the function check for passed nullptr pointer and exit with invalid argument type error instead? Crashing, when there’s a simple way to check seems to be a lazy design at best.

  3. Mike Caron says:

    Lexx: Because the same people who violate this rule also aren’t going to handle this error condition correctly anyway. May as well crash rather than spend cycles on a situation that isn’t going to get better.

    Raymond: I swear you’ve published this exact article before. I feel like half the articles on this blog are “If lpOverlapped is null, you have to pass the out pointer”. I guess you can stop when people start respecting it!

  4. Killer{R} says:

    There should be separate API: existing ReadFileEx for async and ReadFile for sync operations. And lasy should have _In_opt_ LPLARGE_INTEGER lpOffset; instead of OVERLAPPED. Then there would be nor redundancy, nor misinterpretations.
    In overall it looks like it was designed so by some smart guy, but another one (not so smart) decided to ‘tweak’ initial design.

    1. Darran Rowe says:

      The problem with ReadFileEx for async operations is that for it to complete, the thread must have a completion routine and the thread must be in an alertable wait.
      Using ReadFile for an async handle allows it to not require a completion routine and wait on a handle instead. ReadFileEx explicitly mentions that it will not use the event parameter. So ReadFileEx isn’t a replacement for the functionality of ReadFile for pure async files, so there would have to be a third function to do what you want.

      1. Killer{R} says:

        No problem if CompletionRoutine for ReadFileEx would be optional, in which case it would behave just how async ReadFile now behaves

        1. Darran Rowe says:

          But that would then create a similar kind of confusion with ReadFileEx that currently exists with ReadFile.
          If you pass in a completion routine then you have to do an alertable wait for the read to complete. If you don’t pass in a completion routine, then the handle in the overlapped structure is used to signal the completion.
          There are probably ways you could retort to that, but there are some things to remember. First, the naming of these functions usually mean that ReadFile was available some time before ReadFileEx. So this would mean that async IO was available in Windows before IO completion ports. In fact, considering the fact that the same function was used, it would mean that async IO has been in Windows as long as the ReadFile function. From a bit of digging on the Win32 API, IO completion ports came in with Windows NT, but async IO was available with Win32s. So in general, history probably didn’t help here, especially since at that point Microsoft themselves weren’t against overloading a single function at that point.
          But to modify those functions now would break a lot of software, so right now it is impossible to do anything.

          1. Killer{R} says:

            Nope, confusion is not similar kind. OVERLAPPED will be used for overlapped operations. In case of synchronous ReadFile – OVERLAPPED is used by not-overlapped operation that’s complete abuse of that concept.
            Also that would be more consistent with other OVERLAPPED + CompletionRoutine usecases: ReadDirectoryChangesW/WSARecv/WSASend and may be something else.
            Of course ReadFileEx not a best name for such design. ReadFileAsync would be better.
            And of course now its only theoretical speculations. No chance to do such major changes in that API now. Actually no chance to do ANY chances in that API now.

          2. laonianren says:

            IO completion ports were actually introduced in NT 3.5. They weren’t in 3.1.

            Somewhat absurdly, I still think of IO completion ports as being a newish feature because I remember them being introduced.

          3. Darran Rowe says:

            @Killer{R}:
            Yup overloading like that is an abuse of the concept, but as I said, Microsoft hasn’t been afraid of doing things like that. A bigger example of this is one of the most fundamental functions for Windows, CreateWindow.
            For CreateWindow, the HMENU parameter also serves as the child item ID. What’s worse is that when you are creating a child window, you have to cast the ID to a HMENU. While it does require abuse of these functions, it does end up simplifying things. Only CreateWindow is needed, no CreateTopLevelWindow and CreateChildWindow.
            So this all subjective, but in the end Microsoft chose complexity of the API over a larger amount of functions. If they had split out the IO functionality into something like ReadFile and ReadFileAsync, when IO completion ports were added, they would have had to add another function, like they added ReadFileEx anyway.
            @laonianren:
            Interesting, Windows NT 3.1 and 3.5 were actually slightly before my time. I didn’t actually start getting into computers and programming until 1995 or so. I only really knew Windows 95 and Windows NT 4 onwards.

  5. alegr1 says:

    >This parameter can be NULL only when the lpOverlapped parameter is not NULL.

    This parameter can only be NULL ONLY if the ReadFile is guaranteed to complete asynchronously (always return false with ERROR_IO_PENDING). If it ever returns data synchronously, lpNumberOfBytesRead is then mandatory.

    1. MarcK4096 says:

      It seems like it would be better to just update the documentation to make the parameter mandatory all of the time. There shouldn’t be fine print like this on API calls. “_opt” should mean optional all of the time.

  6. Medinoc says:

    The most “fun” part I’ve seen with annotations is DrawText(), which gives a warning when you pass it an LPCTSTR… because one of its flags (DT_MODIFYSTRING) will treat the pointer as non-const.

Comments are closed.


*DISCLAIMER: I DO NOT OWN THIS CONTENT. If you are the owner and would like it removed, please contact me. The content herein is an archived reproduction of entries from Raymond Chen's "Old New Thing" Blog (most recent link is here). It may have slight formatting modifications for consistency and to improve readability.

WHY DID I DUPLICATE THIS CONTENT HERE? Let me first say this site has never had anything to sell and has never shown ads of any kind. I have nothing monetarily to gain by duplicating content here. Because I had made my own local copy of this content throughout the years, for ease of using tools like grep, I decided to put it online after I discovered some of the original content previously and publicly available, had disappeared approximately early to mid 2019. At the same time, I present the content in an easily accessible theme-agnostic way.

The information provided by Raymond's blog is, for all practical purposes, more authoritative on Windows Development than Microsoft's own MSDN documentation and should be considered supplemental reading to that documentation. The wealth of missing details provided by this blog that Microsoft could not or did not document about Windows over the years is vital enough, many would agree an online "backup" of these details is a necessary endeavor. Specifics include:

<-- Back to Old New Thing Archive Index