When you create an object with constraints, you have to make sure everybody who uses the object understands those constraints

Date: April 14, 2010 / year-entry #112
Tags: code
Orig Link: https://blogs.msdn.microsoft.com/oldnewthing/20100414-00/?p=14333
Comments: 19

Here's a question that came from a customer. This particular example involves managed code, but don't let that distract you from the point of the exercise.

I am trying to create a FileStream object using the constructor that takes an IntPtr as input. In my .cs file, I create the native file handle using CreateFile, as shown below.

[DllImport("kernel32.dll", CharSet = CharSet.Auto, SetLastError = true)]
internal static extern IntPtr CreateFile(string lpFileName,
    int dwDesiredAccess, FileShare dwShareMode,
    IntPtr securityAttrs, FileMode dwCreationDisposition,
    UInt32 dwFlagsAndAttributes, IntPtr hTemplateFile);

IntPtr ptr1 = Win32Native.CreateFile(FileName, 0x40000000, 
         System.IO.FileShare.Read | System.IO.FileShare.Write, 
         Win32Native.NULL, 
         System.IO.FileMode.Create, 
         0xa0000000, // FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH
         Win32Native.NULL); 

Then I create the FileStream object like so:

FileStream fs = new FileStream(ptr1, FileAccess.Write, true, 1, false);

The fs gets created fine. But when I try to do:

fs.Write(msg, 0, msg.Length);
fs.Flush();

it fails with the error "IO operation will not work. Most likely the file will become too long or the handle was not opened to support synchronous IO operations."

int hr = System.Runtime.InteropServices.Marshal.GetHRForException(e);

Gives hr as COR_E_IO (0x80131620).

The stack trace is as below.

System.IO.IOException: IO operation will not work. Most likely
    the file will become too long or the handle was not opened
    to support synchronous IO operations.
at System.IO.FileStream.WriteCore(Byte[] buffer, Int32 offset, Int32 count)
at System.IO.FileStream.FlushWrite()
at System.IO.FileStream.Flush()
at PInvoke.Program.Main(String[] args)

Can somebody point out what might be going wrong?

(For those who would prefer to cover their ears and hum when the topic of managed code arises, change FileStream to fdopen.)

The comment on the line

         0xa0000000, // FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH

was provided by the customer, and that's the key to the problem. It was right there in the comment, but the customer didn't understand the consequences.

As the documentation for CreateFile notes, the FILE_FLAG_NO_BUFFERING flag requires that all I/O operations on the file handle be in multiples of the sector size, and that the I/O buffers also be aligned on addresses which are multiples of the sector size.
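
The two rules can be written down as a small check. This is only a sketch: the class and method names are hypothetical (nothing like this exists in .NET or Win32), and the sector size is assumed to be supplied by the caller rather than queried from the volume.

```csharp
using System;

// Hypothetical helper for illustration; not part of .NET or Win32.
static class UnbufferedIoRules
{
    // Returns true only if a transfer obeys both rules described above:
    // the byte count and the buffer address are multiples of the sector size.
    // sectorSize is an assumed value passed in by the caller.
    public static bool IsTransferAllowed(long bufferAddress, int byteCount, int sectorSize)
    {
        if (sectorSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(sectorSize));

        bool sizeIsSectorMultiple = byteCount % sectorSize == 0;
        bool bufferIsSectorAligned = bufferAddress % sectorSize == 0;
        return sizeIsSectorMultiple && bufferIsSectorAligned;
    }
}
```

With an assumed 512-byte sector, a 28-byte write fails the check no matter where the buffer lives, and a 512-byte write from address 0x10004 fails it too, because the address is not a multiple of 512.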

Since you created the file handle with very specific rules for usage, you have to make sure that everybody who uses it actually follows those rules. The FileStream object, however, doesn't know about these rules. It just figures you gave it a handle that it can issue normal synchronous ReadFile and WriteFile calls on. It doesn't know that you gave it a handle that requires special treatment. And then the attempt to write to the handle with a plain WriteFile fails both because the number of bytes is not a multiple of the sector size and because the I/O buffer is not sector-aligned, and you get the I/O exception.

The solution to this problem depends on what you are trying to accomplish. Why are you passing the FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH flags? Are you doing this just because you overheard in the hallway that it's faster? Well, yes it may be faster under the right circumstances, but in exchange for the increased performance, you also have to follow a much stricter set of rules. And in the absence of documentation to the contrary, you can't assume that a chunk of code actually adheres to your very special rules.
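
If it turns out that write-through is the only behavior actually needed, one possible way out is to let FileStream open the file itself, so that nobody is handed a handle with rules they don't know about. A minimal sketch under that assumption: the class and method names are hypothetical, and the 4096-byte buffer size is an arbitrary choice. FileOptions.WriteThrough is the managed counterpart of FILE_FLAG_WRITE_THROUGH; FileOptions has no member corresponding to FILE_FLAG_NO_BUFFERING.

```csharp
using System.IO;

// Hypothetical example class; assumes write-through alone is what is wanted.
static class WriteThroughExample
{
    public static void WriteMessage(string path, byte[] msg)
    {
        // Same sharing mode as the customer's code (Read | Write),
        // but no P/Invoke and no unbuffered-I/O rules to obey.
        using (var fs = new FileStream(path, FileMode.Create, FileAccess.Write,
                                       FileShare.ReadWrite, 4096, FileOptions.WriteThrough))
        {
            // Any length is acceptable here, sector multiple or not.
            fs.Write(msg, 0, msg.Length);
            fs.Flush();
        }
    }
}
```

If unbuffered I/O really is required, this sketch does not apply, and every piece of code that touches the handle has to follow the alignment rules.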

Like "What if two people did this?", this is an illustration of another principle that many people forget to consider when working with objects they didn't write: "When you write your own code, do you do this?" It's sort of like the Golden Rule of programming.

Suppose you have a function which accepts a file handle and whose job is to write some data to that file handle. Do you write your function so that it performs all its I/O in multiples of the sector size from buffers which are aligned in memory in multiples of the sector size, on the off chance that somebody gave you a handle that was opened with the FILE_FLAG_NO_BUFFERING flag? Well, no, you don't. You just call WriteFile to write to it, and if you want to write 28 bytes, you write 28 bytes. Even if you perform internal buffering and your buffer size happens to be a multiple of the sector size by accident, you still don't align your I/O buffer to the sector size; and when it's time to flush the final partially-written buffer, you have a not-sector-multiple write at the very end anyway.
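
The point about the final flush can be seen in a toy model of a buffering layer. This is a hypothetical sketch, not the buffering logic of FileStream or any other real library: it only computes the sizes of the writes that a simple fill-then-flush buffer would pass down to the handle.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of a simple buffering writer; names are illustrative only.
static class BufferingModel
{
    // Returns the sizes of the underlying writes issued when totalBytes
    // are pushed through a buffer of bufferSize bytes.
    public static List<int> UnderlyingWriteSizes(int totalBytes, int bufferSize)
    {
        if (bufferSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(bufferSize));

        var sizes = new List<int>();
        int remaining = totalBytes;

        // Each time the buffer fills up, it is written out in full.
        while (remaining >= bufferSize)
        {
            sizes.Add(bufferSize);
            remaining -= bufferSize;
        }

        // The final flush writes whatever is left, which is usually
        // not a multiple of the sector size.
        if (remaining > 0)
            sizes.Add(remaining);

        return sizes;
    }
}
```

Pushing 1300 bytes through a 512-byte buffer yields writes of 512, 512, and 276 bytes. The first two happen to be sector multiples on a 512-byte-sector volume; the last one is not.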

If you don't handle this case in your code, why would you expect others to handle it in their code?

We've seen this principle before, such as when we looked at whether the Process.Refresh method refreshes an arbitrary application's windows.


Comments (19)
  1. Alexander Grigoriev says:

    @Ken Hagan,

    There is no special Win32 error code for such a case. WriteFile won’t tell you that it failed because of misaligned IO.

  2. Alexander Grigoriev says:

    Do people really need to code explicit number for  requested access and flags for CreateFile in C#? That doesn’t look right.

    And the caller might not understand what WRITE_THROUGH is for. It will slow the writes down quite a bit.

  3. Random832 says:

    "Do people really need to code explicit number for  requested access and flags for CreateFile in C#? That doesn’t look right."

    There’s no way to import the API headers directly into C#. You can make your own enum or consts if you feel like it, but there’s nothing requiring you to do so.

    As for what’s already there; for flags, there’s an enum FileOptions with no matching flag for FILE_FLAG_NO_BUFFERING [which is another hint that .NET isn’t designed to be able to use it] – and for access there’s FileAccess with no matching flag for… er, whatever 0x40000000 is supposed to be.

  4. Joshua says:

    Well since managed code can’t construct a byte array that refers to a given pointer and length and buffers for FILE_FLAG_NO_BUFFERING are supposed to be allocated by VirtualAlloc I really don’t think this is such a good idea either.

    I’m sure someone had a good use for FILE_FLAG_NO_BUFFERING but the only time I ever used it is when I happened to meet the rules by accident.

  5. TJ says:

    Maybe it’s just me (I’m a C# .NET guy) but I’m pretty confused about why someone would P/Invoke and use Win32 functions to open/create a file when they’re using .NET, which has managed functions to accomplish the same thing.

    [“I read somewhere that write-through and no-buffering give better perf, and who wouldn’t want better perf?” -Raymond]
  6. TJ says:

    Just FYI, the comment captcha seems to be broken in Firefox, it always tells me "The code you entered was invalid." Works fine in IE.

  7. Ken Hagan says:

    In fairness, whoever wrote the error text didn’t help matters.

     "Most likely the file will become too long or the handle was not opened to support synchronous IO operations."

    I’d say "most likely" no-one has ever collected statistics on what actually causes this exception in practice on customer systems.

    I’d also hazard a guess that at the point where the error was detected, there was enough information to say "you opened the handle for unbuffered access but then attempted a mis-aligned IO operation", but by the time the error percolated back to user space it had been replaced by something entirely generic and uninformative.

    Perhaps the text *should* have read, "If you are the developer of this software, try debugging on a checked build, where there is some chance of a more informative message appearing in the debug window. If you are the end-user, you’ll have to ask the developer why they’ve failed to handle this error themselves and have instead used FormatMessage to throw an implementation detail in your face.". That may sound verbose, but in fact it can be re-used as the text for nearly all error codes so you’d get a (slightly) smaller Windows build and developers would be less inclined to delegate their responsibilities for error diagnosis and reporting to a dumb API like FormatMessage.

    The only flaw in this plan is that MS don’t really support the checked build. It’s typically months, if not years, behind the retail build in terms of bug fixes and a simply amazing number of device drivers blue-screen the system. (Well, *I* think it’s amazing. I’d have thought that "not BSOD-ing the checked build at boot time" would be one of the requirements for WHQL signing, but experimentally this is not the case.)

  8. sukru says:

    I believe it’s better to trust the OS on virtual memory and I/O scheduling instead of trying to do it yourself. There are so many ways you can shoot yourself in the foot, and so many cases that you’re not even aware of (for example did you take ordering requests into consideration?).

    Of course there are always exceptions. Maybe you’re building a database system. Then the question is whether C++ would be a better choice for such a case.

    Anyways Raymond thanks for the post. It’s an interesting reminder to the mind of premature optimization.

  9. Nik says:

    It’s sad that there are people who feel the need to clutter up their C# code with ugly Kernel32 calls just because they heard it might be slightly faster.  And then it doesn’t even work.  I’d rather have my file writes be a little bit slower in exchange for just creating the file with

    new FileStream("foo.txt", …)

  10. Cheong says:

    These kind of things can happen if you’re porting C++ code to C#.

    Here you see the CreateFile() uses these special flags, and you have no hint why these flags are there, so you want to keep it.

    Given you’ve seen examples that you can use CreateFile with FileStream involving COM ports or other direct device names, you can run into this kind of problem easily.

  11. Yuhong Bao says:

    I wonder what happens if you try to create a FileStream out of say a Winsock socket.

  12. Gabe says:

    TJ: The typical reason to use P/Invoke to open a file is that you want to use a filename that the CLR doesn’t allow. Examples of this are devices, raw disks, and long (>MAX_PATH) filenames.

    My very first .Net app was a POS system that needed to print receipts via a serial port, and the first release of .Net didn’t provide a SerialPort class. Since the file APIs didn’t let me open the device, I had to use P/Invoke.

  13. Dean Harding says:

    I agree the error message is probably not the best. I would say that exception messages should never try to guess the *cause* of the error, and just report the error itself. Something like "IO operation not supported" would have been enough.

    It might not have helped the programmer in this case, but at least there’s no chance of it sending you down the wrong path…

  14. Mike Dimmick says:

    The only time I’ve seen this combination of flags used is SQL Server’s transaction logs. SQL Server needs to be sure that the data was fully flushed to the disk before telling the client that the transaction was committed. It can then write back the data at its leisure.

    However, consumer disks will commonly ignore the no-buffering flag and actually buffer on the drive’s own cache, which can lead to writes not actually being recorded on the disk when a failure occurred, even though the drive said it had written the data. That can cause problems with database recovery. (It can also cause problems with NTFS, other journalled file systems, and anything else that uses the write-ahead logging protocol.)

    Implementing write-ahead logging in C# could be done, but as Raymond says, you have to meet the requirements. You might well be better off just P/Invoking WriteFile rather than trying to wrap it in a FileStream. Use a SafeFileHandle if you’re using .NET 2.0 or later.

  15. Teo says:

    Why "hearing" when you can actually *measure* it? If you align your buffers on page boundaries and sector sizes, then you help the memory manager and all participants in the i/o stack – the file system, the volume manager, the drive controllers, heck the firmware that remaps sectors will be grateful to you too! That’s a common sense, backed by numerous speed tests. Obviously, no caching and/or immediate write should be used ONLY when required, which usually is when you are writing some kind of backup application or write down live video stream. For everyday programs, they actually will do more harm, than benefit. Actually, if you omit them, you allow the cache manager to do lazy writes in large chunks on background (most probably in another thread), so your application is faster when you do NOT use them :)

    On why do you want to p/invoke CreateFile? Well, aside from using special flags, you may want to use special *names* – for example \\?\volume{xxxxx}\my-path\my-file.

    As to why .Net does not have these flags as enums – I think it’s because the .Net I/O classes are platform-agnostic, so they cannot expose Windows-only functionality as FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH. Remember, the core of .Net is an international standard and is OS- and hardware- portable!

  16. Anon says:

    @Teo:

    "As to why .Net does not have these flags as enums – I think it’s because the .Net I/O classes are platform-agnostic, so they cannot expose Windows-only functionality as FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH. Remember, the core of .Net is an international standard and is OS- and hardware- portable!"

    This is hilarious.

    The reason .NET doesn’t have these flags is quite simply because those flags are not and were not ever designed for use in .NET.  .NET is in this case directly invoking an API that exists outside of the .NET framework.

    The developers of that API provide headers that include symbolic constants for values of meaning to that API.  .NET doesn’t/can’t use those headers.

    If you use such API’s on a regular basis you can of course translate those headers for more convenient on-going use with .NET, just as you can translate those headers for use in Pascal or other languages able to use those API’s directly but the use of which is not directly supported by the provider of the API itself.

    There is no need to invent some pretentious and self-righteous/aggrandising bullsh*t justification.  "International hardware portable standard" indeed.  ROTFLMAO

    There would be no problem with a framework with such ambitions providing platform specific support for specific platforms, should it so desire.  That would not detract or impinge on the delivery of such ambitions at the core.

    (i.e. a System.Windows namespace containing Windows specific file I/O classes, only available on the Windows platform variant of the .NET framework)

    The omission in this case is simply, as in so many glaring gaps in the .NET framework, one of oversight or simple incompleteness.

    Geeez, I haven’t laughed so much in a long time, so I guess I should at least thank you for that.

  17. The_Assimilator says:

    @Anon

    "The omission in this case is simply, as in so many glaring gaps in the .NET framework, one of oversight or simple incompleteness."

    Or the fact that for 99.99999% of people using .NET, the standard IO libraries work well enough that they never need to bother with P/Invokes.

    And 99.99999% of the time, if you’re using P/Invokes for performance reasons, you’re either optimising the wrong code or you should be using a different language.

    @Teo

    You might want to look at the Microsoft.Win32 namespace to see how wrong you are.

  18. 640k says:

    I would like to discuss the double unknown values.

    An "unknown" result should be every value not defined, not a specific defined one.

  19. Teo says:

    @The_Assimilator : Well, why do you think that all the stuff in Microsoft.Win32 namespace is there in the first place, instead of various System.* where it actually belongs? Indeed, it’s there, because there is platform-independent, and Windows-only parts of .Net and when you *explicitly* says in your code "using Microsoft" you know what you are doing and your code won’t be portable.

    @Anon: My English is not sufficient to process your message but I am happy that I make you smile (even for the wrong reason).



DISCLAIMER: I DO NOT OWN THIS CONTENT. If you are the owner and would like it removed, please contact me. The content herein is an archived reproduction of entries from Raymond Chen's "Old New Thing" Blog. It may have slight formatting modifications for consistency and to improve readability.
