Here's a question that came from a customer.
This particular example involves managed code,
but don't let that distract you from the point of the exercise.
I am trying to create a FileStream
object using the constructor that takes an IntPtr
as input.
In my .cs file,
I create the native file handle using CreateFile,
as shown below.
[DllImport("kernel32.dll", CharSet = CharSet.Auto, SetLastError = true)]
internal static extern IntPtr CreateFile(string lpFileName,
int dwDesiredAccess, FileShare dwShareMode,
IntPtr securityAttrs, FileMode dwCreationDisposition,
UInt32 dwFlagsAndAttributes, IntPtr hTemplateFile);
IntPtr ptr1 = Win32Native.CreateFile(FileName, 0x40000000,
System.IO.FileShare.Read | System.IO.FileShare.Write,
Win32Native.NULL,
System.IO.FileMode.Create,
0xa0000000, // FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH
Win32Native.NULL);
Then I create the FileStream object like so:
FileStream fs = new FileStream(ptr1, FileAccess.Write, true, 1, false);
The fs gets created fine. But when I try to do:
fs.Write(msg, 0, msg.Length);
fs.Flush();
it fails with the error
"IO operation will not work.
Most likely the file will become too long
or the handle was not opened to support synchronous IO operations."
int hr = System.Runtime.InteropServices.Marshal.GetHRForException(e);
gives hr as COR_E_IO (0x80131620).
The stack trace is as below.
System.IO.IOException: IO operation will not work. Most likely
the file will become too long or the handle was not opened
to support synchronous IO operations.
at System.IO.FileStream.WriteCore(Byte[] buffer, Int32 offset, Int32 count)
at System.IO.FileStream.FlushWrite()
at System.IO.FileStream.Flush()
at PInvoke.Program.Main(String[] args)
Can somebody point out what might be going wrong?
(For those who would prefer to cover their ears and hum when
the topic of managed code arises,
change FileStream to fdopen.)
The comment on the line
0xa0000000, // FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH
was provided by the customer, and that's the key to the problem.
It was right there in the comment, but the customer didn't understand
the consequences.
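For readers decoding the raw numbers in the customer's call, here is a small sketch that spells them out as named constants. The values are the ones defined in the Windows SDK headers; the class name CreateFileConstants is made up for this illustration, and .NET itself does not ship these names.

```csharp
using System;

static class CreateFileConstants
{
    // Values copied by hand from the Windows SDK headers.
    public const uint GENERIC_WRITE           = 0x40000000;
    public const uint FILE_FLAG_NO_BUFFERING  = 0x20000000;
    public const uint FILE_FLAG_WRITE_THROUGH = 0x80000000;

    static void Main()
    {
        // The customer's dwFlagsAndAttributes argument, rebuilt from its parts.
        uint flags = FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH;
        Console.WriteLine("0x{0:x8}", flags);         // prints 0xa0000000

        // The customer's dwDesiredAccess argument is plain write access.
        Console.WriteLine("0x{0:x8}", GENERIC_WRITE); // prints 0x40000000
    }
}
```

Naming the constants does not change behavior, but it makes it harder to overlook that the handle is being opened unbuffered.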
As the documentation for CreateFile notes,
the FILE_FLAG_NO_BUFFERING
flag requires that all I/O operations on the file handle
be in multiples of the sector size, and that the I/O buffers also
be aligned on addresses which are multiples of the sector size.
Since you created the file handle with very specific
rules for usage, you have to make sure that everybody who
uses it actually follows those rules.
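To make the two rules concrete, here is a minimal sketch of what checking them might look like. The sector size of 4096 is an assumption for illustration only (real code would have to query the volume), and the names SectorRules, IsSectorMultiple, and IsSectorAligned are hypothetical helpers, not framework APIs.

```csharp
using System;
using System.Runtime.InteropServices;

static class SectorRules
{
    // Assumed value for illustration; the real sector size depends on the volume.
    public const int AssumedSectorSize = 4096;

    // True when a transfer length is a whole number of sectors.
    public static bool IsSectorMultiple(long byteCount, int sectorSize)
    {
        return byteCount % sectorSize == 0;
    }

    // True when a buffer address falls on a sector boundary.
    public static bool IsSectorAligned(IntPtr address, int sectorSize)
    {
        return address.ToInt64() % sectorSize == 0;
    }

    static void Main()
    {
        byte[] msg = new byte[28];
        Console.WriteLine(IsSectorMultiple(msg.Length, AssumedSectorSize)); // prints False
        Console.WriteLine(IsSectorMultiple(8192, AssumedSectorSize));       // prints True

        // Pin the array so its address stays put while we look at it.
        GCHandle pin = GCHandle.Alloc(msg, GCHandleType.Pinned);
        try
        {
            // A managed array lands wherever the garbage collector placed it,
            // so nothing guarantees that this check passes.
            Console.WriteLine(IsSectorAligned(pin.AddrOfPinnedObject(), AssumedSectorSize));
        }
        finally
        {
            pin.Free();
        }
    }
}
```

A 28-byte message fails the length rule outright, and an ordinary managed byte array gives you no control over the address rule.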
On the other hand,
the FileStream object doesn't know about these rules.
It just figures you gave it a handle that it can issue normal
synchronous ReadFile and WriteFile calls on.
It doesn't know that you gave it a handle that requires
special treatment.
And then the attempt to write to the handle with a plain
WriteFile fails both because the number
of bytes is not a multiple of the sector size and because
the I/O buffer is not sector-aligned, and you get the
I/O exception.
The solution to this problem depends on what you are trying to accomplish.
Why are you passing the
FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH flags?
Are you doing this just because you overheard in the hallway that it's
faster?
Well, yes, it may be faster under the right circumstances,
but in exchange for the increased performance, you also have to follow
a much stricter set of rules.
And in the absence of documentation to the contrary,
you can't assume that a chunk of code actually adheres to your
very special rules.
Like "What if two people did this?",
this is an illustration of
another principle that many people forget to consider
when working with objects they didn't write:
"When you write your own code, do you do this?"
It's sort of like the Golden Rule of programming.
Suppose you have a function which accepts a file handle
and whose job is to write some data to that file handle.
Do you write your function so that it performs all its
I/O in multiples of the sector size from buffers which are
aligned in memory in multiples of the sector size,
on the off chance that somebody gave you a handle that was
opened with the FILE_FLAG_NO_BUFFERING flag?
Well, no, you don't.
You just call WriteFile to write to it,
and if you want to write 28 bytes, you write 28 bytes.
Even if you perform internal buffering and your buffer size
happens to be a multiple of the sector size by accident,
you still don't align your I/O buffer to the sector size;
and when it's time to flush the final partially-written buffer,
you have a not-sector-multiple write at the very end anyway.
If you don't handle this case in your code,
why would you expect others to handle it in their code?
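The point about the final flush can be sketched with a little arithmetic. This is not how FileStream is implemented; it is a hypothetical model of a simple writer that fills a fixed-size internal buffer, issues a write each time the buffer fills, and flushes whatever is left at the end. The 512-byte sector size and the name WriteSizes are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

static class BufferedWriteSizes
{
    // Sizes of the underlying writes issued by a simple writer that flushes
    // its buffer whenever it fills, plus one final flush for the remainder.
    public static List<int> WriteSizes(int totalBytes, int bufferSize)
    {
        var sizes = new List<int>();
        int remaining = totalBytes;
        while (remaining >= bufferSize)
        {
            sizes.Add(bufferSize);
            remaining -= bufferSize;
        }
        if (remaining > 0)
        {
            sizes.Add(remaining); // the final, partially filled buffer
        }
        return sizes;
    }

    static void Main()
    {
        const int assumedSectorSize = 512; // assumption for illustration
        foreach (int size in WriteSizes(10000, 4096))
        {
            Console.WriteLine("{0} bytes, sector multiple: {1}",
                              size, size % assumedSectorSize == 0);
        }
        // prints:
        // 4096 bytes, sector multiple: True
        // 4096 bytes, sector multiple: True
        // 1808 bytes, sector multiple: False
    }
}
```

Even with a buffer size that happens to be sector-friendly, the last write breaks the rule unless the code was deliberately written to pad it.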
We've seen this principle before,
such as when we looked at
whether the Process.Refresh method
refreshes an arbitrary application's windows.
@Ken Hagan,
There is no special Win32 error code for such a case. WriteFile won’t tell you that it failed because of misaligned IO.
Do people really need to code explicit number for requested access and flags for CreateFile in C#? That doesn’t look right.
And the caller might not understand what WRITE_THROUGH is for. It will slow the writes down quite a bit.
"Do people really need to code explicit number for requested access and flags for CreateFile in C#? That doesn’t look right."
There’s no way to import the API headers directly into C#. You can make your own enum or consts if you feel like it, but there’s nothing requiring you to do so.
As for what’s already there; for flags, there’s an enum FileOptions with no matching flag for FILE_FLAG_NO_BUFFERING [which is another hint that .NET isn’t designed to be able to use it] – and for access there’s FileAccess with no matching flag for… er, whatever 0x40000000 is supposed to be.
Well since managed code can’t construct a byte array that refers to a given pointer and length and buffers for FILE_FLAG_NO_BUFFERING are supposed to be allocated by VirtualAlloc I really don’t think this is such a good idea either.
I’m sure someone had a good use for FILE_FLAG_NO_BUFFERING but the only time I ever used it is when I happened to meet the rules by accident.
Maybe it’s just me (I’m a C# .NET guy) but I’m pretty confused about why someone would P/Invoke and use Win32 functions to open/create a file when they’re using .NET, which has managed functions to accomplish the same thing.
In fairness, whoever wrote the error text didn’t help matters.
"Most likely the file will become too long or the handle was not opened to support synchronous IO operations."
I’d say "most likely" no-one has ever collected statistics on what actually causes this exception in practice on customer systems.
I’d also hazard a guess that at the point where the error was detected, there was enough information to say "you opened the handle for unbuffered access but then attempted a mis-aligned IO operation", but by the time the error percolated back to user space it had been replaced by something entirely generic and uninformative.
Perhaps the text *should* have read, "If you are the developer of this software, try debugging on a checked build, where there is some chance of a more informative message appearing in the debug window. If you are the end-user, you’ll have to ask the developer why they’ve failed to handle this error themselves and have instead used FormatMessage to throw an implementation detail in your face.". That may sound verbose, but in fact it can be re-used as the text for nearly all error codes so you’d get a (slightly) smaller Windows build and developers would be less inclined to delegate their responsibilities for error diagnosis and reporting to a dumb API like FormatMessage.
The only flaw in this plan is that MS don’t really support the checked build. It’s typically months, if not years, behind the retail build in terms of bug fixes and a simply amazing number of device drivers blue-screen the system. (Well, *I* think it’s amazing. I’d have thought that "not BSOD-ing the checked build at boot time" would be one of the requirements for WHQL signing, but experimentally this is not the case.)
I believe it’s better to trust the OS on virtual memory and I/O scheduling instead of trying to do it yourself. There are so many ways you can shoot yourself in the foot, and so many cases that you’re not even aware of (for example did you take ordering requests into consideration?).
Of course there are always exceptions. Maybe you’re building a database system. Then the question is whether C++ would be a better choice for such a case.
Anyway, Raymond, thanks for the post. It’s an interesting reminder about premature optimization.
It’s sad that there are people who feel the need to clutter up their C# code with ugly Kernel32 calls just because they heard it might be slightly faster. And then it doesn’t even work. I’d rather have my file writes be a little bit slower in exchange for just creating the file with
new FileStream("foo.txt", …)
This kind of thing can happen if you’re porting C++ code to C#.
Here you see the CreateFile() uses these special flags, and you have no hint why these flags are there, so you want to keep it.
Given you’ve seen examples that you can use CreateFile with FileStream involving COM ports or other direct device names, you can run into this kind of problem easily.
I wonder what happens if you try to create a FileStream out of say a Winsock socket.
TJ: The typical reason to use P/Invoke to open a file is that you want to use a filename that the CLR doesn’t allow. Examples of this are devices, raw disks, and long (>MAX_PATH) filenames.
My very first .Net app was a POS system that needed to print receipts via a serial port, and the first release of .Net didn’t provide a SerialPort class. Since the file APIs didn’t let me open the device, I had to use P/Invoke.
I agree the error message is probably not the best. I would say that exception messages should never try to guess the *cause* of the error, and just report the error itself. Something like "IO operation not supported" would have been enough.
It might not have helped the programmer in this case, but at least there’s no chance of it sending you down the wrong path…
The only time I’ve seen this combination of flags used is SQL Server’s transaction logs. SQL Server needs to be sure that the data was fully flushed to the disk before telling the client that the transaction was committed. It can then write back the data at its leisure.
However, consumer disks will commonly ignore the no-buffering flag and actually buffer on the drive’s own cache, which can lead to writes not actually being recorded on the disk when a failure occurred, even though the drive said it had written the data. That can cause problems with database recovery. (It can also cause problems with NTFS, other journalled file systems, and anything else that uses the write-ahead logging protocol.)
Implementing write-ahead logging in C# could be done, but as Raymond says, you have to meet the requirements. You might well be better off just P/Invoking WriteFile rather than trying to wrap it in a FileStream. Use a SafeFileHandle if you’re using .NET 2.0 or later.
Why "hearing" when you can actually *measure* it? If you align your buffers on page boundaries and sector sizes, then you help the memory manager and all participants in the i/o stack – the file system, the volume manager, the drive controllers, heck the firmware that remaps sectors will be grateful to you too! That’s a common sense, backed by numerous speed tests. Obviously, no caching and/or immediate write should be used ONLY when required, which usually is when you are writing some kind of backup application or write down live video stream. For everyday programs, they actually will do more harm, than benefit. Actually, if you omit them, you allow the cache manager to do lazy writes in large chunks on background (most probably in another thread), so your application is faster when you do NOT use them :)
On why do you want to p/invoke CreateFile? Well, aside from using special flags, you may want to use special *names* – for example \\?\volume{xxxxx}\my-path\my-file.
As to why .Net does not have these flags as enums – I think it’s because the .Net I/O classes are platform-agnostic, so they cannot expose Windows-only functionality as FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH. Remember, the core of .Net is an international standard and is OS- and hardware- portable!
@Teo:
"As to why .Net does not have these flags as enums – I think it’s because the .Net I/O classes are platform-agnostic, so they cannot expose Windows-only functionality as FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH. Remember, the core of .Net is an international standard and is OS- and hardware- portable!"
This is hilarious.
The reason .NET doesn’t have these flags is quite simply because those flags are not and were not ever designed for use in .NET. .NET is in this case directly invoking an API that exists outside of the .NET framework.
The developers of that API provide headers that include symbolic constants for values of meaning to that API. .NET doesn’t/can’t use those headers.
If you use such API’s on a regular basis you can of course translate those headers for more convenient on-going use with .NET, just as you can translate those headers for use in Pascal or other languages able to use those API’s directly but the use of which is not directly supported by the provider of the API itself.
There is no need to invent some pretentious and self-righteous/aggrandising bullsh*t justification. "International hardware portable standard" indeed. ROTFLMAO
There would be no problem with a framework with such ambitions providing platform specific support for specific platforms, should it so desire. That would not detract or impinge on the delivery of such ambitions at the core.
(i.e. a System.Windows namespace containing Windows specific file I/O classes, only available on the Windows platform variant of the .NET framework)
The omission in this case is simply, as in so many glaring gaps in the .NET framework, one of oversight or simple incompleteness.
Geeez, I haven’t laughed so much in a long time, so I guess I should at least thank you for that.
@Anon
"The omission in this case is simply, as in so many glaring gaps in the .NET framework, one of oversight or simple incompleteness."
Or the fact that for 99.99999% of people using .NET, the standard IO libraries work well enough that they never need to bother with P/Invokes.
And 99.99999% of the time, if you’re using P/Invokes for performance reasons, you’re either optimising the wrong code or you should be using a different language.
@Teo
You might want to look at the Microsoft.Win32 namespace to see how wrong you are.
I would like to discuss the double unknown values.
An "unknown" result should be every value not defined, not a specific defined one.
@The_Assimilator : Well, why do you think that all the stuff in Microsoft.Win32 namespace is there in the first place, instead of various System.* where it actually belongs? Indeed, it’s there because there are platform-independent and Windows-only parts of .Net, and when you *explicitly* say "using Microsoft" in your code, you know what you are doing and that your code won’t be portable.
@Anon: My English is not sufficient to process your message but I am happy that I make you smile (even for the wrong reason).