How can I preallocate disk space for a file without it being reported as readable?

Comments (18)

Matteo Italia says:

July 14, 2016 at 10:21 am

Given that a file is more like an std::deque than an std::vector (data is allocated in biggish chunks and is never copied around) it’s not really clear what kind of performance advantage they are after by preallocating everything; after all, deque itself doesn’t have a reserve method because it’s mostly useless. Even the additional locality shouldn’t matter much, given that a log is normally append only (and is read sequentially). Maybe the customer had some mistaken idea about the inner workings of the file system?
1. Brian says:
  
  July 14, 2016 at 12:06 pm
  
  I see two reasons they may want to do this:
  1) you get fail on open (well, failure around the time the Open happens) rather than fail on write
  2) you get to reserve your space up front, making sure that no other user of the volume can take the space you need (again working to prevent fail on write). It’s like sending someone into the movie theatre early to reserve 8 seats before anyone else arrives.
  1. DWalker says:
    
    July 15, 2016 at 7:28 am
    
    But… but.. let’s say the disk only had 50GB of space left. Which is better: To write 50GB of log, and then fail, or to fail when trying to create a 100 GB log file? In the second choice, nothing gets logged and the program might not even start.
    1. Brian says:
      
      July 15, 2016 at 7:41 am
      
      In that case, in a program bigger than a “little program”, you have a second failure path that creates a smaller log file. In that log file, you write “Couldn’t create Log file – Quitting” (or, you steal as much space as makes sense and pre-allocate a smaller log). The idea is to reduce the likelihood of a fail-on-write-to-the-log as much as you can.
      1. Kevin says:
        
        July 16, 2016 at 9:02 pm
        
        Logging is a particularly good example of this pattern because it is a cross-cutting concern. A well-written application is likely to perform logging calls at many different layers of abstraction and in many different contexts. It is not practical to correctly handle a logging failure at every one of these call sites, so most sane logging frameworks just swallow logging errors silently (with perhaps a message to stderr, if you’re lucky). In this regard it is much like how many garbage-collected languages handle throwing finalizers: you can’t clean up from a failed cleanup, nor is the application in a good position to decide what to do about it, so just ignore it and destroy the object anyway.
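
The fallback path Brian describes in this thread (try for the full log size, then settle for something smaller) might look roughly like the sketch below. The halving strategy, the function name `reserve_with_fallback`, and the `try_reserve` callback are all assumptions made for illustration; nobody in the thread specifies how much smaller the second attempt should be.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper: ask for `wanted` bytes and, on failure, keep halving
// the request until it succeeds or would fall below `minimum`.
// `try_reserve` stands in for whatever preallocation call the program uses.
// Returns the number of bytes reserved, or 0 if nothing could be reserved.
long long reserve_with_fallback(long long wanted, long long minimum,
                                const std::function<bool(long long)>& try_reserve) {
    assert(minimum > 0 && wanted >= minimum);
    for (long long size = wanted; size >= minimum; size /= 2) {
        if (try_reserve(size)) {
            return size;  // got a (possibly smaller) reservation
        }
    }
    return 0;  // caller decides: complain on stderr, run without a log, or quit
}
```

With DWalker's numbers (100 GB wanted, 50 GB free), the first attempt fails and the second, at 50 GB, succeeds, so the program still starts with a usable log.
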
Fleet Command says:

July 14, 2016 at 10:45 am

What are those “LL” in front of 1024? What do they do?
1. TimothyB says:
  
  July 14, 2016 at 11:27 am
  
  It’s a C++ number suffix to say that the constant is a long long.
2. Matt Denham says:
  
  July 14, 2016 at 11:37 am
  
  It indicates that they’re of type “long long”, which is important here mostly to ensure that the multiplication ends up with the correct type (if it stayed in a 32-bit type, it’d end up as 0 instead since 100GB = 0 modulo 2^32).
3. Steve says:
  
  July 14, 2016 at 11:55 am
  
  LL indicates that the integer literal should be treated as type “long long”.
  
  http://en.cppreference.com/w/cpp/language/integer_literal
4. Brian says:
  
  July 14, 2016 at 12:01 pm
  
  From https://msdn.microsoft.com/en-us/library/c70dax92.aspx
  To specify an unsigned type, use either the u or U suffix. To specify a long type, use either the l or L suffix. To specify a 64-bit integral type, use the LL, or ll suffix.
  1. Fleet Command says:
    
    July 15, 2016 at 7:12 am
    
    Thanks for the collective answers. :)
    
    Why can’t the compiler decide that the way the .NET Framework interpreter and the Delphi compiler do? Is this some sort of power-developer feature?
    1. Brian says:
      
      July 15, 2016 at 7:37 am
      
      Well, things like C++, C# and Delphi are different languages.
      auto myint = 0;
      auto mylong = 0L;
      auto myreallylong = 0LL;
      Create two 32-bit numbers (one an int and the other a long) and a 64-bit “long long” in MS C++ (remember, C++ does not specify the bit length of its types). In C#:
      var myint = 0;
      var mylong = 0L;
      specify 32-bit and 64-bit integers (in the .NET world, the bit-length of integral types is part of the standard).
      1. Fleet Command says:
        
        July 15, 2016 at 8:35 am
        
        In a way, I was asking why “C++ does not specify the bit length of its types”? But I guess you implied the answer: the same reason that the Wright brothers’ plane didn’t have a jet engine. So, thanks.
    2. Wear says:
      
      July 15, 2016 at 3:28 pm
      
      .NET actually has the same issue.
      
      long l = 1024 * 1024 * 1024 * 100; -> “Error CS0220: The operation overflows at compile time in checked mode”
      Dim l As Long = 1024 * 1024 * 1024 * 100 -> “Error BC30439: Constant expression not representable in type ‘Integer'”
      
      The compiler treats the literals as int32s and performs int32*int32 multiplication on them, which overflows. If you add the L suffix everything works because now the literals are all int64s and you are performing int64*int64 multiplication.
      
      long l = 1024L * 1024L * 1024L * 100L;
      Dim l As Long = 1024L * 1024L * 1024L * 100L
      1. Fleet Command says:
        
        July 16, 2016 at 8:25 am
        
        Yes. Interesting how I never ran into this on .NET: I never had to manually assign a very large number to my variables during my career.
      2. cheong00 says:
        
        July 17, 2016 at 8:04 pm
        
        Actually, you might need the suffix to declare constants for use in Interop too.
        
        Taking an example from a recent support case in the MSDN forum:
        Public Enum ACCESS_MASK As UInteger
        '…
        GENERIC_READ = &H80000000UI
        '…
        End Enum
        
        Try taking away “UI” at the end and see if it still compiles.
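
The point made earlier in this thread about why the LL suffix matters can be checked with a few lines of C++. This is only a sketch: the names `kLogSize`, `wrapped_log_size`, and `check_sizes` are made up here, and the 32-bit case assumes `unsigned int` is 32 bits wide, as it is on mainstream compilers.

```cpp
#include <cassert>
#include <cstdint>

// 100 GB computed in 64-bit arithmetic: the LL suffix makes every operand a
// long long, so no intermediate product is truncated.
constexpr long long kLogSize = 100LL * 1024LL * 1024LL * 1024LL;

// The same product in 32-bit unsigned arithmetic wraps modulo 2^32.
// Unsigned is used on purpose: overflowing a signed int is undefined
// behavior rather than a well-defined wraparound.
constexpr std::uint32_t wrapped_log_size() {
    return std::uint32_t{100} * 1024u * 1024u * 1024u;
}

void check_sizes() {
    assert(kLogSize == 107374182400LL);  // 100 * 2^30
    assert(wrapped_log_size() == 0u);    // 100 * 2^30 = 25 * 2^32, so it wraps to 0
}
```
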
Scarlet Manuka says:

July 14, 2016 at 8:53 pm

Reserve disk space with this one weird trick!

(Sorry, couldn’t help myself. Looks like quite a neat solution actually.)

I agree with Brian: it’s not a bad thing to make sure you have space to write to your log file. If something goes wrong and the disk fills up, at least you can write ‘couldn’t generate output, disk full’ to your log file. Yes, you’d probably find out from disk usage monitoring, but having it in the log can save time troubleshooting, particularly if it’s only a brief condition – for instance if your app cleans up a large output file after a failed write.
Erik F says:

July 15, 2016 at 10:34 am

This seems similar to fallocate() on Linux, which I have used a couple of times to achieve the same sort of result. File transfer programs don’t seem to use it, but I think this method would be handy when you are copying a large file, because you can guarantee beforehand that the copy won’t fail due to lack of space on the destination. I’m sure there’s a good reason they don’t, but off the top of my head I can’t think of what that might be.

The documentation for SetFileInformationByHandle() seems to imply that not all file systems support all features: is there any documented guidance regarding what common file systems support which information class?
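
For readers who want to try the Linux analogue mentioned above, here is a minimal sketch. It assumes Linux with glibc and a file system that supports fallocate(); choosing the FALLOC_FL_KEEP_SIZE mode is an assumption about which mode best matches the "reserved but not readable" behavior discussed in the article, and the helper names `reserve_keep_size` and `reported_size` are invented for this example.

```cpp
#include <cassert>
#include <fcntl.h>     // open(), fallocate(), FALLOC_FL_KEEP_SIZE (Linux with glibc)
#include <sys/stat.h>  // fstat()
#include <unistd.h>    // close(), unlink()

// Ask the file system to reserve `bytes` of disk space for `fd` without
// changing the file's reported size. Returns false if the call fails, for
// example when the volume is full or the file system lacks support.
bool reserve_keep_size(int fd, off_t bytes) {
    assert(fd >= 0 && bytes > 0);
    return fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, bytes) == 0;
}

// Size of the file as seen by readers, or -1 if fstat() fails.
off_t reported_size(int fd) {
    struct stat st {};
    return fstat(fd, &st) == 0 ? st.st_size : static_cast<off_t>(-1);
}
```

After a successful `reserve_keep_size(fd, n)` on a newly created file, `reported_size(fd)` should still be 0, so a reader hits end-of-file immediately even though the blocks are set aside.
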


Date:	July 14, 2016 / year-entry #147
Tags:	code
Orig Link:	https://blogs.msdn.microsoft.com/oldnewthing/20160714-00/?p=93875
Comments:	18
Summary:	Set the file allocation information.