Date: August 23, 2004 / year-entry #315
Tags: history
Orig Link: https://blogs.msdn.microsoft.com/oldnewthing/20040823-00/?p=38073
Comments: 10
Summary: If you've messed with the shell namespace, you've no doubt run across the kooky STRRET structure, which is used by IShellFolder::GetDisplayNameOf to return names of shell items. As you can see from its documentation, a STRRET is sometimes an ANSI string buffer, sometimes a pointer to a UNICODE string, sometimes (and this is the kookiest...
If you've messed with the shell namespace, you've no doubt run across the kooky STRRET structure, which is used by IShellFolder::GetDisplayNameOf to return names of shell items. As you can see from its documentation, a STRRET is sometimes an ANSI string buffer, sometimes a pointer to a UNICODE string, sometimes (and this is the kookiest bit) an offset into a pidl. What is going on here?

The STRRET structure burst onto the scene during the Windows 95 era. Computers during this time were still comparatively slow and memory-constrained. (Windows 95's minimum hardware requirements called for 4MB of memory and a 386DX processor, which ran at a whopping 25MHz.) It was much faster to allocate memory off the stack (a simple "sub" instruction) than to allocate it from the heap (which might take thousands of instructions!), so the STRRET structure was designed so that the common (for Windows 95) scenarios could be satisfied without needing a heap allocation.

The STRRET_OFFSET flag took this to an even greater extreme. Often, you kept the name inside the pidl, and copying it into the STRRET structure would take, gosh, 200 clocks (!). To avoid this wasteful memory copying, STRRET_OFFSET allowed you to return just an offset into the pidl, which the caller could then copy out of directly. Woo-hoo, you saved a string copy.

Of course, as time passed and computers got faster and memory became more readily available, these micro-optimizations turned into annoyances. Saving 200 clock cycles on a string copy operation is hardly worth it any more. On a 1GHz processor, a single soft page fault costs you over a million cycles; a hard page fault costs you tens of millions. You can copy a lot of strings in twenty million cycles. What's more, the scenarios that were common in Windows 95 aren't quite so common any more, so the original scenario that the optimization was tailored for hardly occurs any more. It's an optimization that has outlived its usefulness.
Fortunately, you don't have to think about the STRRET structure any more. There are several helper functions (such as StrRetToStr and StrRetToBuf) that take the STRRET structure and turn it into something much easier to manipulate. The kookiness of the STRRET structure has now been encapsulated away. Thank goodness.
Comments (10)
Comments are closed.
Yes, on a 1GHz processor 1 million cycles is a millisecond, which seems way too long for a soft page fault. Raymond, surely you mean a hard page fault?
Hard page faults are killer since you are at the mercy of the disk drive. It’s not too unusual for this to be as slow as 10ms.
Soft page faults are more like 80,000 cycles according to this article http://msdn.microsoft.com/library/en-us/dnvc60/html/optcode.asp
I love the way the MSDN article "translates" times into "human" terms:
"Therefore, a typical "soft" page fault incurs a 200-microsecond penalty, which is 80,000 CPU cycles. To put that in human terms, if it took 1 second to read a byte from the primary CPU cache, it would take almost a day to process a page fault."
Erm, yeah, but your 10ms hard drive access would translate into a fifty day penalty. Kind of like physical-mailing off for a book from Botswana.
And, yes, this is a huge performance hit, no doubt. I just have a beef with their comparison.
Not all page faults incur a disk access. For instance, touching an uncommitted range will cause a page fault, but will not cause a pagefile hit (instead, you’ll eventually see a STATUS_ACCESS_VIOLATION exception in user mode).
I’m confused… how much did a page fault cost on a 25MHz machine then?
So STRRET is still around but just encapsulated?
Ah well, at least when it's encapsulated, the possibility exists to get rid of a dated idea.
Skywing: Right, that’s a soft page fault. That’s why there’s the differentiation
josh:
Well, since disks haven’t sped up THAT much, page faults took less clock cycles. So if disks are twice as fast, but processors are 10x faster, then page faults would take 5x less clock cycles on the slower machine. Same with RAM. RAM hasn’t kept up with CPU clock speeds, so you can do similar kinds of math for that.