Date: January 26, 2018 / year-entry #23
Tags: code
Orig Link: https://blogs.msdn.microsoft.com/oldnewthing/20180126-00/?p=97905
Comments: 7
Summary: Some memory mapping magic.
Last time, we described how you can become the page access manager for a range of pages, but it required that all the accesses came from one thread at a time because you don't want another thread to be able to access the memory while it is still being prepared. That requirement exists because we are preparing the pages in place, and once you unprotect the page so you can prepare the page, another thread can sneak in and see the pages before they're ready. Let's see what we can do to get this to work in the multithreading case, too.
Unfortunately, I don't see a version of VirtualAlloc that lets you take already-committed memory and map it into a new address range. So we'll have to use a different trick: mapping the same block of memory into two locations. We'll take the trick a step further and map the same memory twice, but with different permissions.
First, create a shared memory block with CreateFileMapping, passing INVALID_HANDLE_VALUE as the file handle so that the memory is backed by the page file rather than by a file on disk. Next, map the shared memory block with MapViewOfFile, then use VirtualProtect to mark all of its pages as PAGE_NOACCESS. This no-access view is the one you hand out to the client.
When an access violation occurs and you want to swizzle some memory and map it in, here's what you do: Use the faulting address to figure out which page of data needs to be swizzled and mapped in. Use some sort of synchronization to make sure only one thread is doing the swizzling for this page. If you discover that the page has already been swizzled, then you are done, because the other thread already did the work for you.
Otherwise, you are the first thread to handle the access violation.
Find the corresponding page in your file mapping and use MapViewOfFile to create a second view of that page, this time with read-write access. Use this second view to create the data that you eventually want to make visible to the client. Note that we have two views of the same data: a no-access view that the client knows about and a read-write view that only you know about. When you're happy with the page of data, you can unmap the second view, since you don't need it any more.
Use VirtualProtect to change the protection of the client-visible page from PAGE_NOACCESS to PAGE_READONLY. The page of data is now visible to the client, and the client can read from it.
Similarly, when you encounter a write access violation on a page in the client-visible view, you mark the page as dirty and upgrade the page to PAGE_READWRITE.
Notice that the client-visible file mapping now contains a mix of no-access pages, read-only pages, and read-write pages.

There are some obvious optimizations you can perform here. First of all, you don't have to create a single file mapping for everything. Creating the file mapping will take a commit charge for the entire size of the mapping, even if you end up not using all of it. Instead, you can start with a small file mapping (say, one megabyte), and when you use up all those pages, you create a new file mapping to hold the next megabyte. This creates extra bookkeeping for your page management code, but you won't have more than a megabyte of "extra" memory committed.

Another optimization is to cache the views that you use to prepare the swizzled pages. At one extreme, you could just map them in as read-write and just leave them mapped indefinitely. Or you could keep the few most recent views around, hoping for data locality.

Anyway, that's the sketch of how you can have a process-wide block of user-mode-managed addresses where you control what happens the first time the client reads from or writes to that page.
Comments (7)
I’ve long wondered something about mapping the same memory at two different addresses. If you program in assembly, then it makes perfect sense. If you program in some high-level language, then maybe it also makes sense, depending on the language spec. But what if you program in C? Consider:
#include <assert.h>

void f(int* x, int* y) {
    if (x == y) return;
    *x = 4;
    *y = 8;
    assert(*x == 4); // can this check be optimized out??
}
The pointers are not const, so we are allowed to write to them.
There are no threading games going on.
They are not volatile, so presumably anything we wrote we can read back unchanged.
But if x and y are two mappings of the same memory, this fails.
Does this mean that to be fully standards compliant, we should declare them volatile, or is there some subtlety at work here?
The use of a mapped address is like the use of a "pointer to pointer" in C.
In this case: you have two pointers. One points to the memory directly; the other points to some memory location that contains the address of the memory referenced by the first pointer.
Yes but the question here was what the compiler’s view on this would be, right?
I guess it won't optimize this away, because as soon as you dereference a pointer, the compiler can never know what value you'll get, even if you accessed the same pointer one line above. Even when your application is single-threaded, *another process* may have changed the memory in the meantime (in the case of shared memory), or maybe you accessed some hardware I/O that is done through memory access (there are even cases in which a "write" is actually a call to a kind of "function implemented by hardware" and the "read" will give you something entirely different). So I would assume that this check can never be optimized out.
C and C++ have something called the Strict Aliasing Rule. It states that no two parameters to the same function that are pointers to different fundamental types may alias. Implied by this wording is that if they *are* of the same fundamental type, then they *might* alias, which rules out certain optimizations. In some variants, such as C99 with its “restrict” keyword, you can actually tell the compiler, “Even though there are other pointer parameters of the same type, I promise they won’t alias, go ahead and optimize!”
But here, we explicitly check that x and y are not equal, so the compiler “knows” they don’t alias.
Huh. It wouldn’t have occurred to me that you could change the VirtualProtect settings on a block of memory that came from MapViewOfFile. I would’ve expected that MapViewOfFile “owns” that memory. Changing its protection feels like visiting someone’s house and replacing their curtains.
The missing version of VirtualAlloc would be quite useful for expanding sparse matrices, without having to copy them to the new location. Ran out of column indices? VirtualAlloc a new larger block of memory, and reuse the existing memory block for the known data.