Importance of alignment even on x86 machines

Date: August 27, 2004 / year-entry #320

Sometimes unaligned memory access will hang the machine.

Some video cards do not let you access all the video memory at one go. Instead, you are given a window into which you can select which subset of video memory ("bank") you want to see. For example, the EGA video card had 256K of memory, split into four 64K banks. If you wanted to access memory in the first 64K, you had to select bank zero into the window, but if you wanted to access memory in the second 64K, then you had to select bank one.
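As a rough illustration of the bank arithmetic, here is a minimal sketch in C. It assumes the simplified layout described above (256K seen through a single 64K window); the names BANK_SIZE, BankAddress, and split_linear_address are made up for this example and do not come from any real driver.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout from the EGA example: four 64K banks, 256K in total. */
#define BANK_SIZE  0x10000u
#define BANK_COUNT 4u
#define VIDEO_SIZE (BANK_SIZE * BANK_COUNT)

/* Hypothetical helper type: where a linear video offset lives. */
typedef struct {
    uint32_t bank;    /* bank that must be selected into the window */
    uint32_t offset;  /* byte offset inside the 64K window */
} BankAddress;

/* Translate a linear offset into video memory to (bank, offset in window). */
static BankAddress split_linear_address(uint32_t linear)
{
    assert(linear < VIDEO_SIZE);
    BankAddress a = { linear / BANK_SIZE, linear % BANK_SIZE };
    return a;
}
```

With these assumptions, offset 0xFFFF is the last byte of bank zero and offset 0x10000 is the first byte of bank one, so two adjacent bytes can require two different bank selections.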

Bank-switching makes memory access much more complicated. For example, if you want to copy a block of memory into bank-switched memory, you have to check when you are going to cross a bank boundary and break the copy up into pieces. If you are doing something that requires non-sequential access (say, drawing a diagonal line), you have to check when your line is going to cross into another bank.
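The "break the copy up into pieces" step might look something like the following sketch. The card is simulated with an ordinary array, and select_bank, window_base, and banked_copy are hypothetical names invented for this example; a real driver would program the card's bank register instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BANK_SIZE  0x10000u                  /* 64K window, as in the EGA example */
#define BANK_COUNT 4u
#define VIDEO_SIZE (BANK_SIZE * BANK_COUNT)  /* 256K in total */

/* Simulated card (an assumption): real hardware exposes one bank at a time. */
static uint8_t  g_video[BANK_COUNT][BANK_SIZE];
static unsigned g_current_bank;
static unsigned g_bank_switches;

/* Stand-in for writing the card's bank-select register. */
static void select_bank(unsigned bank)
{
    assert(bank < BANK_COUNT);
    g_current_bank = bank;
    g_bank_switches++;
}

/* The 64K window always shows whichever bank is currently selected. */
static uint8_t *window_base(void)
{
    return g_video[g_current_bank];
}

/* Copy len bytes to linear video offset dst, one bank-sized piece at a time.
   Returns 0 on success, or -1 if the range does not fit in video memory. */
static int banked_copy(uint32_t dst, const uint8_t *src, size_t len)
{
    if (dst > VIDEO_SIZE || len > VIDEO_SIZE - dst)
        return -1;

    while (len > 0) {
        uint32_t offset = dst % BANK_SIZE;
        size_t chunk = BANK_SIZE - offset;   /* bytes left in this bank */
        if (chunk > len)
            chunk = len;

        select_bank(dst / BANK_SIZE);
        memcpy(window_base() + offset, src, chunk);

        dst += (uint32_t)chunk;
        src += chunk;
        len -= chunk;
    }
    return 0;
}
```

Copying eight bytes to offset BANK_SIZE - 3 takes two pieces under this model: three bytes at the end of bank zero, then five bytes at the start of bank one. Every caller that touches video memory has to do this kind of bookkeeping.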

To simplify matters, Windows 95 had a driver called VFLATD that made bank-switched memory look flat to the rest of the system. Flattening the bank-switched memory model was also crucial for DirectDraw support; in particular, the IDirectDrawSurface::Lock method gave you direct access to a (seemingly) flat expanse of video memory. For example, if the application wanted to see a 256K surface and accessed memory in the first 64K of memory, the VFLATD driver would select bank zero and map the 64K physical memory window into the first 64K of the virtual 256K memory window.

This worked great as long as everybody used only aligned memory accesses. But if you access unaligned memory, you can send VFLATD into an infinite loop and hang the machine.

Suppose you make an unaligned memory access that straddles two banks. This memory access can never be satisfied. A page fault is taken on the lower portion of the unaligned access, and VFLATD maps the lower bank into memory. Then a page fault is taken on the higher portion of the unaligned access, and VFLATD now has to map the upper bank; this unmaps the lower bank, since the video card is bank-switched and only one bank can be mapped at a time. Now a page fault is taken on the lower portion, and the infinite loop continues.
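The ping-pong can be reproduced with a toy model. This is only a sketch of the behavior described above, not VFLATD's actual code: handle_page_fault, find_fault, and try_access are hypothetical names, and the max_faults cutoff exists only so that the simulation terminates where the real machine would not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BANK_SIZE  0x10000u
#define BANK_COUNT 4u
#define VIDEO_SIZE (BANK_SIZE * BANK_COUNT)
#define NO_BANK    (-1)

/* Only one bank can be present in the flat view at any time. */
static int g_mapped_bank = NO_BANK;

/* Toy fault handler: map the faulting bank, which unmaps the previous one. */
static void handle_page_fault(uint32_t fault_addr)
{
    g_mapped_bank = (int)(fault_addr / BANK_SIZE);
}

/* Find the first byte of the access that is not currently mapped. */
static bool find_fault(uint32_t addr, uint32_t size, uint32_t *fault_addr)
{
    for (uint32_t i = 0; i < size; i++) {
        if ((int)((addr + i) / BANK_SIZE) != g_mapped_bank) {
            *fault_addr = addr + i;
            return true;
        }
    }
    return false;
}

/* Retry the access after each fault, as the CPU restarts the instruction.
   Returns the number of faults taken, or -1 if the access still has not
   completed after max_faults faults. */
static int try_access(uint32_t addr, uint32_t size, int max_faults)
{
    assert(size > 0 && size <= VIDEO_SIZE && addr <= VIDEO_SIZE - size);

    int faults = 0;
    uint32_t fault_addr;
    while (find_fault(addr, size, &fault_addr)) {
        if (faults == max_faults)
            return -1;               /* gave up: the access keeps faulting */
        handle_page_fault(fault_addr);
        faults++;
    }
    return faults;
}
```

In this model a four-byte access at 0xFFFC completes after a single fault, while a four-byte access at 0xFFFE never completes no matter how large max_faults is, because satisfying one half always evicts the other.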

Moral of the story: Keep those memory accesses aligned, even on the x86, which most people consider an architecture where it is "safe" to violate alignment rules.
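Why does alignment help? A naturally aligned access whose size is a power of two (and no larger than the bank) can never straddle a bank boundary, because the 64K bank size is a multiple of the access size. The sketch below makes both checks explicit; is_naturally_aligned and straddles_bank are hypothetical helper names, and the 64K bank size is the one assumed throughout this article.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BANK_SIZE 0x10000u   /* 64K banks, as in the EGA example */

/* True if addr is a multiple of size; size must be a power of two. */
static bool is_naturally_aligned(uint32_t addr, uint32_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return (addr & (size - 1)) == 0;
}

/* True if the first and last bytes of the access fall in different banks. */
static bool straddles_bank(uint32_t addr, uint32_t size)
{
    assert(size != 0 && addr <= UINT32_MAX - (size - 1));
    return addr / BANK_SIZE != (addr + size - 1) / BANK_SIZE;
}
```

For example, a four-byte access at 0xFFFC is aligned and stays inside bank zero, whereas the same access at 0xFFFE is unaligned and touches both bank zero and bank one.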

Next time, another example of how misaligned data access can create bugs on the x86.

Comments (17)
  1. How is memory banking different from considering memory as, say, 1024 8-bit memory banks in the case of 1KiB of memory? You have to specify the high bits to access higher memory.

  2. Did EGA really have 256K? I thought VGA cards started at 256K?

  3. mschaef says:

    The original EGA card had 64K standard and was upgradable to 256K. 640x350x16 colors requires around 110K.

    VGA was also 256K. IBM’s other PS/2 video standard, the MCGA (used on the Model 25 and 30), had 64K. It had the VGA’s 256 color 320×200 mode, as well as a 640×480 monochrome mode.

  4. Mike Dimmick says:

    Memory banking: you have to use some out-of-band mechanism (writing a value to a device register through a memory write or an I/O write) to change which bank of memory is visible through a ‘window’ of addresses. Example: the Sinclair ZX Spectrum 128K models did this to allow addressability of 128KB of RAM where the Z80 processor used could only address 64KB of addresses, including ROM (this computer used a separate I/O address space). 16K of the address space, between 0xC000 and 0xFFFF, could be mapped to any of 8 banks of memory. Banks 2 and 5 were permanently mapped at 8000 – BFFF and 4000 – 7FFF respectively. The hardware could also swap between two ROMs at 0000 – 3FFF on the cassette versions and an additional two ROMs (total of 4) on the disk versions. To configure the memory, you used I/O port 7FFD.

    Sorry, indulging my archaic computing knowledge.

    Similar concepts were available in Expanded Memory on the PC, using a window of memory addresses in the 640KB – 1MB region to map memory above 1MB on 16-bit machines in real mode. On 386 machines, the EMM386 TSR was capable of faking Expanded memory using Extended memory. Windows still does this for DOS programs, if required.

    Finally, AWE provides similar capabilities for switching reserved physical memory into your virtual address space. See Raymond’s series of /3GB articles at

  5. TristanK says:

    Ahh, sweet memory. I saved up for months for a Paradise EGA card (256K) for my 10MHz 640K PC XT. It allegedly allowed 640x480x16, though my monitor was capable of only 640x350x16.

    I don’t remember getting Windows 3.0 to work on it (vague memory of a memory limitation), but I managed to get it working in 640×200 mono CGA mode. I was happy, and it was slow, and I gave up and went back to playing Ultima 5.

    I think Speedball was the only game I remember that gave a choice between EGA (64K) and EGA (256K), and I never could tell the difference between the modes. I’m guessing that one was "better" cos it (possibly) used whole banks as back buffers…

  6. Adrian says:

    "But if you access unaligned memory, you can send VFLATD into an infinite loop and hang the machine."

    I wonder if that’s why so many of the machines I’ve owned will often hang when a DirectX application starts. I’ve had Win95, 98, and NT4 boxes that can’t launch more than one DirectX application per session. I still use the Win98 one, and, even with all the latest drivers, I run into this problem routinely. I set IE to prompt for ActiveX controls so gratuitous Flash advertisements don’t lock me up.

  7. rentzsch says:

    Couldn’t VFLATD be aware of the bank boundaries and make two consecutive reads itself? In crude ASCII art, assume we want to access ‘xxxx’ which spans two banks:


    VFLATD could read the lower bank first (possibly having to read up ‘aaxx’, and discarding the ‘aa’) and then the higher bank.

    I’d imagine this would slow down VFLATD in general, so maybe it was a performance call?

  8. RJ says:

    Tangential topic: "atomic" operations. The Interlocked functions also fail for about the same reason. But I guess the resulting bug is not as apparent as the machine hanging!

  9. Mike Dunn says:

    Ah, Ultima V, probably the best game I ever played on the old C=64.

  10. Anonymous Coward says:

    I once worked at ICL (it used to be a once proud British computer company somewhat similar to IBM in scope).

    At one point they made special EGA cards that had an extra display area on the bottom. It was intended for customer service agents. The intention was that normal programs ran in the EGA display area, and then caller id and other status information would appear in the special area.

    The resulting overall display area was 640×480 (ie VGA) but EGA used digital signals whereas VGA onwards has been analog signals. It took me quite a while to try to get the damn card to work properly since it wasn’t quite clear what the issue was (ordinary EGA or VGA drivers didn’t quite work). Then someone remembered how special the card was …

    ICL was also one of the companies that paid for MS-DOS 4. Not the MS-DOS 4 you know about now, but the special multi-tasking version. I keep kicking myself that I never saved a copy.

    And I only had a 16K ZX Spectrum. Most of the games in the magazines required the 48K model. Consequently when I typed them in I had to figure out how to make the code shorter. Other developers still complain about the terseness or even lack of my comments :-)

  11. AC: Drop me a line on my blog :) I may know you IRL. Especially since I wrote DOS 4.1 (ok, Kevin, Mike, and Matthew also did a lot of work too).

  12. Ahh… AC… how I loved the ZX Spectrum :)

  13. josh says:

    If you’ve got two windows or can align them on a finer granularity than the window size, this shouldn’t be a problem. But EGA probably doesn’t meet either of those conditions.

  14. Anonymous Coward says:

    I apologise in advance to Raymond for threadjacking, but would like to nominate the Spectrum ROM disassembly book as the greatest computer book of all time. You had to be there :-)

    If you weren’t, software and hardware developed differently between the US and the UK in the 80’s. The US developed (multiple) thousand dollar machines such as Apple II and the PC. Even the mid to top range Amiga and Commodore machines were approaching those sort of price points.

    On the other hand, the British market was very low. Most machines were around the 100 pound (GBP) mark, were more limited but innovative for the price, and very widely owned.

    Ultimately the US ended up being the leader in business software. A generation of British programmers grew up on the limited machines and turned out to be excellent game programmers. For the last decade, a lot of game software was written by those folks, and mostly published by US companies.

    See this page for more info on the Spectrum:

    Does Microsoft have any intention of a software museum? I am thinking downloadable copies of DOS 1.0 and Windows 1.0 that can be run in VirtualPC. Even a book with the dis-assembly of Bill’s first version of Basic :-)

  15. Norman Diamond says:

    8/27/2004 9:38 AM TristanK

    > Ahh, sweet memory. […]

    > […] (vague memory of a memory limitation),

    No wonder end users get confused by all those kinds of memories.

  16. JamesW says:

    Just to continue the threadjack – anyone who’s interested can download a pdf of the complete Spectrum ROM disassembly from here:

    Sinclair for Ever!

  17. Sweet nostalgia.

    I know people who would "die" for this book something like ten years ago. As this book was unavailable in Latvia (and in all ex-USSR) we had legends and mentions going around about this book, but nobody had ever seen the book itself :) (and now I see IT!)

    It’s most surprising how the ROM was developed so (almost) bug-free, if you think about the features it offers, as well as the programming conditions of that time (tape storage, lack of debuggers, memory limitations...). It’s a wonder ANYTHING got written at all :)


*DISCLAIMER: I DO NOT OWN THIS CONTENT. If you are the owner and would like it removed, please contact me. The content herein is an archived reproduction of entries from Raymond Chen's "Old New Thing" Blog (most recent link is here). It may have slight formatting modifications for consistency and to improve readability.
