Why does the Internet Explorer animated logo arrange its frames vertically?

Date:August 5, 2005 / year-entry #214
Tags:code
Orig Link:https://blogs.msdn.microsoft.com/oldnewthing/20050805-18/?p=34693
Comments:    17
Summary:If you ever tried to build a custom animated logo for Internet Explorer, you certainly noticed that the frames of the animation are arranged vertically rather than horizontally. Why is that? Because it's much more efficient. Recall that bitmaps are stored as a series of rows of pixels. In other words, if you number the...

If you ever tried to build a custom animated logo for Internet Explorer, you certainly noticed that the frames of the animation are arranged vertically rather than horizontally. Why is that?

Because it's much more efficient.

Recall that bitmaps are stored as a series of rows of pixels. In other words, if you number the pixels of a bitmap like this:

123
456
789

then the pixels are stored in memory in the order 123456789. (Note: I'm assuming a top-down bitmap, but the same principle applies to bottom-up bitmaps.) Now observe what happens if you store your animation strip horizontally:

12 34 56 78
AB CD EF GH

These pixels are stored in memory in the order 12345678ABCDEFGH. Drawing the first frame requires pixels 1, 2, A, and B. The second frame takes 3, 4, C, and D. And so on. Observe that the pixels required for each frame are not contiguous in memory. This means that they occupy different cache lines at least, and for a bitmap of any significant size, they also span multiple memory pages.

Now consider a vertically-arranged animation strip:

12
34
56
78
AB
CD
EF
GH

Again, the pixels are stored in memory in the order 12345678ABCDEFGH, [typo fixed, 15 Aug] but this time, the pixels of the first frame are 1, 2, 3 and 4; the second frame consists of 5, 6, 7, and 8; and so on. This time, all the pixels for a single frame are adjacent in memory. This means that they can be packed into a small number of cache lines, and reading the pixels for a single image will not force you to jump across multiple pages.
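
The two layouts can be compared with a small sketch. It is only an illustration of the row-major arithmetic described above, not code from Internet Explorer; the names `MEMORY`, `frame_offsets`, and `frame_pixels` are made up for this example, and the frames are the 2x2 frames from the diagrams.

```python
# The sixteen pixels in the order they sit in memory (same in both layouts).
MEMORY = "12345678ABCDEFGH"
FRAME_W = FRAME_H = 2   # each frame is 2x2 pixels
FRAMES = 4              # four frames in the strip


def frame_offsets(frame, horizontal):
    """Return the memory offsets of one frame's pixels, row by row."""
    if horizontal:
        stride = FRAME_W * FRAMES        # a row spans all four frames
        x0, y0 = frame * FRAME_W, 0
    else:
        stride = FRAME_W                 # a row spans a single frame
        x0, y0 = 0, frame * FRAME_H
    return [(y0 + dy) * stride + x0 + dx
            for dy in range(FRAME_H)
            for dx in range(FRAME_W)]


def frame_pixels(frame, horizontal):
    """Return the pixel labels that make up one frame."""
    return "".join(MEMORY[i] for i in frame_offsets(frame, horizontal))


for name, horizontal in (("horizontal", True), ("vertical", False)):
    for frame in range(FRAMES):
        print(name, frame, frame_offsets(frame, horizontal),
              frame_pixels(frame, horizontal))
```

In the horizontal strip, the first frame's offsets jump from 1 to 8; in the vertical strip, every frame is a single run of four consecutive offsets.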

Let's illustrate with some pictures: Let's say that the large animation is a series of twelve 38x38 frames, for a total bitmap dimension of 38x456. Let's assume further, for the sake of example, that it's a 32bpp bitmap and that the page size is 4KB.

If the bitmap were stored as a horizontal strip (456x38), then the memory layout would look like this, where I've color-coded each memory page.

Observe that no matter which frame you draw, you will have to touch every single page, since each frame contains a few bytes from each page.

Storing the bitmap vertically, on the other hand, arranges the pixels like so:

Notice that with the vertical strip, each frame touches only two or three pages; compare the horizontal strip, where each frame touches seventeen pages. This is quite a savings especially when you realize that most of the time, the only frame being drawn is the first one. The other frames are used only during animation. In other words, this simple change trimmed 60KB out of the normal working set.
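
The page counts can be checked with a short calculation. This is a sketch under the assumptions stated in the example (twelve 38x38 frames, 32bpp, 4KB pages) plus one assumption of my own: that the pixel data begins exactly on a page boundary. The function name `pages_touched` is hypothetical.

```python
PAGE_SIZE = 4096        # 4KB pages, as assumed in the example
BYTES_PER_PIXEL = 4     # 32bpp
FRAME = 38              # frames are 38x38 pixels
FRAMES = 12             # twelve frames in the strip


def pages_touched(frame, horizontal):
    """Count the distinct pages that hold at least one byte of a frame.

    Assumes the pixel data starts on a page boundary.
    """
    if horizontal:
        stride = FRAME * FRAMES * BYTES_PER_PIXEL   # 456 pixels per row
        x0, y0 = frame * FRAME, 0
    else:
        stride = FRAME * BYTES_PER_PIXEL            # 38 pixels per row
        x0, y0 = 0, frame * FRAME
    pages = set()
    for y in range(y0, y0 + FRAME):
        start = y * stride + x0 * BYTES_PER_PIXEL   # first byte of this row
        end = start + FRAME * BYTES_PER_PIXEL - 1   # last byte of this row
        pages.update(range(start // PAGE_SIZE, end // PAGE_SIZE + 1))
    return len(pages)


horizontal_counts = [pages_touched(f, True) for f in range(FRAMES)]
vertical_counts = [pages_touched(f, False) for f in range(FRAMES)]
saved_kb = (horizontal_counts[0] - vertical_counts[0]) * PAGE_SIZE // 1024

print("horizontal:", horizontal_counts)
print("vertical:  ", vertical_counts)
print("first-frame savings in KB:", saved_kb)
```

Under those assumptions, every frame of the horizontal strip touches all seventeen pages, every frame of the vertical strip touches two or three, and the first frame alone accounts for the 60KB difference (fifteen pages of 4KB each).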


Comments (17)
  1. Mike says:

    You guys must have some pretty amazing tools to be able to spot a 60k drop in working set due to page fragmentation. What sort of profiling setups do you have? The best I’ve used is Massif, which gives you a pretty space/time graph, but given what I’ve seen of most apps 60k for the spinning logo would be lost in the noise.

  2. NeARAZ says:

    Mike, I think it’s not a drop in memory usage, it’s a drop in the number of pages that are touched while drawing a page. In other words, just making your data cache friendly – this doesn’t reduce memory usage, but can speedup things by orders of magnitude.

  3. josh says:

    So why do toolbars take horizontal strips?

  4. James Schend says:

    Hm. Personally, I’ve always used vertical animation strips when making animations for my programs… including a couple little games. You make it sound as if the standard was horizontal, and the vertical ones are freakish.

  5. Tony Cox [MS] says:

    I remember back in the days of software rasterizers for games, you saw a similar effect due to L1 cache coherency. If a polygon happened to align on screen such that the source texture data was traversed in order, then performance was measurably better than the same polygon rotated by 90 degrees. On some games we worked on, we considered even going so far as to store two versions of a source texture, and picking the one closest to the alignment of the polygon (we never did this in the end due to the extra memory cost outweighing caching benefits).

    These days, modern GPUs tackle a similar problem. They don’t actually store textures in the obvious memory order, but instead store them with the pixels mashed up in what is usually called a "swizzle" pattern. A swizzle pattern isn’t very intuitive geometrically, but in terms of the texture coordinates it amounts to interleaving the individual bits of the coordinates. If your original texel was at coordinate (X,Y), and the bit representation of X is xxxxxx and Y is yyyyyy, then the actual data is stored at the memory offset given by xyxyxyxyxyxy (interleaving bitwise). The net result of this is that texels which are close to each other in the original image are clustered close to each other in memory, regardless of whether they were close to each other horizontally or vertically, and therefore traversing the texture in any direction is roughly similar in cost. This is a considerable performance win. (Most GPUs are capable of reading plain old linear layouts as well, but the performance is measurably poorer.)

    Software rasterizers could benefit from the same technique, except that the cost of doing the bit interleaving in software usually outweighs the cache coherency benefits (plus, there are relatively few applications for high-performance software rasterization these days, so nobody really takes the time).
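
The bit interleaving described in this comment can be sketched as follows. This is only the plain interleave the comment describes (often called Z-order or Morton order), with six bits per coordinate to match the xxxxxx/yyyyyy example; actual GPU swizzle patterns vary by hardware, and the name `swizzle_offset` is made up for this illustration.

```python
def swizzle_offset(x, y, bits=6):
    """Interleave the bits of x and y as xyxyxy..., with x in the higher bit of each pair."""
    offset = 0
    for i in range(bits):
        offset |= ((x >> i) & 1) << (2 * i + 1)   # bit i of x goes to bit 2i+1
        offset |= ((y >> i) & 1) << (2 * i)       # bit i of y goes to bit 2i
    return offset


# Texels in the same 2x2 block land in four consecutive offsets,
# whether they are neighbors horizontally or vertically.
for y in range(4):
    print([swizzle_offset(x, y) for x in range(4)])
```

Each aligned 2x2 block of texels maps to four consecutive offsets, and each aligned 4x4 block maps to sixteen consecutive offsets, which is the clustering the comment refers to.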

  6. waleri says:

    Alas, image lists use horizontal bitmaps…

  7. Matt says:

    "Again, the pixels are stored in memory in the order 1245678ABCDEFGH, but …"

    I never liked pixel 3 much anyway.

  8. Nick says:

    Sorry this is off topic, but triggered by your reference to "quite a savings".

    In the UK, we make "a saving" – savings go into banks. Does anyone know where the plural crept in across the Atlantic? I’ve always been puzzled by this.

  9. Tony, the software way would be, instead of usual order VVVVVVVVUUUUUUUU, using UUUUVVVVVVVVUUUU bitorder which is very easy to generate on the fly while doing fixed-point interpolation. Look for "fatmap2.txt" for more info :)

    There is also a way to use normal coordinates for textures: swizzle the screen traversal!!! :P

  10. NeARAZ says:

    Antonio/Tony: yes, and in fact old "fast rotozoomers" did exactly that. Don’t draw the screen line by line, but instead draw it in blocks (say, 8×8 pixels).

  11. Sometimes we get so used to things being the way they are we stop questioning them. We always have the…

  12. That’s just the interchange format.



*DISCLAIMER: I DO NOT OWN THIS CONTENT. If you are the owner and would like it removed, please contact me. The content herein is an archived reproduction of entries from Raymond Chen's "Old New Thing" Blog (most recent link is here). It may have slight formatting modifications for consistency and to improve readability.
