Date: | October 6, 2006 / year-entry #341 |
Tags: | code |
Orig Link: | https://blogs.msdn.microsoft.com/oldnewthing/20061006-04/?p=29453 |
Comments: | 14 |
Summary: | Back in Part 6 of the first phase of the "Chinese/English dictionary" series (a series which I intend to get back to someday but somehow that day never arrives), I left an exercise related to the alignment member of the HEADER union. Alignment is one of those issues that people who grew up with a... |
Back in
Part 6 of the first phase of the
"Chinese/English dictionary" series
(a series which I intend to get back to someday but somehow that
day never arrives),
I left an exercise related to the alignment member of the HEADER union.
Alignment is one of those issues that
people who grew up with a forgiving processor architecture tend to ignore.
In this case, the alignment member exists to guarantee that the WCHAR data stored immediately after the header is suitably aligned.
There are many variations on the alignment trick, some of them more effective than others. A common variation is the one-element-array trick:

    struct HEADER {
        HEADER* m_phdrPrev;
        SIZE_T  m_cb;
        WCHAR   m_rgwchData[1];
    };

    // you can also use "offsetof" if you included <stddef.h>
    #define HEADER_SIZE FIELD_OFFSET(HEADER, m_rgwchData)
We would then use HEADER_SIZE instead of sizeof(HEADER) in our size computations.
A common mistake is to use this incorrect definition for HEADER_SIZE:
    #define HEADER_SIZE (sizeof(HEADER) - sizeof(WCHAR)) // wrong
This incorrect
macro inadvertently commits the mistake it is trying to protect against!
There might be (and indeed, will almost certainly be in this instance)
structure padding after m_rgwchData.
It is the "array of one element" that the macro attempts to subtract off, but subtracting sizeof(WCHAR) removes only the array element and not the trailing padding, so the result is larger than the true offset of m_rgwchData.
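To make the discrepancy concrete, here is a minimal sketch (mine, not code from the original series; it repeats the structure so it stands alone, and the numbers in the comments assume a typical 64-bit compiler) that prints both candidate values and uses the correct macro to allocate a chunk:

    // Sketch: prints the two candidate values for HEADER_SIZE and uses the
    // correct one to allocate a header followed by cch characters.
    #include <windows.h>
    #include <stdio.h>
    #include <stdlib.h>

    struct HEADER {
        HEADER* m_phdrPrev;
        SIZE_T  m_cb;
        WCHAR   m_rgwchData[1];
    };

    #define HEADER_SIZE       FIELD_OFFSET(HEADER, m_rgwchData)  // 16 on x64
    #define HEADER_SIZE_WRONG (sizeof(HEADER) - sizeof(WCHAR))   // 24 - 2 = 22 on x64

    int main()
    {
        printf("correct: %u, wrong: %u\n",
               (unsigned)HEADER_SIZE, (unsigned)HEADER_SIZE_WRONG);

        // Allocate room for the header plus 100 characters of data.
        SIZE_T cch = 100;
        HEADER* phdr = (HEADER*)malloc(HEADER_SIZE + cch * sizeof(WCHAR));
        if (phdr) {
            phdr->m_phdrPrev = NULL;
            phdr->m_cb = HEADER_SIZE + cch * sizeof(WCHAR);
            free(phdr);
        }
        return 0;
    }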
Another minor point
was brought up by commenter Dan McCarty:
"Why is |
Comments (14)
I’m rather fond of the SMB structure, which, as far as I can tell, must have been intentionally misaligned.
A typical message starts with a 32-byte header, then a 1-byte ‘word count’, then some number of 2-byte word fields… all odd-aligned.
Alternatively, if you try and arrange that the message starts on an odd boundary so that those word fields are naturally aligned in the address space, then everything in the header will be misaligned. No win, overall.
(Yeah, I know the real reason why: it was designed in Ye Olde Days when saving 8 bits trumped anything else. Anyone who dealt with RAD50 encoding on various PDP11 operating systems remembers the pain of those days.)
The nicest way to do this is to put a zero length array at the end of the structure but, although it’s supported by msvc, gcc and C99, it’s not standard C++.
Seems like a minor point (RE: MIN_CBHUNK == 32000), but isn’t it kind of pointless to add 32000 to a value less than 768 if you’re expecting to round up to a multiple of 32768?
I’m assuming that’s the basis of 32000, with a little breathing room in case something gets added to HEADER. Seems like a “magic number” to me; 32768-sizeof(HEADER) would be more clear that you’re really trying to allocate a minimum of 32768, despite the granularity (and despite it really being 64K making it even more moot).
But it’s wise to leave a bit of extra space for malloc() overhead. If you did "32768 - sizeof(HEADER)" but malloc() reserves, say, 8 bytes before each pointer that it returns to you for bookkeeping purposes (and to keep 8-byte alignment), then each time you allocate one of these on the heap, you’ve just extended 8 bytes into the next 32k chunk for your 32k allocation. Not clever.
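As a rough illustration of the trade-off being discussed, here is a hedged sketch; the 8-byte per-allocation overhead is purely an assumption about a hypothetical allocator, and all of the names are invented for the example:

    // Sketch only: ALLOC_OVERHEAD is a guess about a hypothetical allocator's
    // per-block bookkeeping, not a documented value.
    #include <stddef.h>

    const size_t TARGET_BLOCK   = 32768;   // granularity we hope to stay within
    const size_t ALLOC_OVERHEAD = 8;       // assumed allocator bookkeeping
    const size_t MIN_CBCHUNK    = TARGET_BLOCK - ALLOC_OVERHEAD;

    // Choose a chunk size of at least MIN_CBCHUNK such that the request plus
    // the assumed overhead never spills into an extra TARGET_BLOCK multiple.
    size_t ChooseChunkSize(size_t cbNeeded)
    {
        size_t cb = cbNeeded > MIN_CBCHUNK ? cbNeeded : MIN_CBCHUNK;
        size_t blocks = (cb + ALLOC_OVERHEAD + TARGET_BLOCK - 1) / TARGET_BLOCK;
        return blocks * TARGET_BLOCK - ALLOC_OVERHEAD;
    }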
C99 doesn’t support 0 length arrays, it supports "flexible array members":
http://david.tribble.com/text/cdiffs.htm#C99-fam
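For comparison, here is a sketch of the same kind of header written with a C99 flexible array member (compiled as C, since standard C++ has no equivalent). Note that offsetof is still the safe way to compute the header size, because sizeof can include trailing padding even without the dummy element:

    /* C99, not standard C++: the flexible array member has no declared size. */
    #include <stddef.h>
    #include <stdlib.h>
    #include <wchar.h>

    struct HEADER_FAM {
        struct HEADER_FAM *phdrPrev;
        size_t             cb;
        wchar_t            rgwchData[];   /* flexible array member */
    };

    /* sizeof(struct HEADER_FAM) excludes the array but may still include
       trailing padding, so offsetof remains the safe way to size the header. */
    #define HEADER_FAM_SIZE offsetof(struct HEADER_FAM, rgwchData)

    struct HEADER_FAM *AllocChunk(size_t cch)
    {
        struct HEADER_FAM *p = malloc(HEADER_FAM_SIZE + cch * sizeof(wchar_t));
        if (p) {
            p->phdrPrev = NULL;
            p->cb = HEADER_FAM_SIZE + cch * sizeof(wchar_t);
        }
        return p;
    }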
If you used VB6, it would automatically pad the structure so that 2 byte types have an offset inside the structure equal to a multiple of 2 and 4 byte types have a multiple of 4 offset. This only happens in memory – if you write variables of the type to a file, it takes out the padding.
Dave: The DOS redirector had to run on machines with 256K of RAM. The original redirector was something like 10K of code – an entire network filesystem in 10K. Think about it. The DOS LanMan redirector was something like 45K and BillG screamed at me for something like 20 minutes over that one.
Also, SMB was designed for an 8-bit processor, and on an 8-bit processor alignment is irrelevant.
Actually this has sort of the opposite effect of the union member that you used before. Temporarily ignoring some complicating factors, this saves memory by allowing the array of WCHARs to start at the first WCHAR-aligned location. If the array started after the end of the struct, there would have to be enough padding to match the alignment required by the entire struct. This is because trailing padding has to be enough to make the struct size as if it were an element of an array of the same kind of structs. So for example if the pointer required 8 byte alignment and size_t required 4 byte alignment and wchar_t required 2 byte alignment then the array could start after 0 bytes of padding instead of 4 bytes of padding.
(There are complicating factors because SIZE_T doesn’t have to be size_t, WCHAR doesn’t have to be wchar_t, size_t usually has the same alignment requirements as a pointer and it’s usually at least as strict as wchar_t, etc. These make it harder to see that the example is a possible example of that effect, but that effect still remains possible.)
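A small sketch of the kind of layout being described, using fixed-width types to stand in for the hypothetical mix of pointer/size/character alignments (these are not the real Win32 types):

    // Fixed-width stand-ins for the hypothetical alignments described above.
    #include <cstddef>
    #include <cstdint>
    #include <cstdio>

    struct MixedHeader {
        void*    prev;      // 8-byte alignment on a typical 64-bit target
        uint32_t cb;        // 4 bytes
        char16_t data[1];   // 2 bytes; can start right after cb
    };

    int main()
    {
        // Typical 64-bit result: data begins at offset 12, but sizeof is 16
        // because the whole struct is padded out to 8-byte alignment, so
        // storing the characters "after the struct" would cost 4 extra bytes.
        std::printf("offsetof = %u, sizeof = %u\n",
                    (unsigned)offsetof(MixedHeader, data),
                    (unsigned)sizeof(MixedHeader));
    }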
But is it really worth doing that… There are a number of Win32 APIs that return pointers to structures that are defined in this way. Some APIs can be told to return how much memory is really needed before being told to return the contents. Otherwise I’d probably have got some of these computations wrong too. I haven’t noticed any xxx_SIZE macros for them, and FIELD_OFFSET isn’t exactly standard.
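The "ask for the size first" pattern mentioned here usually looks something like the following sketch, which uses GetTokenInformation as the example API (link with advapi32.lib); error handling is abbreviated:

    // Sketch of the two-call pattern for variably-sized Win32 structures.
    #include <windows.h>
    #include <stdio.h>

    int main()
    {
        HANDLE hToken;
        if (!OpenProcessToken(GetCurrentProcess(), TOKEN_QUERY, &hToken))
            return 1;

        // First call: ask how many bytes the variably-sized structure needs.
        DWORD cb = 0;
        GetTokenInformation(hToken, TokenUser, NULL, 0, &cb);
        if (GetLastError() != ERROR_INSUFFICIENT_BUFFER) {
            CloseHandle(hToken);
            return 1;
        }

        // Second call: allocate exactly that much and fetch the contents.
        TOKEN_USER* ptu = (TOKEN_USER*)LocalAlloc(LMEM_FIXED, cb);
        if (ptu && GetTokenInformation(hToken, TokenUser, ptu, cb, &cb)) {
            printf("got %lu bytes of TOKEN_USER data\n", cb);
        }
        LocalFree(ptu);  // LocalFree(NULL) is harmless
        CloseHandle(hToken);
        return 0;
    }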
On the other hand, deliberate misalignment in some kinds of data structures is pretty reasonable. Until recently it would take more time to transmit one byte of padding over a network than to do a memory-to-memory move of a buffer to realign a bunch of contents. It still likely takes more time to read a few disk blocks full of padding than to do memory-to-memory moves to unpack their contents.
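When the wire or disk format is deliberately unaligned, the usual unpacking trick is to memcpy each field out rather than dereference a misaligned pointer. The sketch below merely imitates the count-then-words framing mentioned earlier; it is not the real SMB layout:

    // Copy a misaligned on-the-wire field out with memcpy instead of
    // dereferencing a misaligned pointer.
    #include <cstdint>
    #include <cstring>
    #include <cstdio>

    // Returns the i-th 2-byte word that follows a 1-byte count at 'payload'.
    static uint16_t ReadWord(const unsigned char* payload, size_t i)
    {
        uint16_t w;
        std::memcpy(&w, payload + 1 + 2 * i, sizeof(w));  // safe even if misaligned
        return w;   // note: still in the sender's byte order
    }

    int main()
    {
        const unsigned char packet[] = { 2, 0x34, 0x12, 0x78, 0x56 };  // count = 2
        std::printf("%04x %04x\n", ReadWord(packet, 0), ReadWord(packet, 1));
    }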
That’s a very difficult question. Of course no one wants you to write book-quality code without being paid for it. Some of us hate slave labour even when we’re not writing books. But notice how much of MSDN’s content is still, um, book-quality when we remember that quite a lot of books are atrociously poor quality too. You’ve already mentioned that readers sometimes have to copy code out of MSDN without understanding it. Surely there are people who have to copy code out of your blog because MSDN’s code is too garbagy and some of your articles provide fixes. So this is a tough question.
Maybe if your company could be persuaded to hire some competent programmers to fix MSDN articles, there would be less need for book-quality code in blogs. (But if this means that competent programmers would be pulled off of Vista then don’t do it. Vista still needs a few more years of work by competent programmers before it will be ready for release.)
Norman, just out of curiosity, which IT companies get your tick of approval, or are they all the same in your book?
Monday, October 09, 2006 9:32 AM by steveg
In the current environment that’s pretty difficult to answer. There still exist some companies that accept bug reports without requiring paid support incidents to be opened first. There still exist some companies that replace defective products with working ones. But in the current environment this pretty much happens only with hardware defects. For example one vendor replaced an entire notebook PC because the video chip was defective but they didn’t offer to replace Windows 95 by Windows NT4 SP3.
In ancient history hardware vendors often supplied their own operating systems. Some of them accepted bug reports without requiring paid support incidents to be opened first. Some were glad to make fixes. Some of them were glad to deliver fixes. That era is gone now. For those of us who remember that era, we don’t even make an active decision to compare it to the present, it just comes automatically.
Nonetheless I think everyone knows that MSDN’s sample code still needs a lot of fixing. Some of the text too. Even Mr. Chen has said so in the past. In my particular sentence that you quoted, my point was that a tech writer with knowledge of English isn’t enough, it’s necessary to fix the code too.
Once upon a time it was possible to answer some questions with "RTFM". When TFM is broken that isn’t a valid answer any more.
Sorry for two in a row, but I’ve just read that there was a period where Microsoft thought differently about quality.
http://joelonsoftware.com/articles/fog0000000043.html
The adoption of that methodology and the return to the former methodology must have occurred during a pretty short time interval. I wonder why it didn’t stick?
Here’s an example of MSDN code which would benefit from being replaced by textbook quality code.
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/memory/base/reserving_and_committing_memory.asp
This is one example that would benefit from being fixed by someone competent at Win32 programming as well as English. Notice the use of LPTSTR variables. Notice how little effort will be needed to make it compile in a Unicode environment such as the default in Visual Studio 2005: it’s only necessary to wrap some strings in _T() macros and leave some other strings unwrapped. Notice that the resulting compiled program will still yield incorrect results.
> Why not use the feedback link at the bottom
> of the MSDN page?
The last time I did that, Microsoft sent a polite response saying that your company had received some headers from my submission but had tossed everything that I typed into the input controls in the feedback form.
The previous two times I did that, Microsoft sent responses saying that I had purchased the web site http://msdn.microsoft.com/library outside of North America and therefore only Microsoft Japan would be able to support the English-language MSDN library.
(Hmmmm. If MSDN were fixed and if programmers relied on MSDN then there wouldn’t be enough appcompat work to do any more. Then would Microsoft allow the same bug fixing talent to be applied to Windows itself or would … I don’t want to think about it.)
> Perhaps I should just create a “Norman
> Diamond complains about MSDN” thread
Don’t bother. Some time ago I gained the impression that someone at Microsoft was interested in getting bugs fixed in MSDN, but I should have known better.
Maybe Microsoft is dogfooding from MSDN as it does with Visual Studio on Vista. Maybe we can see what kind of code gets into Windows. Don’t touch a thing, just let it remain visible.
One thing I still can’t figure out though. When programmers outside of Microsoft read MSDN, should we obey the contract or just ignore it? Sometimes your blog contradicts MSDN, but no sensible person wants you to do slave labour converting everything to textbook-quality code. So what should we do, just ignore MSDN and join those who never read it?