Date: | July 25, 2006 / year-entry #248 |
Tags: | other |
Orig Link: | https://blogs.msdn.microsoft.com/oldnewthing/20060725-00/?p=30383 |
Comments: | 16 |
Summary: | I noted last time that you can concoct situations that force the creation of a stub for an imported function. For example, if you declare a global function pointer variable: DWORD (WINAPI *g_pGetVersion)() = GetVersion; then the C compiler is forced to generate the stub and assign the address of the stub to the g_pGetVersion... |
I noted last time that you can concoct situations that force the creation of a stub for an imported function. For example, if you declare a global function pointer variable:

    DWORD (WINAPI *g_pGetVersion)() = GetVersion;

then the C compiler is forced to generate the stub and assign
the address of the stub to the g_pGetVersion variable.
The C++ compiler, on the other hand, can take advantage of some
C++ magic and secretly generate a "pseudo global constructor"
(I just made up that term so don't go around using it like it's
official or something)
that copies the value from the imported function address table
to the g_pGetVersion variable at run time. Consider the following two files:

    // file1.cpp
    #include <windows.h>

    EXTERN_C DWORD (WINAPI *g_pGetVersion)();

    class Oops {
    public:
        Oops() { g_pGetVersion(); }
    } g_oops;

    int __cdecl main(int argc, char **argv)
    {
        return 0;
    }

    // file2.cpp
    #include <windows.h>

    EXTERN_C DWORD (WINAPI *g_pGetVersion)() = GetVersion;
The rule for C++ construction of global objects is that global
objects within a single translation unit are constructed in the
order they are declared (and destructed in reverse order),
but there is no enforced order for global objects from separate
translation units.
But notice that there is an order-of-construction dependency
here.
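The construction of the g_oops object requires that the
g_pGetVersion variable already be initialized,
because the constructor calls through that pointer.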
It so happens that the Microsoft linker constructs global
objects in the order in which the corresponding OBJ files are
listed in the linker's command line.
(I don't know whether this is guaranteed behavior or merely
an implementation detail, so I wouldn't rely on it.)
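To see what the g_oops constructor runs into when it goes first, here is a single-file sketch of the same shape. It is not the article's two-file program: FakeGetVersion, LoadFromImportTable, and g_importSlot are hypothetical stand-ins for the imported function and the "pseudo global constructor", and the constructor only checks the pointer rather than calling through it, so the sketch runs anywhere instead of crashing.

```cpp
#include <cassert>

// Hypothetical stand-in for an imported function such as GetVersion.
unsigned long FakeGetVersion() { return 0x0A28; }

using VersionFn = unsigned long (*)();

// Reading a volatile keeps the compiler from folding the lookup into a
// compile-time constant, so g_pGetVersion really is initialized at run time,
// the way the article describes for the C++ compiler.
volatile int g_importSlot = 0;
VersionFn LoadFromImportTable() {
    return g_importSlot == 0 ? &FakeGetVersion : nullptr;
}

extern VersionFn g_pGetVersion;  // defined further down, playing the role of file2.cpp

struct Oops {
    bool sawNullPointer;
    // Unlike the article's constructor, this one checks instead of calling.
    Oops() : sawNullPointer(g_pGetVersion == nullptr) {}
};

Oops g_oops;  // constructed first: it is declared before the pointer's definition

VersionFn g_pGetVersion = LoadFromImportTable();  // dynamic initializer runs second

// True when the constructor saw a null pointer that has since been filled in.
bool ConstructorRanTooEarly() {
    assert(g_pGetVersion != nullptr);  // by the time anyone calls this, it is set
    return g_oops.sawNullPointer;
}
```

Calling ConstructorRanTooEarly() from main returns true: the pointer was still zero when the constructor ran, and had the constructor called through it, as the article's Oops does, the program would have crashed.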
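Consequently,
if you tell the linker to link file1.obj before file2.obj,
the g_oops constructor runs before g_pGetVersion has been set,
and the program crashes calling through a null pointer.
If you list the files in the opposite order, the program runs fine.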
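Even stranger: If you rename file2.cpp to file2.c,
the file is compiled by the C compiler, which generates the stub
and initializes g_pGetVersion statically,
so the program runs fine regardless of the link order.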
But what happens if you mess up and declare a function as
Comments (16)
The linker doesn’t really know anything about constructing C++ global objects. All it does is link together sections of the .obj files to form the same-named sections of the final file. One of its special features is that if a section name in the .obj has a $ sign in it, the linker sorts the contributions by the text appearing after the $ and combines them to form the section named by the text to the left of the $. So, if sections named .CRT$XCA, .CRT$XCU, and .CRT$XCZ exist anywhere in the link job, they are sorted so that XCA comes first, then XCU, then XCZ. The order of contributions with identical trailers (e.g. if multiple .obj files have .CRT$XCU sections) is not specified.
I’m not explaining this well.
Anyway, in crt0init.c, you can see that the C run-time library declares global variables __xc_a in the .CRT$XCA section and __xc_z in the .CRT$XCZ section. Then there’s a linker directive to tell it to merge the .CRT section into the .data section. If you use a global object with a constructor, the compiler generates a .CRT$XCU section containing a pointer to that constructor. The linker’s magic with the $ sections causes a function pointer table to be constructed.
In [w]{Win|Dll}MainCRTStartup, there’s a call to cinit, which is implemented in crt0dat.c. This calls _initterm, which simply iterates through the table calling every function pointer. Theoretically, if there were CRT global objects that needed construction before being used by user global objects, they could be given a letter earlier than U in the alphabet. In practice, there aren’t any (at least in VC6). This magic is used (with $XI) to initialize and clean up the standard I/O library, for example.
The fact that this executes in DllMainCRTStartup means that you have to be careful of the loader lock in any global object (with a constructor) created in a DLL. It’s best not to use them.
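The mechanism described in this comment can be modeled in portable C++. This is only a sketch of the idea, not the CRT's or the linker's actual code: SectionPiece, MergeSections, RunInitTable, and the three recording functions are hypothetical names, .CRT$XCL is just an arbitrary "letter earlier than U", and skipping null entries is an assumption of the sketch.

```cpp
#include <algorithm>
#include <string>
#include <vector>

using InitFn = void (*)();

// One contribution from one .obj file: a section name plus its function pointers.
struct SectionPiece {
    std::string name;             // e.g. ".CRT$XCU"
    std::vector<InitFn> entries;  // nullptr stands in for the __xc_a / __xc_z markers
};

// The text after the '$', which is what the pieces are ordered by.
std::string Suffix(const std::string& name) {
    auto pos = name.find('$');
    return pos == std::string::npos ? std::string() : name.substr(pos + 1);
}

// Models the linker: order the pieces by suffix and concatenate them into one table.
// stable_sort keeps input order for equal suffixes; per the comment above, the real
// linker does not specify the order in that case.
std::vector<InitFn> MergeSections(std::vector<SectionPiece> pieces) {
    std::stable_sort(pieces.begin(), pieces.end(),
        [](const SectionPiece& a, const SectionPiece& b) {
            return Suffix(a.name) < Suffix(b.name);
        });
    std::vector<InitFn> table;
    for (const auto& piece : pieces)
        table.insert(table.end(), piece.entries.begin(), piece.entries.end());
    return table;
}

// Models what the comment says _initterm does: walk the table, call each entry.
void RunInitTable(const std::vector<InitFn>& table) {
    for (InitFn fn : table)
        if (fn) fn();  // skipping null markers is an assumption of this sketch
}

// Hypothetical "constructors" that record the order in which they ran.
std::string g_log;
void ConstructA()   { g_log += 'A'; }
void ConstructB()   { g_log += 'B'; }
void EarlyCrtInit() { g_log += 'I'; }
```

Feeding MergeSections the pieces .CRT$XCZ (marker), .CRT$XCU (ConstructA), .CRT$XCA (marker), .CRT$XCU (ConstructB), and .CRT$XCL (EarlyCrtInit), in that order, yields a table that starts and ends with the markers, and RunInitTable calls EarlyCrtInit before either XCU entry. In this model the two XCU entries keep their input order, which matches the article's observation about OBJ order on the command line but is exactly the part nobody promises.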
Fascinating as always! However I didn’t think Microsoft shipped anything other than a C++ compiler these days? Is the mention of a C compiler just a historical detail or is there actually a Microsoft pure C compiler still available somewhere?
If you name your file .c rather than .cpp it runs the C compiler rather than the C++ compiler.
Cool! Thanks Brian.
Is there a way to force the compiler to use a stub instead of creating a global constructor?
Do LoadLibrary and GetProcAddress have any thread affinity at all?
I replaced a piece of code that loads msmapi32.dll dynamically as soon as my own dll is loaded (you cannot statically link to it since it can live in several places, none of them in the search path) with something like the following (Delphi):
var _MAPIInitialize : pointer = nil;

function MAPIInitialize;
begin
  GetMAPIProcedureAddress(_MAPIInitialize, 'MAPIInitialize');
  asm
    mov esp, ebp
    pop ebp
    jmp [_MAPIInitialize]
  end;
end;
GetMAPIProcedureAddress() would load msmapi32.dll if necessary and call GetProcAddress if GetProcAddress has never been called for the given function.
In other words, the dll would get loaded as soon as an attempt is made to call any of its functions.
This works perfectly 99.9% of the time, but as soon as this code is used in a multithreaded environment, weird things start happening: some time after the code runs and loads msmapi32.dll on a secondary thread (which later terminates, but my dll stays up), I get access violations either in msmapi32.dll itself or in one of the dlls that also use msmapi32.dll (such as mspst32.dll). This includes both true multithreaded applications written in C++ and the VB IDE, which runs the code in its own address space when debugging.
Yes, I do wrap the code that calls LoadLibrary and GetProcAddress in critical sections… The crashes occur seemingly out of the blue and I cannot see the call stack…
GetProcAddress probably doesn’t have any thread affinity, but MAPI is probably making some assumptions about the thread that is used to initialize it. Also, MAPIInitialize takes a parameter which you aren’t passing in your call. The documentation for MAPIInitialize specifies how the parameter should be initialized for multithreaded programs.
Don’t play these tricks just to save an extra "ret" instruction. Call MAPIInitialize according to the documentation and things will work a lot better.
Dmitry Streblechenko: (shrug) Send/PostMessage to your main thread and let it do the work. Sounds like you’ve spent enuf time on this one.
Andrew, Brian: The C++ and C compiler share a lot of the same code, the difference is that for modules compiled as C, C1.DLL is used to produce a parse tree, while for C++, C1XX.DLL is used. As you say, normally for .c files the C compiler is used while for .cpp, .cxx files the C++ compiler is used; however, you can override this behaviour by using the /Tc, /TC, /Tp, /TP switches (respectively ‘compile this file as C’, ‘compile all files as C’, ‘compile this file as C++’, ‘compile all files as C++’).
Doug,
MAPIInitialize is still called on the same thread (and it must be called on every thread that uses MAPI, much like CoInit); it is a question of when LoadLibrary() is called, which apparently makes a difference.
As for the missing parameter, it is there – just a Delphi shortcut: it lets you omit the parameter list in the implementation section if the function declaration in the interface section lists the parameters:
function MAPIInitialize(lpMapiInit : Pointer) : HResult; stdcall;
Steve,
I don’t have a main thread – my code is in a COM dll which is called by other executables; so main thread is a relative term here – there might be a main thread MAPI-wise (that does all the MAPI related work) as opposed to the main UI thread.
It’s not clear to me how the behavior you describe conforms to the C++ standard. From my reading, g_pGetVersion is required to be initialized before any dynamic initialization takes place:
EXTERN_C DWORD (WINAPI *g_pGetVersion)() = GetVersion;
This is a non-local static of POD type that is initialized by an address constant expression. g_pGetVersion is therefore initialized before g_oops is constructed.
Unless I’m misreading the standard, you’ve just described a very interesting compiler bug.
Thank you for the very interesting post.
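The rule this comment appeals to can be seen with an ordinary, non-imported function. In the sketch below, LocalGetVersion, g_pLocal, and Early are hypothetical names; the pointer is initialized from a function address, which is a constant expression, so it is initialized statically and is already set when the earlier-declared object's constructor runs. The article's point is that a dllimport function behaves differently with the Microsoft C++ compiler, and this sketch does not model that.

```cpp
// Hypothetical ordinary (non-imported) function; its address is a constant expression.
unsigned long LocalGetVersion() { return 7; }

using VersionFn = unsigned long (*)();

extern VersionFn g_pLocal;  // defined after g_early on purpose

struct Early {
    bool pointerWasReady;
    unsigned long value;
    // Members are initialized in declaration order: the flag first, then the value.
    Early() : pointerWasReady(g_pLocal != nullptr),
              value(pointerWasReady ? g_pLocal() : 0) {}
};

Early g_early;  // dynamic initialization: a constructor has to run

// Constant initialization: performed before any dynamic initialization,
// regardless of where the definition appears in the file.
VersionFn g_pLocal = LocalGetVersion;
```

Here g_early.pointerWasReady ends up true and g_early.value is 7, even though g_early is declared first. That is the behavior the commenter expected for g_pGetVersion as well.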
"Interesting point. But whether it’s conforming or not, it’s what happens, and your choices are either to accommodate this behavior or to vote with your wallet and buy a different compiler. I tend to discuss things as they actually are rather than how they ought to be in an ideal world, because it turns out we don’t live in an ideal world. -Raymond"
Raymond, just be glad you’re not me. I have to sit next to this language lawyer all day long!
To Todd Greer:
(1) Is the initializer really considered to be a constant when its value comes from an extern declaration? The external definition is not part of this translation unit.
(2) When a translation unit contains a #pragma, whether or not brought about from doing #include <windows.h>, doesn’t the standard recuse itself entirely from defining the meaning (or absence thereof) of a program?
(3) If nonetheless it’s a compiler bug, you’ll be glad to know that this isn’t the only place where you can report it without paying a fee. Visual Studio is different from Windows. For Visual Studio you can go to the following web site, report a bug, and get a “won’t fix” resolution for free:
http://connect.microsoft.com/feedback/default.aspx?SiteID=210
Indeed, __declspec is key here. As this is a documented MS extension, it is to be expected that it changes the rules regarding what it modifies.
Is this particular way in which it changes the rules documented anywhere (other than here)? I was unable to find any such documentation. I did find http://msdn2.microsoft.com/en-us/library/twa2aw10.aspx, which mentions assigning the address of a dllimport function to a global or static variable, but it did not mention this caveat.
It would appear that I should still report a bug, but that it is a documentation bug.
Norman,
(1) Whether the declaration is from this translation unit or not is irrelevant.
(2) Good point. A #pragma causes the implementation to behave in an implementation-defined manner, but the standard does require that the implementation-defined behavior be documented. As Raymond pointed out though, you’ve more or less hit it with the point that it involves a Microsoft extension.
(3) Thank you. I didn’t know that. I’ll file a documentation bug there.