Date: | August 15, 2007 / year-entry #299 |
Tags: | code |
Orig Link: | https://blogs.msdn.microsoft.com/oldnewthing/20070815-00/?p=25573 |
Comments: | 55 |
Summary: | One of the differences between C++ and C# is when static constructors run. In C++, static constructors are the first thing run in a module, even before the DllMain function runs.¹ In C#, however, static constructors don't run until you use the class for the first time. If your static constructor has side effects, you... |
One of the differences between C++ and C# is when static constructors run.
In C++, static constructors are the first thing run in a module,
even before the DllMain function runs.¹
In C#, however, static constructors don't run until you use the class for the first time.

Consider the following program. It's rather contrived and artificial, but it's based on an actual program that encountered the same problem.

    using System;
    using System.Runtime.InteropServices;

    class Program {
      [DllImport("kernel32.dll", SetLastError=true)]
      public static extern bool SetEvent(IntPtr hEvent);

      public static void Main()
      {
        if (!SetEvent(IntPtr.Zero)) {
          System.Console.WriteLine("Error: {0}", Trace.GetLastErrorFriendlyName());
        }
      }
    }
This program tries to set an invalid event, so the call to SetEvent fails.
We then print the last error code by using a helper in the Trace class, which converts the error code to a (not very friendly) string.²

    class Trace {
      public static string GetLastErrorFriendlyName()
      {
        return Marshal.GetLastWin32Error().ToString();
      }
    }

Run this program, and you should get this output:

    Error: 6

Six is the expected error code, since that is the numeric value of ERROR_INVALID_HANDLE.

You don't think much of this program until one day you run it and instead of getting error 6, you get something like this:

    Error: 126

What happened?
While you weren't paying attention, somebody decided to do
some enhancements to the Trace class, and one of them added a static constructor:

    class Trace {
      public static string GetLastErrorFriendlyName()
      {
        return Marshal.GetLastWin32Error().ToString();
      }

      [DllImport("kernel32.dll", SetLastError=true, CharSet=CharSet.Auto)]
      public static extern IntPtr LoadLibrary(string dll);

      static Trace()
      {
        LoadLibrary("enhanced_logging.dll");
      }
    }

It's not important what the static constructor does; the point is that we have a static constructor now. In this case, the static constructor tries to load a helper DLL, which presumably does something fancy so we can get better trace logging or something like that. The details aren't important.
The important thing is that the constructor has a side effect.
Since it uses a p/invoke, the value of Marshal.GetLastWin32Error() is overwritten by the error code from LoadLibrary, which in our case is 126, ERROR_MOD_NOT_FOUND.
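Now let's look at what happens in our program.
First, we call SetEvent, which fails and sets the last error code to 6.
Next, we call Trace.GetLastErrorFriendlyName, but this is the first use of the Trace class, so the static constructor has to run first.
The static constructor tries to load the enhanced_logging.dll module, which fails and sets the last error code to 126, overwriting the previous value.
After the static constructor returns, we return to our program already
in progress and call GetLastErrorFriendlyName, but by then the last error code has already been overwritten.

And that's why we get 126 instead of 6.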
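The sequence above can be reproduced without any p/invoke. What follows is a minimal sketch, not the original program: `FakeWin32` is a hypothetical stand-in for the per-thread last-error value, and `Describe`, `ReadThroughTrace`, and `CaptureFirst` are names invented for illustration. It also shows one possible defensive habit: capture the error code in a local variable before touching any other class.

```csharp
using System;

// Hypothetical stand-in for the per-thread Win32 last-error value (not a real API).
static class FakeWin32
{
    [ThreadStatic] private static int lastError;

    public static int GetLastError() { return lastError; }

    // Simulates SetEvent on an invalid handle: fails with error 6.
    public static bool SetEvent() { lastError = 6; return false; }

    // Simulates LoadLibrary on a missing DLL: fails with error 126.
    public static bool LoadLibrary(string dll) { lastError = 126; return false; }
}

class Trace
{
    // Runs on the first use of Trace, not at program startup.
    static Trace() { FakeWin32.LoadLibrary("enhanced_logging.dll"); }

    public static string GetLastErrorFriendlyName()
    {
        return FakeWin32.GetLastError().ToString();
    }

    // Hypothetical variant that takes the error code as a parameter.
    public static string Describe(int error) { return error.ToString(); }
}

class LastErrorDemo
{
    // Reads the error through Trace. If this is the first use of Trace,
    // the static constructor runs first and overwrites the error.
    public static string ReadThroughTrace()
    {
        FakeWin32.SetEvent();
        return Trace.GetLastErrorFriendlyName();
    }

    // Captures the error before any other class gets a chance to run code.
    public static string CaptureFirst()
    {
        FakeWin32.SetEvent();
        int error = FakeWin32.GetLastError();
        return Trace.Describe(error);
    }

    public static void Main()
    {
        Console.WriteLine("Error: {0}", ReadThroughTrace()); // first use of Trace: prints "Error: 126"
        Console.WriteLine("Error: {0}", CaptureFirst());     // prints "Error: 6"
    }
}
```

Note that only the first call into Trace is affected. Once the static constructor has run, `ReadThroughTrace` would report 6 as well, which is part of what makes the real bug so hard to reproduce.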
What's really scary is that problems with static constructors running
at inopportune times are often extremely hard to identify.
For one thing, there is no explicit indication in the source code that
there's any static constructor funny business going on.
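Indeed, somebody could just recompile the assembly containing the Trace class with a new static constructor, and your program would start misbehaving even though none of your own source code changed.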
A side effect you might not consider is synchronization. If the static constructor takes any locks, you have to keep an eye on your lock hierarchy, or one of those locks might trigger a deadlock. This is insidious, because you can stare at the code all you want; you won't see anything. You'll have a method like

    class Trace {
      ...
      public static string GetFavoriteColor() { return "blue"; }
    }

and yet when you try to step over a call to Trace.GetFavoriteColor, your program hangs, because the static constructor runs first and takes its locks.
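To see how a method that just returns a constant can do more than its source suggests, here is a small sketch. The `Journal` class and the `FirstUseDemo` driver are hypothetical helpers, and the static constructor merely records that it ran, standing in for one that takes locks.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper that records the order in which things happen.
static class Journal
{
    public static readonly List<string> Entries = new List<string>();
}

class Trace
{
    // Stand-in for a static constructor that takes locks or does other work.
    static Trace() { Journal.Entries.Add("Trace static constructor"); }

    public static string GetFavoriteColor() { return "blue"; }
}

class FirstUseDemo
{
    public static void CallTwice()
    {
        Journal.Entries.Add("before call");
        Trace.GetFavoriteColor();   // first use of Trace triggers the static constructor
        Journal.Entries.Add("between calls");
        Trace.GetFavoriteColor();   // constructor already ran; nothing extra happens
        Journal.Entries.Add("after calls");
    }

    public static void Main()
    {
        CallTwice();
        foreach (string entry in Journal.Entries)
        {
            Console.WriteLine(entry);
        }
    }
}
```

In a fresh process the journal reads "before call", "Trace static constructor", "between calls", "after calls". The constructor's work lands inside the first call to `GetFavoriteColor` and nowhere else.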
Another factor that makes this problem baffling is that
the problem occurs only the first time you call
a method in the Trace class. Every later call finds the static constructor already run, so the side effect never repeats.

"I'm sorry, did I call you at a bad time?"

Footnotes³

¹This is not strictly true. In reality, it's a bit of sleight-of-hand performed by the C runtime library.⁴

²For a less ugly name, you can use this class instead:

    class Trace {
      [DllImport("kernel32.dll", SetLastError=true)]
      public static extern IntPtr LocalFree(IntPtr hlocal);

      [DllImport("kernel32.dll", SetLastError=true, CharSet=CharSet.Auto)]
      public static extern int FormatMessage(int flags, IntPtr unused1,
          int error, int unused2, ref IntPtr result, int size, IntPtr unused3);

      static int FORMAT_MESSAGE_ALLOCATE_BUFFER = 0x00000100;
      static int FORMAT_MESSAGE_IGNORE_INSERTS  = 0x00000200;
      static int FORMAT_MESSAGE_FROM_SYSTEM     = 0x00001000;

      public static string GetLastErrorFriendlyName()
      {
        string result = null;
        IntPtr str = IntPtr.Zero;
        // The system allocates the message buffer; we must free it with LocalFree.
        if (FormatMessage(FORMAT_MESSAGE_ALLOCATE_BUFFER |
                          FORMAT_MESSAGE_IGNORE_INSERTS |
                          FORMAT_MESSAGE_FROM_SYSTEM,
                          IntPtr.Zero, Marshal.GetLastWin32Error(), 0,
                          ref str, 0, IntPtr.Zero) > 0) {
          try {
            result = Marshal.PtrToStringAuto(str);
          } finally {
            LocalFree(str);
          }
        }
        return result;
      }
    }

Note that there may be better ways of accomplishing this. I'm not the expert here.

³Boring footnote symbols from now on. You guys sure know how to take the fun out of blogging. (I didn't realize that blogs were held to academic writing standards. Silly me.) Now you can go spend your time telling Scoble that he wrote a run-on sentence or something.

⁴Although this statement is written as if it were a fact, it is actually my interpretation of how the C runtime works and is not an official position of the Visual Studio team nor Microsoft Corporation, and that interpretation may ultimately prove incorrect. Similar remarks apply to other statements of fact in this article.

Postscript: Before you start pointing fingers and saying, "Hah hah, we don't have this problem in Win32!" it turns out that you do! As we noted in the introduction, static constructors run when the DLL is loaded.
The granularity in Win32 is not as fine, being at the module level rather than the class level, but the problem is still there. If you use delay-loading, then the first call to a function in a delay-loaded DLL will load the target DLL, and its static constructors will run, possibly when your program wasn't expecting it.
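If the moment at which a C# static constructor runs matters to you, one option (not something this article prescribes) is to trigger it yourself at a known point, such as startup. The sketch below uses `RuntimeHelpers.RunClassConstructor` from `System.Runtime.CompilerServices`. The `Journal` class and the `EagerInitDemo` driver are hypothetical, and the static constructor only records that it ran.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Hypothetical helper that records the order in which things happen.
static class Journal
{
    public static readonly List<string> Entries = new List<string>();
}

class Trace
{
    static Trace() { Journal.Entries.Add("Trace static constructor"); }

    public static string GetFavoriteColor() { return "blue"; }
}

class EagerInitDemo
{
    // Runs Trace's static constructor now if it has not run yet;
    // if it has already run, this does nothing.
    public static void Warmup()
    {
        RuntimeHelpers.RunClassConstructor(typeof(Trace).TypeHandle);
    }

    public static void Main()
    {
        Warmup();
        Journal.Entries.Add("startup finished");
        Trace.GetFavoriteColor();   // no static constructor work happens here
        foreach (string entry in Journal.Entries)
        {
            Console.WriteLine(entry);
        }
    }
}
```

With the warm-up in place, the journal shows the static constructor entry before "startup finished". The trade-off is that you pay for the constructor even if the class is never used afterwards.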
Comments (55)
return new System.ComponentModel.Win32Exception().Message
> Boring footnote symbols from now on. You guys sure know how to take the fun out of blogging.
Apparently nobody noticed or complained when you used normal
footnote symbols yesterday. Why complain _today_ about your readers, or
get back to that topic anyway? (Note: that was a rhetorical question)
Numbered footnote marks are cool.
what .net needs, is more cowbell.
plenty of cowbell in System.Music.Instruments.Awesome namespace
How about a C# example without mucking it up with DllImports? Certainly you could display the effects of static constructors without needing to call all of those external methods.
Wow, just yesterday Eric Lippert blogged about the horrors of uncertainty in C++ (e.g., the result of ++i + i++ being undefined). Yet in C# there is uncertainty about the time of construction of a class–oh, and in the destruction as well, since it’s not cleaned up when you exit any scope but left to GC. Those issues seem much more horrifying and difficult to track down than sequence point problems in C++ expressions.
While I totally understand the reason you’ve needed to be complete, formal, and covering all the bases lately, I’m a bit sad that it’s come to that. It was more fun for everyone before the nitpickers and literalists made you cover everything!
Your blog is great reading! Thanks!
To paraphrase a motto from an online game that shall remain nameless:
“Write what you want and ignore the complaints.”
Or is it a requirement from MS legal department that you address every single issue?
This is why we can’t have nice things.
I love your recurring, albeit yearly, .NET stuff. Keep rocking, Raymond!
@Dave:
There’s no uncertainty about when static constructors execute. They execute the first time a class is used. The problem is that a lot of people don’t understand this, and they assume that the constructor gets run when the assembly is loaded.
I used to feel the same way you do about nondeterministic finalization (coming from a C++ background), but then I realized that it’s better that way. It means that the GC can optimize the finalization order. Additionally, there is a mechanism (the using keyword) in place that allows the programmer to explicitly control finalization when necessary.
I like fancy footnote symbols. They add to your uniqueness. Fight the man. Be passive aggressive against those who would push you into the mold of conformity. They are the ones that don’t realize what makes you a brilliant programmer is that your thinking is a bit "skewed" from normal. The crazy footnotes are just an indication of your true brilliance and ability to see the world from a different perspective.
And the finalizers are called by the GC, which means you don’t have to worry about them altering state, as in C++.
Nick, if you’re writing a robust program that you don’t want to end up stuck in the finalizer thread, you have to call Dispose (or use ‘using’) all over the place. Every time. With every resource that’s IDisposable.
Don’t have to worry about finalizers altering state? Only if the author of the finalizer is very rigorous. The level of discipline is no different from C++.
Some versions of FxCop could tell you when you hadn’t called Dispose on an object that implemented IDisposable going out of scope, but I understand they have removed/are removing the engine support for it as it was too hard to get right.
Best practice now is to only use a finalizer on an object that holds a handle to an unmanaged resource, and that should be the only purpose and responsibility of that class. There’s now a SafeHandle class in .NET 2.0 that gets you a long way towards your own implementation.
C++/CLI automatically calls Dispose at the end of an object’s scope, as if it were a destructor – no using blocks required. It maps ~Class() to Dispose(). If you really need a finalizer in C++/CLI, you write a !Class() method.
Well, that was a bit of disturbing information. Not having checked when C# static constructors get called, I sort of naively assumed they behaved like the C++ equivalents.
I wouldn’t say the C# behavior is a problem, not now that it’s been brought to my attention, but yeah, that was a bad bug just waiting to happen…
Thanks Raymond, I think that gem of info probably saved me a week of my life sometime down the road. :>
If nitpickers told you to jump off a bridge, would you do it?
One of the things I learned from this post and the comments is that it is very difficult being Raymond Chen.
I agree with Raymond regarding the footnotes and lament the loss of "character and whimsy". I really enjoy it when Raymond adds a little character and whimsy to his post!
“You guys sure know how to take the fun out of blogging.”
I am a little too tired of the constant whining you do about comments. When are you going to realize that it is not only ridiculous to try to please every random person who writes a comment here, but mathematically impossible?
Do you really think that more than 2 people in the universe really care what kind of footnote markers you use? And why should anyone else’s opinion (on a trivial style preference) be of any concern to you?
Numbered footnotes using Roman numerals would be both cool and whimsical.
When footnotes are numbered, they are so much easier to look up compared to oddly-shaped ones. Everyone likes binary search.
I don’t know a dang thing about programming ’puters. I’m here as a footnoting style enthusiast.
Keep up the good work!
Maybe for those of us who read Raymond’s technical details, a blog devoid of nitpickers, this is ridiculous: 9 of 23 comments have nothing to do with the technical details…
I’m not a fan of C#, there’s something smelly about it. Having said that I only really know what I learn here!!!! (Probably ought to go somewhere where I am likely to get a "Pro C#" picture! :)
Nevertheless, I can think of a scenario where this "on demand" approach has its advantages.
If you have a number of static objects greater than one, and one needs to reference one from another, then the order of initialisation can be wrong.
I’m not sure if there is a way to specify to the linker (I assume) what order any objects at global scope should be constructed, but when I have this problem I usually just frig the build. It’s not a regularly occurring problem to say the least, but it can be painful to find if you’ve not seen the code before, and is (I think) specific to the C++ approach. It would seem from this article that one at least has some mechanism of natural control over the order of initialisation.
Surely these are the two ends of the same problem?
Having said that, I think I prefer the C++ approach, merely because the number of instances of the problem is minimized by writing clean well structured code.
On the face of it, writing clean structured C#, would appear to introduce this problem all over the place.
Then again, I’m probably wrong! :)
As I recall, the equivalent feature in Java – the static block – works the same way; it gets run the first time the class is used. This isn’t surprising, since C# has more than a few similarities to Java.
You could bug the MSDN blog software guys to add something like wikipedia’s <ref> tag so you can get (clickable!) footnote numbering automatically:
http://en.wikipedia.org/wiki/Wikipedia:Footnotes
As for the footnote symbols, he’s still using little characters, they’re just not as whimsical :-)
Peter: The comments about Raymond whining about comments are drowning out everything else.
With all the disclaimers, Raymond might consider changing the byline of the blog from "Not actually a .net blog" to "Not actually a primary source".
I like the C# approach because it’s more scalable. Suppose I were to create a library that contained 10 objects that each needed a static initialization. With the C++ approach anyone that used my library would have to pay the static initializations for all of the classes in my library, meaning I have to trade startup time for the convenience of static initializations. With the C# approach the static initializations for the objects that are never used are never run.
[quote]
I like the C# approach because it’s more scalable. Suppose I were to create a library that contained 10 objects that each needed a static initialization. With the C++ approach anyone that used my library would have to pay the static initializations for all of the classes in my library, meaning I have to trade startup time for the convenience of static initializations. With the C# approach the static initializations for the objects that are never used are never run.
[/quote]
It seems no one read my comment :-(
Raymond,
I can’t be the only reader of your blog who’s just grateful that you provide an insight into *why* things are done the way they are, not just how things are done.
Thanks for continuing to write for us; it’d still be interesting (if harder to read) if you decided to use txtspk.
And I thought the problem of C++ calling *destructors* at the end of a function was bad enough.
For example:
http://blogs.msdn.com/ericlippert/archive/2004/03/23/94651.aspx
Actually, you should say "One of the differences between *Visual* C++…"
The reason is that according to the standard non-POD classes *may* be initialized before main but are certainly initialized before the first use (which has implications for threading – as I painfully found out recently.) The relevant portion of the standard refers to dynamic initialization.
So:
    class BigFatConstructor
    {
    public:
        BigFatConstructor() { /* initialize resource a */ }
        resource& getresource() { return a; }
    private:
        resource a;
    };

    static BigFatConstructor bfc;
    …
    void threadfn1()
    {
        bfc.getresource(); // uh oh…
    }

    int main()
    {
        // "thread" stands for whatever thread class your library provides
        new thread(threadfn1);
        new thread(threadfn1);
        new thread(threadfn1);
        new thread(threadfn1);
    }
In both C++ and C#, the general rule is that you should keep your static constructors to a minimum. That would certainly rule out doing P/Invoke inside a static constructor (and almost anything that can throw an exception is generally a bad idea).
If you follow that rule, the deterministic order of construction that the on-demand model provides is a pretty nice benefit.
(Warning: mention of a bug in a .Net implementation ahead! This is not a complaint, nor a whine for help. This is simply a report on a minor bug that relates to the topic of today’s blog entry. This disclaimer brought to you by the Nitpicker Defense Brigade.)
While doing some development under .NetCF, I discovered what *appears* to be a bug in static initialisation. I mention it here in case others are debugging .NetCF code and run into static strangeness.
There are two places where static members can be initialised: in the static constructor (.cctor), and in a static initialiser (static int foo = 1;). Ordinarily, the static initialisers are called just before the static constructor runs, so they obey the same "on demand" rules described by Raymond. However, under the Compact Framework (in the version I was using at the time, 1.0 I believe), static initialisers in referenced assemblies are not called at all. That is, if your app loads an assembly containing a class with static members (with initialisers) and a static constructor, the static constructor will be called on the first use of the class, but the initialisers won’t.
I haven’t worked in .NetCF for some time, so I don’t know if the bug still exists in recent releases.
The code I was working on when I discovered the problem is here: http://www.codeproject.com/netcf/CfResortCompanion.asp
The C# method has another benefit. In C++ static constructors are process-wide and could be called in any application-specific security context. In C# static constructors are AppDomain-wide (and you *can* get a particular static constructor called more than once in a process) and if your application-specific security context is sandboxed with an AppDomain you don’t leak from one context to another.
C# and Java always seem to me to be something of a con. The original idea was that programmers don’t need to be concerned with object lifetimes since the garbage collector will take care of it. And some of the time it’s true, but there are loads of cases where you actually need to do more work to get a stable application. And constructors, destructors and exceptions make it harder to do that work.
C++ is ok if you use it sensibly, but I’ve spent ages debugging non sensible C++ designs on embedded systems where you only have printf and that slows things down to the point where they work.
It seems like bad programmers will write code with a bunch of subtle timing problems in any language. But in C it’s much easier to debug and patch than in C++ where so much more happens behind the scenes.
Problems like this are the exact reason I prefer to use C and assembly over C++ and managed languages.
They are the cause of a large class of bugs stemming from things that do a lot more than you expect them to, and things that interact poorly with each other. Edge cases like this one in particular are horrible to debug.
Damn, beaten to it by Dopefish, although I was favouring “not actually an official Microsoft documentation blog”.
All comments are moderated, right? Why not just silently delete/drop
comments that are nit-picky, off-topic or in some other way a
detraction from the overall post?
In fact, please do just that, and go back to writing /your/ blog the
way /you/ want to. Stop trying to please everyone else, and focus on
making yourself happy.
And if someone is unhappy or upset that their comment didn’t make
the grade, tough. They can go rant about it on their own blog if they
have to. Clicking the Submit button at the bottom of this form does not
automatically grant one the right to be heard.
[... grief for doing so. Deleting comments I don’t like may lead people to infer that I approve of the comments that remain, and it creates potential legal liability. (That specific case was tossed out on a technicality, so the underlying legal issue remains unresolved.) -Raymond]
Of course GetFavoriteColor() won’t work; you gave the wrong answer! You’re lucky an exception wasn’t thrown straight into the Gorge of Eternal Peril. *ducks*
For even more whimsical numbered footnotes, there are two sets of
circled numbers (U+2460—U+2473 and U+2780—U+2789), two sets of inverted
circled digits (U+2776—U+277F and U+278A—U+2793), a set of
parenthesized numbers (U+2474—U+2487), a set of numbers with periods
(U+2488—U+249B), and a set of fullwidth digits (U+FF10—U+FF19), as well
as two sets of Roman numerals (U+2160—U+216F and U+2170—U+217F). Quite
enough for not being boring.
[② It’s the numbers I object to. You can decorate the numbers in whimsical ways, but they’re still numbers. -Raymond]
A big part of the problem in the first example is the communication covertly via side-effects on globals, as implied by GetLastWin32Error(). If SetEvent() returned an error code explicitly, it couldn’t be clobbered by the unexpected static constructor.
@Yuhong Bao: the C++ destructor is one of the reasons why resource management is in many cases easier in C++ than in so-called managed languages. Objects can actually be wholly responsible for the resources they allocate.
The example link you posted doesn’t really demonstrate a problem with destructors per se so much as it demonstrates a problem with global variables (which is what SetLastError/GetLastError essentially are, though they are thankfully local to the thread they’re called on).
That’s not to say the link doesn’t describe a genuine and ugly problem, but the problem is consistent with the well-known issues with globals.
P. Ritchie: "In C++ static constructors are process-wide"
No, they are module-wide (*.exe, *.dll).
E.g. if your program.exe and two *.dlls you use in it all link with library l.lib, and you have static object in l.lib, you’ll have three instances of them and three constructors called.
You must have l.dll and l.lib (i.e. dynamic linking) to get "process-wide" effect (but then it goes under "module-wide", module being l.dll).
> when I try to add character and whimsy, people complain
When people write comments with "character and whimsy" *you* complain, which merely increases antipathy towards your "style".
Apply whimsy to numbers: use them in an unordered manner. That would also answer your objection to having to renumber them.
And of course it sounds so obvious when Raymond explains it, but then you think of the poor guy behind the debugger who’s got no idea of the oncoming freight train.
So how long did it take to figure out the original problem?
(I hate all these kinda-sorta "C-derived" languages. Especially Java)
Um, the Win32 comment…. perhaps I’m wrong because I would have expected others to pick up on this but they didn’t so at the risk of sounding foolish…
Surely with dyna-load DLL’s, static initialisation occurs at the point at which the DLL is loaded.
Since my code is loading the DLL, presumably having checked that it isn’t already loaded, so my code also knows that initialisation of that DLL is going to occur right here, right now.
vs
I’m accessing an instance of a class. Is this the first time I’ve ever accessed any instance of that particular class in this session? (gazes into tea leaves at bottom of cup)
Jolyon: Not with delay-load DLLs: "your" code is not the one that loads the DLL; it’s just loaded the first time you use it. This is different from calling LoadLibrary/GetProcAddress, however.
There’s a wee bit more to static constructors than you think, the following goes into all the gory detail: http://www.yoda.arachsys.com/csharp/beforefieldinit.html Note that it’s actually possible to use a class before the static constructor is called, it’s all in the implicit/explicit declaration ;-)
—————-Quote—————–
Ordinarily, the static initialisers are called just before the static constructor runs, so they obey the same "on demand" rules described by Raymond. However, under the Compact Framework (in the version I was using at the time, 1.0 I believe), static initialisers in referenced assemblies are not called at all.
If I understand what you are saying, this is not a bug; rather this is an optimization, where fields are not initialized before a static method is called. Look up the BeforeFieldInit attribute, it is a very important (but subtle) point regarding static initialization.
The only way to disable this behavior is to specify a static constructor (even an empty one). The static constructor guarantees that fields will be initialized first, then the constructor will run, and only then will your static method be called.
Hello,
I usually don’t read comments, simply because they are not part of the RSS feed. But this time I read every one to find out whether I am the only one who really enjoys your footnote symbols. There are some others as well, but not as many as I expected.
Regardless of the way you list your footnotes, be it Tamil numbers or whatever, this is just to let you know that I liked your symbols, and without them it will just… not be as fun.
Have a nice day!
Can anyone please explain how the c++ runtime can call static functions in dlls (before DllMain). I’m curious and have never seen any description of this "work-a-round".
"dll != lib" — When you write a C++ DLL, you don’t actually write the DllMain export that ends up in your DLL’s export table.
The DllMain address from the export table is inside the C++ runtime; when it gets an attach notification, it calls your static constructors, then calls the DllMain-lookalike that you wrote.
(At least, that’s my understanding. It may be wrong.)
"The example link you posted doesn’t really demonstrate a problem with destructors per se so much as it demonstrates a problem with global variables (which is what SetLastError/GetLastError essentially are, though they are thankfully local to the thread they’re called on)."
Indeed, a global variable would have the same problem.