Date: | March 22, 2005 / year-entry #72 |
Tags: | code |
Orig Link: | https://blogs.msdn.microsoft.com/oldnewthing/20050322-00/?p=36113 |
Comments: | 44 |
Often you'll be minding your own business debugging some code, and you decide to step into one function and the debugger shows that you're in some other function. How did that happen?

    class Class1
    {
    public:
        int *GetQ() { return q; }
    private:
        int *p;
        int *q;
    };

    class Class2
    {
    public:
        virtual int GetValue() { return value; }
    private:
        int value;
    };

You then step through code that does something like this:

    int Whatever(Class2 *p)
    {
        return p->GetValue();
    }
And when you step into the call to p->GetValue(), you find yourself in Class1::GetQ. What happened?

What happened is that the Microsoft linker combined functions that are identical at the code generation level.

    ?GetQ@Class1@@QAEPAHXZ PROC NEAR        ; Class1::GetQ, COMDAT
      00000 8b 41 04     mov eax, DWORD PTR [ecx+4]
      00003 c3           ret 0
    ?GetQ@Class1@@QAEPAHXZ ENDP             ; Class1::GetQ

    ?GetValue@Class2@@UAEHXZ PROC NEAR      ; Class2::GetValue, COMDAT
      00000 8b 41 04     mov eax, DWORD PTR [ecx+4]
      00003 c3           ret 0
    ?GetValue@Class2@@UAEHXZ ENDP           ; Class2::GetValue
Observe that at the object code level,
the two functions are identical.
(Note that whether two functions are identical at the object code level
is highly dependent on which version of what compiler you're using,
and with which optimization flags.
Identical code generation for different functions occurs with very
high frequency when you use templates.)
Therefore, the linker says,
"Well, what's the point of having two identical functions?
I'll just keep one copy and use it to stand for both
Class1::GetQ and Class2::GetValue."

    0:000> u Class1::GetQ
    010010d6 8b4104           mov     eax,[ecx+0x4]
    010010d9 c3               ret
    0:000> u Class2::GetValue
    010010d6 8b4104           mov     eax,[ecx+0x4]
    010010d9 c3               ret
Notice that the two functions were merged: The addresses are
identical.
That one fragment of code merely goes by two names.
Therefore, when the debugger sees that you've jumped to
address 010010d6, it doesn't know which of the two names to use,
so it just picks one.
That's why it looks like you jumped to the wrong function.
To disable what is called "identical COMDAT folding", you can pass
the /OPT:NOICF flag to the linker.
Comments (44)
Wow. I thought I’d seen some interesting behavior, but I’ve never seen that one. Thanks Raymond!
And this feature is what makes C++ templates a viable solution for applications.
Otherwise templates would cause sufficient explosion in code size that they’d be almost unusable for production software.
This doesn’t just affect debugging. It also affects sampling profilers for the same reason. So, when doing sampling profiling, you may want to throw the /OPT:NOICF switch to avoid getting time attributed to the wrong functions.
Oh, never mind, I found the answer: it crashed because the function pointers were compared elsewhere in the program. Do more recent versions of the compiler recognise this situation?
There was a problem with MSVC6’s ICF that caused Browser X‘s JavaScript engine to crash when compiled as a release mode DLL.
From your description, I don’t see how such a thing could happen. Surely, if functions are truly identical, there’s no way this could cause a crash?
A little bit off-topic: Why does Visual Studio 2003 sometimes show a very short list of entries in the call stack?
the bug was presumably fixed in the Browser X code, I don’t suppose the compiler can detect in the general case whether the address of one of the functions that has suffered from ICF is being compared to another. (And it would be too heavy-handed to disallow ICF on functions that have just had their addresses taken.)
I found the same bug in Program Y, when porting it to a new system — there was an assert to ensure you didn’t add a certain callback twice, which was exactly what happened once the linker had been at it…
The bug was "fixed" by disallowing ICF for that particular source file.
a way for the linker to avoid folding identical functions for which different addresses are needed would be to insert NOP or jump instructions in front of the common body, and use the address of the 1st NOP instruction (or a jump instruction). This way for large enough duplicated methods, you’d still get separate identities and code factoring.
“I hope their nose hairs are still on fire.
Frickin’ Browser X retards. I’m suppressing the urge to rant, but it’s really difficult.”
Oh! The hypocrisy!
Gene: ISTR that it’s used as a type check in the JS compiler. It’s somewhat poor design, but it’s perfectly legal. The C standard specifies that pointers to different functions will compare unequal. Identical COMDAT folding breaks this, so it should not be enabled by default. Unfortunately the Visual Studio IDE does enable it in new C/C++ projects.
Tom_ wrote: “the bug was presumably fixed in the Browser X code … I found the same bug in Program Y”
Comparing function pointers is not a bug. The bug is in the build settings that enable ICF for such code.
"The C standard specifies that pointers to different functions will compare unequal."
Just because it’s in the standard doesn’t mean it’s good practice. The C++ standard says I can insert elements at the beginning of a vector<T>, but no one in their right mind would do that regularly.
A good callback implementation would include a context parameter. Not only does this allow you to create callbacks to object methods, but it would allow the registration function a legitimate way to detect double registrations.
Ben: When I run the C compiler in ANSI-compliant mode (/Za), the identical functions are not folded.
A trivial,
The CLR had COMDAT folding turned off because, in debug mode, it needed to compare function pointers (as I remember).
The code has been re-written since, and CLR has COMDAT turned on now. As a result, the size of mscorwks.dll shrinks by several hundred KBs.
It seems like it would be trivial to record whether the address of a function is taken, or whether it is exposed as an exported function. If either of these conditions is true, then the function would be considered ‘unfoldable’.
At least historically the debugging symbols have included only the filename not the full path to a source file.
In at least one instance I’ve debugged into the wrong function, because two cpp files, in different parts of the project tree, had the same name.
It’s less trivial when you realize that the .c file that defines the function need not be the same .c file that takes its address. Therefore the information would have to be passed in the .obj file and reconciled by the linker. Not that it can’t be done, but it’s not trivial either.
Perhaps it isn’t trivial to figure out if any function pointer address is referenced, but what I’m getting at is that compilers have solved what would seem to be much more difficult hurdles already. Take a look at the gymnastics behind template code generation, making sure that template code is only used once and doesn’t result in ugly linkage errors. If that can be resolved, then I’d imagine so could this issue.
Indeed, it is precisely this feature (COMDAT folding) that makes templates work!
Even if a debugger is not able to tell which of the folded functions is being run, couldn’t it detect that the function is folded and at least give the user the option to look at the other ‘foldees’?
Gene wrote: "The results of comparing pointers are undefined by the standards and up to the compiler …"
The result of testing equality of pointers, including function pointers, is defined in both C and C++. See ISO 9899:1999 6.5.9/6 and ISO 14882:2003 5.10/1.
Raymond wrote: "Indeed, it is precisely this feature (COMDAT folding) that makes templates work!"
That’s wrong on two levels. Template instantiation can be implemented in several different ways, though most current implementations use "greedy instantiation", i.e. generating code in each translation unit that needs it. This requires the basic COMDAT functionality (linker accepts multiple sections defining the same symbol and resolves all references to just one of them) but not ICF (linker merges sections with the same contents).
True, but notice that COMDAT doesn’t require one obj file (the one that takes the address of a function) to "cancel" another one (the one that generates the code for the function with the flag ‘it’s okay to merge me’). It is this "cancellation" that makes things nontrivial.
Raymond wrote: "When I run the C compiler in ANSI-compliant mode (/Za), the identical functions are not folded."
At least in VC++ 7.1, /Za has nothing to do with it. ICF is enabled whenever optimisations are enabled (either /O1 or /O2), whether or not /Za is used.
Okay, I haven’t written a compiler in a long time. Maybe compilers are smarter now.
I’d assume the things they’ve added post VC6 to the linker to make WPO work (specifically, inlining functions defined in different object files) would make marking functions that have their address taken easy. I mean solving this type of problem is almost the same as making sure a function is removed from the EXE if its address never taken and/or gets completely inlined away. You’d have the same types of problems with functions exported from DLLs.
It sounds like enabling COMDAT folding would have a performance impact, especially in large projects, due to cache misses from having to jump to possibly-distant locations for even trivial functions. Has anyone done tests in large CPU-bound, template-heavy applications to see if this does have an impact on performance, and if so, what the impact was?
I don’t think distance is relevant for caching on x86, just that portion of the address (bottom 12 or 13 bits I think?) that is used to determine the cache slot entry.
(I guess you could end up with potential problems due to ICF, but it would be by accident, and if you’re relying on not getting that behaviour then you’re on shaky ground by relying on that sort of linker option to enforce it.)
Enabling COMDAT folding is good for performance, because it decreases your overall code footprint and increases the chance that code you need will be in the processor cache, or indeed in physical memory. Hitting the pagefile is very expensive.
It’s also a useful optimization for constrained memory situations, where overall footprint is critical (e.g. Xbox, or handheld devices which do not have a pagefile).
The precise gains vary based on your situation, but in template-heavy code it can be significant.
Can you give a non-contrived example of how COMDAT folding leads to significant size reduction of templated code?
I had a feeling you would say something like that but I have a gripe with it. My idea is that for COMDAT folding to be effective, the functions (and template parameters) have to be really trivial to generate identical code. Most, if not all, of these functions are trivial inline functions like your example (unless the author of the code is so braindead as to not make them inline but then WPO would take care of that these days). The code for AddRef/Release (I’m assuming not calling them is an oversight, because they wouldn’t even be compiled at all given your code above) would be generated in the object files just in case its address gets taken, or it needs to reference it in another translation unit, etc. but the code should be completely eliminated from the final EXE/DLL after linkage. If this weren’t the case, then modern libraries that rely heavily on recursive templates or dummy classes like boost or any other template metafunction code would be really huge and I highly doubt Visual Studio handles inline template functions that stupidly.
I also have a similar gripe with the way .NET generics are advertised as being less bloated than C++ templates because of code sharing of reference types, but that’s another rant.
asdf – sure.
    template <typename T>
    class RefCount
    {
        T* m_pObj;
        DWORD m_refcount;
    public:
        RefCount(T* pType) : m_pObj(pType), m_refcount(1)
        {
        }
        DWORD AddRef() { return m_refcount++; }
        DWORD Release()
        {
            if (--m_refcount == 0)
            {
                delete m_pObj;
                m_pObj = NULL;
            }
            return m_refcount;
        }
    };

    void Start()
    {
        A* pA = new A;
        B* pB = new B;
        RefCount<A> a(pA);
        RefCount<B> b(pB);
    }
——————————–
Without COMDAT folding:
You end up with a RefCount<A>::AddRef, a RefCount<B>::AddRef, a RefCount<A>::Release and a RefCount<B>::Release.
With COMDAT folding:
You end up with a RefCount<*>::AddRef, and a RefCount<*>::Release.
(At least, that’s how I think it works – I have a cold, so I may be wrong).
asdf:
You don’t have to take anyone’s word for it: try it yourself. Take a large project, with plenty of template code, and then compile/link with and without COMDAT folding and compare the difference. It’s significant (I’ve seen double-digit percentage gains many times).
Pointers to virtual functions all look the same.
Normally, AddRef and Release are virtual functions and cannot be inlined. In that case, folding duplicate functions *is* effective.
It’s interesting to see that the Windows kernel uses exactly the opposite principle: a large number of functions are forced to be inlined. When looking at the disassembly, you see the same old "faces" over and over again (e.g. you learn pretty quickly to recognize KeLeaveCriticalRegion – the call to HalRequestSoftwareInterrupt(APC_LEVEL) gives it away). Same goes for interrupt handlers: all of them share most of the code, yet they are kept distinct.
There’s obviously a balance. Some functions are small enough that it’s better to inline them even if it means duplication, while others are not. I can’t speak to the particular compiler settings and inlining choices made in the kernel, since I’m not a kernel developer.
Ben wrote: "The result of testing equality of pointers, including function pointers, is defined in both C and C++. See ISO 9899:1999 6.5.9/6 and ISO 14882:2003 5.10/1."
According to the (draft) C standard copy I have:
"Two pointers compare equal if … both are pointers to the same function."
My reading of the above is that the reverse is not necessarily true. The standard seems to know the difference between "if" and "if and only if" (see 6.5.9/4).
According to 6.5.8/in C:
"When two pointers are compared, …(standard specified comparison behavior deleted). In all other cases, the behavior is UNDEFINED."
And according to 5.9/2 in C++, the result of comparison of pointers to different functions is UNSPECIFIED.
So in both languages, the compiler seems to have the right to do whatever it sees fit if the program tries to "compare pointers to different functions".
Isaac Chen
The sections you quoted deal with relational operators, Ben was talking about the equality operators: "Two pointers of the same type compare equal if and only if they … both point to the same function".
I forgot to mention the related paragraphs:
6.5.9/3 in C
"The == (equal to) and != (not equal to) operators are analogous to the relational operators except for their lower precedence."
5.10/1 in C++
"The == (equal to) and the != (not equal to) operators have the same semantic restrictions, conversions, and result type as the relational operators except for their lower precedence and truth-value result."
I didn’t see the "only if" part in Ben’s message and the (draft) standards regarding these operators.
Here’s another fun one from the internal support alias.
When you profile with the VS Profiler, you…