Date: | January 20, 2006 / year-entry #28 |
Tags: | code |
Orig Link: | https://blogs.msdn.microsoft.com/oldnewthing/20060120-00/?p=32583 |
Comments: | 15 |
Summary: | There is no requirement that it does, and it often doesn't. |
Although the diagrams I presented in my discussion of The layout of a COM object place the vtable at the beginning of the underlying C++ object, there is no actual requirement that it be located there. It is perfectly legal for the vtable to be in the middle or even at the end of the object, as long as the functions in the vtable know how to convert the address of the vtable pointer to the address of the underlying object. Indeed, in the second diagram in that article, you can see that the "q" pointer points into the middle of the object.

Here's an example that puts the vtable at the end of the object:

    // Data comes first in the base class list, so its members
    // are typically laid out ahead of the IUnknown vtable pointer.
    class Data {
    public:
        Data() : m_cRef(1) { }
        virtual ~Data() { }
        LONG m_cRef;
    };

    class VtableAtEnd : Data, public IUnknown {
    public:
        STDMETHODIMP QueryInterface(REFIID riid, void **ppvOut)
        {
            if (riid == IID_IUnknown) {
                AddRef();
                *ppvOut = static_cast<IUnknown*>(this);
                return S_OK;
            }
            *ppvOut = NULL;
            return E_NOINTERFACE;
        }
        STDMETHODIMP_(ULONG) AddRef()
        {
            return InterlockedIncrement(&m_cRef);
        }
        STDMETHODIMP_(ULONG) Release()
        {
            LONG cRef = InterlockedDecrement(&m_cRef);
            if (!cRef) delete this;   // last reference released
            return cRef;
        }
    };

The layout of this object may very well be as follows: (Warning: Diagram requires a VML-enabled browser.)
Observe that in this particular object layout, the vtable resides at the end of the object rather than at the beginning. This is perfectly legitimate behavior. Although it is the most common object layout to put the vtable at the beginning, COM imposes no requirement that it be done that way. If you want to put your vtable at the end and use negative offsets to access your object's members, then more power to you.
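To make the "negative offsets" remark concrete, here is a minimal sketch. It assumes a hypothetical IFake interface standing in for IUnknown so that it builds without the Windows headers, and the names VtableNotFirst and InterfaceOffset are invented for illustration. On common compilers the interface subobject lands after the Data members, but the C++ standard does not promise any particular layout.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for IUnknown (assumption, not the real interface).
struct IFake {
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
};

// Same shape as the Data class in the article, with long instead of LONG.
class Data {
public:
    Data() : m_cRef(1) { }
    virtual ~Data() { }
    long m_cRef;
};

class VtableNotFirst : Data, public IFake {
public:
    unsigned long AddRef() override { return static_cast<unsigned long>(++m_cRef); }
    unsigned long Release() override { return static_cast<unsigned long>(--m_cRef); }
};

// Byte distance from the start of the object to its IFake subobject.
// Converting an IFake* back to the object subtracts this amount.
std::ptrdiff_t InterfaceOffset(VtableNotFirst* obj) {
    assert(obj != nullptr);
    IFake* iface = obj;  // implicit conversion adjusts the pointer
    return reinterpret_cast<char*>(iface) - reinterpret_cast<char*>(obj);
}
```

Calling InterfaceOffset on an instance typically returns a nonzero value, and static_cast<VtableNotFirst*> on the interface pointer always recovers the original object address regardless of the layout the compiler picked.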
Comments (15)
Which browsers are VML enabled? I don’t know if I’m seeing the diagram correctly or not… and it’s not as if Safari puts "VML Enabled!" on the splashscreen when it starts up.
Why does the address bar icon for this blog say ".net"? At least in Firefox. Isn’t this the pre-dotNet place?
In fact, when implementing interfaces, the instance’s pointer to the virtual dispatch table cannot go at the start of the class.
When you cast an object instance to one of the interfaces it implements, the pointer you get back isn’t a pointer to the start of the object, because different interfaces will expect different functions in the vtable (two indirections from the instance pointer get you to the start of the vtable).
Every class that implements an interface needs to create a vtable corresponding to that class-interface pair, and inside that table there are pointers to thunk functions which adjust the ‘this’ pointer coming in to point to the ‘real’ class (as opposed to interface) vtable, at the start of the object, before jumping to the start of the appropriate implementation of the interface method for that class.
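A small sketch of what this looks like from C++. The interfaces IFirst and ISecond and the class Widget are hypothetical names made up for this example. Each method returns the "this" it was called with, so you can see that a call made through the second interface still arrives at the full object, whatever adjustment the compiler inserted along the way.

```cpp
#include <cassert>

// Hypothetical interfaces with unrelated inheritance trees.
struct IFirst  { virtual const void* WhoAmIFirst() = 0; };
struct ISecond { virtual const void* WhoAmISecond() = 0; };

class Widget : public IFirst, public ISecond {
public:
    // Both methods report the "this" pointer they received.
    const void* WhoAmIFirst() override { return this; }
    const void* WhoAmISecond() override { return this; }
};

// True when the two interface pointers for one object have different
// addresses, which they must, since each subobject holds its own
// vtable pointer.
bool InterfacesDiffer(Widget* w) {
    assert(w != nullptr);
    IFirst* a = w;
    ISecond* b = w;
    return static_cast<void*>(a) != static_cast<void*>(b);
}
```

Given a Widget w and an ISecond* obtained from it, calling WhoAmISecond through that pointer returns the address of w itself, not the address stored in the interface pointer.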
How to handle this in your code without your head blowing up:
– C++? you’re lucky. The compiler does everything for you. Example: if you implement two interfaces whose inheritance trees are disjoint, you will find yourself with two distinct IUnknown interfaces. The first will be allocated at the beginning of the object, and its members will access the object fields normally, and the other will actually point to so-called "adjustors", stub members which decrement "this" by the right amount and call the one true implementation
– C? don’t worry! there’s a wonderful macro in the PSDK called CONTAINING_RECORD, which, given an inner pointer to an object, the object’s type and the field the pointer points to, returns the object pointer, hiding all the horrible casting and pointer arithmetic. You can use it to implement "adjustors" by yourself
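Here is a rough sketch of the do-it-yourself adjustor described above, written in C style but compilable as C++. MY_CONTAINING_RECORD is a portable stand-in written for this example (it is assumed to take the same three arguments as the SDK macro), and ICounter, Counter, and the function names are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the SDK's CONTAINING_RECORD (assumption: same argument
// order of inner pointer, containing type, field name).
#define MY_CONTAINING_RECORD(address, type, field) \
    (reinterpret_cast<type*>(reinterpret_cast<char*>(address) - offsetof(type, field)))

// Hand-built interface: a pointer to a table of function pointers.
struct ICounter;
struct ICounterVtbl {
    long (*Increment)(ICounter* self);
};
struct ICounter {
    const ICounterVtbl* lpVtbl;
};

// The object keeps its data first and the vtable pointer last.
struct Counter {
    long value;
    ICounter iface;
};

// The "adjustor": step back from the interface pointer to the object.
long Counter_Increment(ICounter* self) {
    Counter* obj = MY_CONTAINING_RECORD(self, Counter, iface);
    return ++obj->value;
}

const ICounterVtbl g_counterVtbl = { Counter_Increment };

void Counter_Init(Counter* c) {
    assert(c != nullptr);
    c->value = 0;
    c->iface.lpVtbl = &g_counterVtbl;
}
```

A caller only ever sees the ICounter pointer (the address of the iface field) and invokes p->lpVtbl->Increment(p); the subtraction inside Counter_Increment is the negative offset.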
In the MFC hierarchy, the root COM-capable object is CCmdTarget, which is derived from CObject. That means the vtable is offset from the beginning of the object. The way MFC does COM is to have a bunch of nested inner objects, each with an IUnknown implementation. MFC has a macro (using offsets) that provides a pointer to the containing object.
BTW, you have a very informative blog.
"Why does the address bar icon for this blog say ".net"?"
The icon is for the entire blogs.msdn.com domain.
Shouldn’t the terminology used here be "vtable pointer"? The vtable itself is not contained in the object (multiple objects of the same class all use the same vtable). Rather, each object has a pointer (or a set of pointers) to the vtable(s) for the class.
Your post reminds me of something I’ve always wondered about. I agree with mikeb that we should call it the vtable pointer, too.
ASSUMPTION: Suppose I actually had my vtable (not just the pointer) built into the object’s instance data. I obviously can do that, it is just an array of pointers to function entries according to the COM binary standard.
OPPORTUNITY: This leaves open the possibility of making the vtable stateful and having methods that update the vtable based on whatever is going on. This is an optimization case for moving state into the pointers rather than having methods do lots of case statements and other logic to get to the behavior that corresponds to the current state.
QUESTION: The question is, are compilers allowed to optimize all the way down to the function pointer so that a change in state might not be noticed? That is, is altering a vtable entry a side effect that an optimization could cause to be missed, based on a compiler-implementation assumption that the vtable is invariant?
NEXT QUESTION: If the answer to that question is "yes," is there a way to specify volatility of the pointers in the vtable? I’m not sure where to look to find this.
QUESTION AFTER THOSE: If the answer to the first question is "no" (or there’s a volatility hint that can be made to the compiler), the next question is how about having the function pointer in the vtable point to custom code that was built into the object-instance data (directly or indirectly). I figure this would fire off DEP, and that would be a pain, but I had to ask.
OPPORTUNITIES: There are situations in applicative-/functional-programming where those are all great implementation optimizations. Hmm, I guess the CLR would not be a great place to try this [;<).
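For what it's worth, the stateful-vtable idea can be sketched with a hand-built table, as below. All names (IToggle, Toggle, and the functions) are hypothetical. Because the table here is ordinary non-const data that the program writes to, the compiler has to reload the function pointer like any other variable; this says nothing about how a compiler treats the vtables it generates for C++ virtual functions.

```cpp
#include <cassert>

struct IToggle;
struct IToggleVtbl {
    int (*Next)(IToggle* self);
};
struct IToggle {
    IToggleVtbl* lpVtbl;   // COM-style: the interface is a pointer to a table
};

struct Toggle {
    IToggle iface;         // first member, so it shares the object's address
    IToggleVtbl vtbl;      // the table itself lives in the instance data
};

int Toggle_NextOdd(IToggle* self);

// Each implementation does its work and then rewrites the table entry,
// so the state lives in the function pointer instead of a switch.
int Toggle_NextEven(IToggle* self) {
    Toggle* obj = reinterpret_cast<Toggle*>(self);
    obj->vtbl.Next = Toggle_NextOdd;
    return 0;
}

int Toggle_NextOdd(IToggle* self) {
    Toggle* obj = reinterpret_cast<Toggle*>(self);
    obj->vtbl.Next = Toggle_NextEven;
    return 1;
}

void Toggle_Init(Toggle* t) {
    assert(t != nullptr);
    t->vtbl.Next = Toggle_NextEven;
    t->iface.lpVtbl = &t->vtbl;   // points into the same object; do not copy
}
```

Successive calls through p->lpVtbl->Next(p) alternate between the two implementations. Note that the object holds a pointer into itself, so copying a Toggle byte-for-byte would leave the copy pointing at the original's table.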
—
QUESTION: The question is, are compilers allowed to optimize all the way down to the function pointer so that a change in state might not be noticed?
—
C++ compilers are allowed to eschew the vtbl and call the function directly if the compiler knows exactly which type the object has. Aggressive optimization might cache even the function pointer. G++ caches the vtbl pointer, msvc does no caching.
—
NEXT QUESTION: If the answer to that question is "yes," is there a way to specify volatility of the pointers in the vtable? I’m not sure where to look to find this.
—
No, there is no way to specify this. The point of strictly typed languages like C++ is that the types are immutable at runtime. You could emulate this sort of behaviour of course, but this defeats the whole purpose of a strictly typed language. I would venture that you need to modify your design, probably two different levels of abstraction are conflated together.
orcmid, with regards to your comment:
"
This is an optimization case for moving state into the pointers rather than having methods do lots of case statements and other logic to get to the behavior that corresponds to the current state.
"
I think what you are trying to achieve would be better implemented with the State pattern rather than hacking the v-table. The book "Refactoring to Patterns" has an excellent discussion of contrasting inheritance with the state pattern. Look for their discussion comparing when to "refactor to base class" versus "refactor to State pattern".
If you explore implementations of the state pattern, you’ll find alternatives to the switch/case type of implementations. Similar to having a modifiable vtable pointer, you can have a pointer to a "state" object. The state object is derived from a state base class, with virtual functions overridden to modify how objects in that state respond to particular events.
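A minimal sketch of that arrangement, assuming a made-up Connection class with two states (none of these names come from the book or from any library):

```cpp
#include <cassert>
#include <string>

class Connection;

// State base class: one virtual function per event the object handles.
class State {
public:
    virtual ~State() = default;
    virtual std::string Send(Connection& c) = 0;
};

class Connection {
public:
    Connection();
    // Forward the event to whatever state object is current.
    std::string Send() { return m_state->Send(*this); }
    void SetState(State* s) { assert(s != nullptr); m_state = s; }
private:
    State* m_state;   // plays the role of the "modifiable vtable pointer"
};

class OpenState : public State {
public:
    std::string Send(Connection&) override { return "sent"; }
};

class ClosedState : public State {
public:
    std::string Send(Connection& c) override;
};

// The states carry no data of their own, so one shared instance of each
// is enough and no allocation happens on a state change.
OpenState g_open;
ClosedState g_closed;

std::string ClosedState::Send(Connection& c) {
    c.SetState(&g_open);   // transition: closed -> open
    return "opened";
}

Connection::Connection() : m_state(&g_closed) {}
```

The first Send on a new Connection goes to ClosedState, which switches the object to OpenState; later calls go straight to OpenState. Changing behavior is a single pointer assignment, and the compiler-generated vtables are never touched.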
Implementing the state pattern by hacking the v-table would be an interesting language trick, but it would significantly decrease the maintainability of the code.
How many people actually went to the trouble of writing a COM object in C?
"Which browsers are VML enabled?"
Actually, the way it is done, it’s a "behavior"-enabled browser plus an implicit VML behavior. The first is still not valid CSS; as for the second, VML is a proposal that has been abandoned in favor of SVG. So basically, some abandoned web browsers are VML enabled.
Alex: I’ve written an IDispatch in Assembly. Ml was the only free Microsoft compiler for a while :-)
All the VML does is to make an arrow between the p and IUnknown.vtbl and QueryInterface
It is an old alternative for graphics that MS tried to make a standard but failed. IE is pretty much the only browser that does it. Kind of like NN4 and JSSS (Netscape’s version of CSS)
schwiet:
Well, the pattern is the same either way, at an abstract level. I think the difference here is one between aggregation and containment, and there are two engineering considerations:
1. We’re talking about a COM interface, so it has a well-defined storage structure and I don’t think that setting function pointers and building vtable is quite all that bad — not obvious but it can be done in a way that is clean and maintainable.
2. We’re out to save the cost of copying parameters and stack deepening for the trivial case of relaying into different implementations of the same method, all with the same signature. [Keeping the this-pointer straight might be a challenge, I’ll grant you that, but my experience is that is always a challenge. That may be the greatest barrier to easy implementation of the alternative methods and I haven’t tested the trade-offs there.] I’m also out to save the cost of multiple instances and dynamic constructor-execution to accomplish these simple variations.
Having said all of that, I think the state pattern is something useful to keep in mind, especially when we’re talking about more complex state-management cases.
Anton Tykhyy:
[My wife is a potter. I must show her the tea set on your site. I know she’ll love it.]
Type invariance is being preserved. The signatures of the methods are not altered and (praise be) COM interfaces don’t do generics. The type doesn’t change, the implementation of the method changes.
Since, with COM interfaces, it is not permissible (and usually not possible) to guess the base class (or any class) from the interface, and the methods are all virtual, the compiler can’t short-circuit the pointer chasing.
I am not sure how to confirm that VC++ won’t cache the function pointer though. I guess I’ll have to just try it and see what happens.