Date: | March 26, 2007 / year-entry #105 |
Tags: | code |
Orig Link: | https://blogs.msdn.microsoft.com/oldnewthing/20070326-00/?p=27503 |
Comments: | 29 |
Summary: | Commenter Mike Petry asked via the Suggestion Box: Why can you dereference a COM interface pointer and pass it to a function with a Com interface reference. The call. OutputDebugString(_T("IntfByRef::Execute - Begin\n")); BadBoy badone; CComPtr |
Commenter Mike Petry asked via the Suggestion Box:
You already know the answer to this question. You merely got distracted by the use of a COM interface. Let me rephrase the question, using an abstract C++ class instead of a COM interface. (The virtualness isn't important to the discussion.) Given this code:
class Doer {
public:
    virtual void Do() = 0;
};
void caller(Doer *p)
{
    stupid_method(*p);
}
void stupid_method(Doer& ref)
{
    ref.Do();
}
How is this different from the pointer version?
void caller2(Doer *p)
{
    stupid_method2(p);
}
void stupid_method2(Doer *p)
{
    p->Do();
}
The answer: From the compiler's point of view, it's the same. I could prove this by going into what references mean, but you'd just find that boring, but instead I'll show you the generated code. First, the version that passes by reference:
; void caller(Doer *p) { stupid_method(*p); }
  00000 55             push ebp
  00001 8b ec          mov  ebp, esp
  00003 ff 75 08       push DWORD PTR _p$[ebp]
  00006 e8 00 00 00 00 call stupid_method
  0000b 5d             pop  ebp
  0000c c2 04 00       ret  4
; void stupid_method(Doer& ref) { ref.Do(); }
  00000 55             push ebp
  00001 8b ec          mov  ebp, esp
  00003 8b 4d 08       mov  ecx, DWORD PTR _ref$[ebp]
  00006 8b 01          mov  eax, DWORD PTR [ecx]
  00008 ff 10          call DWORD PTR [eax]
  0000a 5d             pop  ebp
  0000b c2 04 00       ret  4
Now the version that passes by address:
; void caller2(Doer *p) { stupid_method2(p); }
  00000 55             push ebp
  00001 8b ec          mov  ebp, esp
  00003 ff 75 08       push DWORD PTR _p$[ebp]
  00006 e8 00 00 00 00 call stupid_method2
  0000b 5d             pop  ebp
  0000c c2 04 00       ret  4
; void stupid_method2(Doer *p) { p->Do(); }
  00000 55             push ebp
  00001 8b ec          mov  ebp, esp
  00003 8b 4d 08       mov  ecx, DWORD PTR _p$[ebp]
  00006 8b 01          mov  eax, DWORD PTR [ecx]
  00008 ff 10          call DWORD PTR [eax]
  0000a 5d             pop  ebp
  0000b c2 04 00       ret  4
Notice that the code generation is identical. If you're still baffled, go ask your local C++ expert.
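For readers who would rather run something than read a listing, here is a minimal sketch of the same comparison at the source level. Doer, caller, caller2, stupid_method and stupid_method2 come from the article; CountingDoer, same_object and run_both are hypothetical names added only for illustration, and the virtual destructor is an addition as well.

```cpp
#include <cassert>

class Doer {
public:
    virtual void Do() = 0;
    virtual ~Doer() = default;  // added for the sketch; not in the article
};

// Hypothetical concrete class (not in the article): counts calls to Do().
class CountingDoer : public Doer {
public:
    int calls = 0;
    void Do() override { ++calls; }
};

void stupid_method(Doer& ref) { ref.Do(); }
void stupid_method2(Doer* p) { p->Do(); }

void caller(Doer* p) { stupid_method(*p); }    // passes by reference
void caller2(Doer* p) { stupid_method2(p); }   // passes by address

// Binding a reference to *p names the same object, so its address is p.
bool same_object(Doer* p) {
    Doer& ref = *p;
    return &ref == p;
}

// Hypothetical helper: drive both versions against one object and
// report how many times Do() ran.
int run_both() {
    CountingDoer d;
    caller(&d);
    caller2(&d);
    assert(same_object(&d));
    return d.calls;
}
```

Both paths reach the same virtual function on the same object, which is the source-level counterpart of the identical code generation shown above. Note that *p never copies or constructs anything here; it only produces an lvalue for the reference to bind to.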
Mind you, dereferencing an abstract object is highly unusual
and will probably cause the people who read your code to
scratch their heads, but it is nevertheless technically legal,
in the same way it is technically legal to give a function
that deletes an item the name |
Comments (29)
I use exactly this technique with C++ abstract classes all the time. This is because, to me and my team at least, passing by reference in this way does not imply a transfer of ownership, whereas passing by pointer typically does. This leaves the reader in no doubt that the method call will not delete the object. This is reinforced by the fact that "delete &ref" just looks wrong, so hopefully no-one would do it.
I would never do it with COM interfaces though, although the same logic could be used.
To Stewart:
Transferring ownership using a "raw" pointer is normally a bug; to correctly transfer ownership, pass a "smart pointer object".
To me, the decision between pointer and reference should be based on the question whether NULL should be able to be passed.
I’ve argued with a number of people (including instructors) who somehow think that references are not pointers but rather just introduce another name for an object. :(
Meanwhile, even though this produces exactly the same code in both cases (here and in probably every other implementation), I believe it’s technically not guaranteed to work in C++ if the pointer is null. (and I don’t just mean this particular example… If stupid_method did not even touch its arguments, this would still be true.)
My argument is not just technical.
Using a pointer signals to the human reader that NULL may be passed. Using a reference signals her that NULL is not going to be passed.
I don’t think dereferencing an abstract object is weird. Polymorphism works on pointers and on references. So why should this be a problem?
It’s just that the C++ syntax forces you to write ‘*pPointer’ to pass the object as a reference.
References are just a way to introduce object semantics instead of value semantics, without using a pointer.
And yes you can also do this on null pointers, which of course can result in disaster.
I’ll also put my vote in the "not weird" camp. I think that using references in this context is often semantically superior, mostly due to the reasons already cited.
I think Ray Trent’s comments, about the function’s "contract" with the caller, are key. Putting a burden on the caller, explicitly, is very useful.
In complex environments, it may be hard to tell if a pointer has already been vetted. This is so because there’s no way to contractually communicate to a callee that a pointer has been vetted. This typically leads to duplicate checks for NULL, etc., throughout a call chain. If using references, the caller is contractually obligated to provide valid data. This removes the need for pointer checks in the callee.
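One way to picture the "vet once, then pass references" idea is sketched below. The names Request, handle, padded_length and body_length are hypothetical and exist only for this illustration; the point is merely where the single null check lives.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical payload type for the sketch.
struct Request {
    std::string body;
};

// Inner layers take references, so they carry no null checks:
// the signature itself says the caller must supply an object.
std::size_t body_length(const Request& r) { return r.body.size(); }
std::size_t padded_length(const Request& r) { return body_length(r) + 2; }

// Boundary function: the only place a possibly-null pointer is vetted.
// Returns false and leaves 'out' untouched when no request was supplied.
bool handle(const Request* maybe, std::size_t& out) {
    if (maybe == nullptr) {
        return false;
    }
    out = padded_length(*maybe);
    return true;
}
```

A caller would write something like `std::size_t n = 0; if (handle(ptr, n)) { ... }`. Everything below handle can then rely on the contract instead of re-checking at each level of the call chain.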
I’m also a big fan of explicit ownership/responsibility in function contracts. References, like the "const" keyword, tighten up the contract, and I use both judiciously.
> I’ve argued with a number of people (including instructors) who somehow think that references are not pointers but rather just introduce another name for an object.
Conceptually, these people are right. Technically, of course, references are always implemented as pointers.
> I believe it’s technically not guaranteed to work in C++ if the pointer is null.
Indeed.
int *p = 0;
int &r = *p;
This is undefined behaviour according to the standard. The case is even explicitly mentioned somewhere.
To amplify that last slightly: Using a reference signals that, absent a compiler bug, stack overflow, etc., NULL *cannot* be passed.
In fact, it also signals that (absent those conditions again) the reference will always "point to" a *valid* object (at the time of the call).
Pointers have none of those guarantees, and thus are easier to misuse, but passing NULL is a valuable signal to a function that an object is not applicable or invalid. Of course, you could always overload the function with a version that takes one fewer parameter and have the two functions call a private pointer-taking function, and at least get a guarantee that the pointer will *either* be NULL or valid.
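The overload arrangement described here might look roughly like the following. This is a sketch under assumptions: Job, Options, run and run_impl are invented names, and the "work" is reduced to returning a number so the behavior is easy to check.

```cpp
#include <cassert>

// Hypothetical settings type for the sketch.
struct Options {
    int verbosity;
};

class Job {
public:
    // Public overloads: callers either pass a real object or nothing.
    int run() { return run_impl(nullptr); }
    int run(const Options& opts) { return run_impl(&opts); }

private:
    // Reachable only through the overloads above, so 'opts' is either
    // null or the address of an object the caller bound to a reference.
    int run_impl(const Options* opts) {
        return opts ? opts->verbosity : 0;
    }
};
```

Callers write `job.run()` or `job.run(options)` and never see the pointer; the null case is still expressible internally, but only the class itself can produce it.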
^ Jesus… what people do and get paid, are there no quality standards today?
I agree. Anyone who can’t tell the difference between a "null" reference and a reference to an object located at address zero has no standards at all. ;)
"Null reference" and "pointer to object at address zero" are not synonyms, and there are machines where an object at address 0 is valid and usable.
But the typical use of that idea in code designed for an x86 machine and used to indicate a reference to an object that the developer knows is not valid is still a really bad idea.
Yeah, references should NEVER be NULL. I scream inside each time I see the code like this in one of the components we use:
extern void DoStuff(int a, int b, int c = 0, object &r = *(object *)NULL);
No, I am not kidding. They apparently REALLY wanted to add reference parameters to functions which already had optional parameters. This then requires that they check the address of the parameter later to see if it’s NULL.
"Conceptually, these people are right."
It may be a valid way to think about references in some contexts, but I think it’s dangerous. I could easily see it biting someone who doesn’t completely understand object lifetimes.
You forgot one very important and subtle difference:
void stupid_method2(Doer * const p)
{
p->Do();
}
Using a ref prevents the called routine from modifying the pointer.
Anony Moose:
Bjarne Stroustrup doesn’t agree with you in his FAQ (http://www.research.att.com/~bs/bs_faq2.html#null)
"Should I use NULL or 0?
In C++, the definition of NULL is 0, so there is only an aesthetic difference."
josh:
"It may be a valid way to think about references in some contexts, but I think it’s dangerous. I could easily see it biting someone who doesn’t completely understand object lifetimes."
I disagree. Dangling pointer vs. dangling reference <=> potato vs. potato. Are you saying that someone will think that holding a reference will keep the underlying object alive? Well, that someone must know its craft. No excuse for that in my book ;-)
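To make the lifetime point concrete without actually touching a dangling reference, the sketch below records destruction in a flag. Tracked and reference_keeps_object_alive are hypothetical names, and std::unique_ptr (C++11) is used here only as a convenient way to end the object's life on demand.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type: reports its own destruction through a flag
// that outlives it.
struct Tracked {
    bool& destroyed;
    explicit Tracked(bool& flag) : destroyed(flag) { destroyed = false; }
    ~Tracked() { destroyed = true; }
};

// Returns whether the object survived its owner letting go while a
// reference to it was still in scope. It does not survive.
bool reference_keeps_object_alive() {
    bool destroyed = false;
    std::unique_ptr<Tracked> owner(new Tracked(destroyed));
    Tracked& ref = *owner;  // a plain reference to the owned object
    (void)ref;              // deliberately never used after reset()
    owner.reset();          // the object is destroyed here
    return !destroyed;
}
```

The reference is still in scope after reset(), yet the destructor has already run; using ref at that point would be exactly as undefined as using a dangling pointer.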
To Thomas,
Good point, and in an ideal world one would use a smart pointer for this. Sadly, the question of which is the hard part.
Of the boost ones, only boost::shared_ptr allows transfer of ownership, and that’s a big fat smart pointer for simple cases.
Of the SCL ones, std::auto_ptr would be perfect if it wasn’t for the fact that it is useful for very little else. I have seen junior engineers copy its usage because it was used in this case and get it VERY wrong. Simply having it in the code can be dangerous if the uninitiated (most C++ programmers sadly) copy it.
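For what it is worth, C++11 later added std::unique_ptr, which covers this case (std::auto_ptr was deprecated in C++11 and removed in C++17). It was not available when this thread was written, so the following is only a sketch of how the "reference means borrow, smart pointer means transfer" convention could be spelled today; Widget, peek, consume and demo_transfer are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical type for the sketch.
struct Widget {
    int id;
};

// Borrow: a reference parameter, so no ownership changes hands.
int peek(const Widget& w) { return w.id; }

// Sink: taking the unique_ptr by value means the callee now owns the
// Widget and destroys it when the parameter goes out of scope.
int consume(std::unique_ptr<Widget> w) { return w->id; }

bool demo_transfer() {
    std::unique_ptr<Widget> w(new Widget{7});
    int borrowed = peek(*w);             // caller still owns the Widget
    int taken = consume(std::move(w));   // ownership moves to the callee
    // A moved-from unique_ptr is guaranteed to be null.
    return borrowed == 7 && taken == 7 && w == nullptr;
}
```

The explicit std::move at the call site makes the transfer visible to the reader, which addresses the worry about people copying the idiom without understanding it.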
To Michael Fitzpatrick:
The same const modifier may be applied to pointer. There is indeed a subtle and somewhat confusing difference. If you write
void ptr_method(const class SomeClass* c)
you may not modify the object:
c->ChangeMe(); //illegal
c = someOtherPointer; //OK
But
void ptr_method(class SomeClass* const c)
means that you may not modify the pointer:
c->ChangeMe(); //OK
c = someOtherPointer; //illegal
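The two placements of const can be checked mechanically with type traits. The sketch below assumes nothing about SomeClass beyond its name; the aliases and the two helper templates are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <type_traits>

class SomeClass;  // only the name is needed; the class is never defined

using PtrToConst = const SomeClass*;   // may not modify the object
using ConstPtr   = SomeClass* const;   // may not modify the pointer

// Is the pointer itself const?
template <typename P>
constexpr bool pointer_is_const() {
    return std::is_const<P>::value;
}

// Is the pointed-to type const?
template <typename P>
constexpr bool pointee_is_const() {
    return std::is_const<typename std::remove_pointer<P>::type>::value;
}

static_assert(pointee_is_const<PtrToConst>(), "object is read-only");
static_assert(!pointer_is_const<PtrToConst>(), "pointer can be reseated");
static_assert(pointer_is_const<ConstPtr>(), "pointer cannot be reseated");
static_assert(!pointee_is_const<ConstPtr>(), "object can be modified");

// A top-level const on a parameter is not part of the function type,
// so callers cannot tell the two declarations apart.
static_assert(std::is_same<void(SomeClass*), void(SomeClass* const)>::value,
              "top-level const does not change the signature");
```

The last assertion is worth noting: declaring the parameter as `Doer * const p` restricts only the body of the function, because the pointer is passed by value either way.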
+1 in "not weird" camp.
I’d say that the "Dereferencing an abstract object" question shouldn’t even be asked. If it’s abstract, and *p were assigned to an actual object, a constructor would have to be called, at which point the compiler would bark at the abstract members. If *p goes to a reference, it shouldn’t matter, to a well-versed C++-er, if it’s abstract.
Are you underestimating your audience, huh? I am collectively hurt! ;-)
The token "0" is null, but it’s not necessarily a representation of address zero. I don’t think the language even defines how absolute address values relate to pointers at all. I’m not sure why Anony Moose is talking about address zero though.
"Dangling pointer vs. dangling reference <=> potato vs. potato."
Yes, exactly. If you thought a reference is just another name for the object, you may not see that.
"Are you saying that someone will think that holding a reference will make underlying object alive?"
If they need a crutch to understand references because they don’t get pointers, I would not be surprised to see that happening.
Csaboka: It’s not just Bjarne that thinks that way. The C virtual machine (yes, there is one, it’s just *very* similar to the underlying hardware most of the time) specifies that an all-bits-zero pointer is equivalent to NULL. NULL in C is *ALWAYS* zero.
See the comp.lang.c FAQ, questions 5.5 and 5.13 (and others in section 5):
http://c-faq.com/null/machnon0.html
http://c-faq.com/null/varieties.html
A "null pointer constant" is any constant integral expression that evaluates to 0. Any null pointer constant when converted (except via reinterpret_cast) to a pointer type yields the "null pointer value" of that type (i.e. what most people usually mean when they say NULL).
What Anony Moose is talking about is the object representation of the pointer (which roughly means what address the pointer points to). The null pointer value isn’t guaranteed to point to address 0. So:
reinterpret_cast<T*>(0)
can do anything, but:
static_cast<T*>(0)
is guaranteed to evaluate to the null pointer value of type T* (and yes, the null pointer value for each pointer type isn’t guaranteed to point to the same address either).
Note: before you tell me I’m wrong because the standard explicitly says reinterpret_cast<T*>(0) results in the null pointer value, that was an obvious defect: http://www.open-std.org/JTC1/sc22/wg21/docs/cwg_defects.html#463
Tuesday, March 27, 2007 9:27 AM by BryanK
> The C virtual machine (yes, there is one,
> it’s just *very* similar to the underlying
> hardware most of the time) specifies that an
> all-bits-zero pointer is equivalent to NULL.
> NULL in C is *ALWAYS* zero.
Wrong. BryanK, see AND READ the comp.lang.c FAQ, questions 5.5 and 5.13, particularly 5.13:
http://c-faq.com/null/machnon0.html
http://c-faq.com/null/varieties.html
0 in a source program’s syntax turns into a null pointer constant at compile time, which can turn into null pointers as needed at compile time. The representations of null pointers in the execution environment don’t have to be all-bits-zero.
In fact some computer architectures (e.g. Intel) can provide hardware assistance to debug some fraction of unintended attempts to dereference null pointers (e.g. a scalar object of length 2 or more bytes) if a null pointer is represented by all-bits-one.
For practical purposes this fight was lost long ago, because only antisocial weirdo thermonuclear geeks were willing to learn that a 0 in syntax didn’t have to be all-bits-zero at execution time.
By the way the title of this thread would have been better as "Passing by pointer versus passing by reference, a puzzle". For practical purposes both pointers and references are "usually" addresses, but from the language’s point of view either or both could be represented differently.
Another difference also arises from a choice to use a pointer vs. a reference. In some cases use of a reference will automatically convert some argument to a temporary and use the temporary, but use of a pointer won’t.
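A small sketch of that temporary-conversion difference follows. The function names are hypothetical; the behavior relied on is that a reference to const double can bind to an int argument by way of a temporary double, whereas a `const double*` cannot be initialized from the address of an int at all.

```cpp
#include <cassert>

// Reports whether the two parameters refer to the same storage.
bool same_storage(const int& original, const double& maybe_temporary) {
    return static_cast<const void*>(&original) ==
           static_cast<const void*>(&maybe_temporary);
}

// Passing the same int for both parameters: the first binds directly,
// the second binds to a temporary double created by the conversion,
// so the two references name different objects.
bool reference_bound_to_original() {
    int i = 3;
    return same_storage(i, i);
}

// The pointer equivalent does not compile, so no temporary can sneak in:
//     int i = 3;
//     const double* p = &i;   // error: cannot convert int* to const double*
```

This is one reason the "a reference is just a pointer" model is incomplete: with a reference-to-const parameter the callee may be looking at a converted copy rather than the caller's object.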
Aw, crap. s/whose values is/whose value is/ — I hate it when I rewrite a sentence but don’t fix it properly.
You probably got thrown off by the "all-bits-zero" part, and yes, that was poorly worded. I should have said "a pointer whose values is the constant zero is equivalent to NULL."
The rest is right though — NULL in C (and by extension, C++) is *ALWAYS* zero. A NULL pointer will always compare equal to the constant zero (because in comparison context, the compiler can tell what kind of pointer it needs to use, so it can generate code that uses a nonzero bit pattern if it needs to), and assigning the constant value zero to a pointer will set its bits to whatever the real hardware uses for null pointers.
Wednesday, March 28, 2007 8:20 AM by BryanK
> I should have said "a pointer whose values is
> the constant zero is equivalent to NULL."
The C and C++ standards only make that guarantee for certain specified forms and only when they’re known to be constant at compile time.
For example:
int (*pf)() = NULL; // OK
int (*pf)() = 0; // OK
int (*pf)() = 3.5 - 3.5; // compilers can decide
int (*pf)() = (void*) 0; // OK
int (*pf)() = (void*)(void*) 0; // prohibited
> NULL in C (and by extension, C++) is *ALWAYS* zero.
If you mean at execution time, it is *NOT ALWAYS* (except when speaking practically as mentioned earlier, because of so many broken programs that have to be catered to). If you mean at compile time, then NULL is always something that the implementor knew would be equivalent to a compile-time zero of some sort, but that says nothing about its execution-time representation.
No, I didn’t mean at execution time, that was basically what I was trying to say in my previous post.
A pointer variable which is currently holding a null pointer will always compare equal to an unadorned constant zero, because the constant zero is interpreted in a pointer context (because it’s being compared to a pointer). But the actual bits of the pointer variable may not all be zero (if you print out an expression like *((int *)(&ptr)), or maybe even (int)ptr, you may not get zero).
So in that sense, you’re right, it’s not "always" zero. But if the programmer compares the pointer-variable-containing-a-null-pointer to a constant zero, the comparison will always succeed.
(I’m not sure why the "(void *)(void *) 0" expression is prohibited: Is it just because you’re doing a double-cast to the same type?)
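The distinction between "compares equal to zero" and "has all bits zero" can be probed with a short sketch. The function names are hypothetical. The first function relies only on what the language guarantees; the second inspects the object representation, and its result is implementation-specific, so no particular answer should be expected from it.

```cpp
#include <cassert>
#include <cstring>

// Guaranteed by the language: a null pointer compares equal to the
// null pointer constant 0, whatever its bit pattern is.
bool compares_equal_to_zero(const int* p) { return p == 0; }

// Implementation-specific: looks at the bytes that make up the pointer.
// On most current platforms a null pointer is all-bits-zero, but the
// language does not promise that.
bool representation_is_all_zero_bits(const int* p) {
    unsigned char bytes[sizeof p];
    std::memcpy(bytes, &p, sizeof p);
    for (unsigned char b : bytes) {
        if (b != 0) {
            return false;
        }
    }
    return true;
}

// Converting the null pointer constant 0 yields the null pointer value.
const int* make_null() { return static_cast<const int*>(0); }
```

Code that only ever asks compares_equal_to_zero-style questions is portable; code that memsets a struct to zero and expects the pointer members to be null is leaning on the second, unguaranteed property.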
Thursday, March 29, 2007 8:19 AM by BryanK
> (I’m not sure why the "(void *)(void *) 0"
> expression is prohibited: Is it just because
> you’re doing a double-cast to the same type?)
The redundant cast is perfectly legal. The standard doesn’t even mention the redundancy, there’s no problem with that.
The result is a value which is a null pointer, which is a constant, and which has type (void*). But that isn’t enough for the hypothetical usage which I put it to. My examples require null pointer constants. A null pointer constant has some magic features besides simply being a null pointer and constant.
For comparison again:
int *pi = (void*)(void*) 0; // legal in C
int (*pf)() = (void*)(void*) 0; // illegal in C
// (both are illegal in C++, I think)
Around 10 years ago I posted in comp.std.c about (void*)(void*) 0 not being a null pointer. dmr posted a followup saying he was planning to write something about nasal daemons, but he double-checked the standard before writing and he agreed.
Friday, March 30, 2007 4:49 AM by Norman Diamond
You know you’re mad when you start talking to yourself. Well let me assure you, you were right to be mad.
Thursday, March 29, 2007 10:25 PM by Norman Diamond
> Around 10 years ago I posted in comp.std.c
> about (void*)(void*) 0 not being a null
> pointer.
Idiot. You just finished explaining the difference between a value which only happens to be a null pointer and a constant, and a value which has the additional magic property of being a null pointer constant. And then here you screwed it up already.
Now get this. (void*)(void*) 0 IS a _null_pointer_. And it’s constant. What it isn’t, is that it isn’t a _null_pointer_constant_. That’s why dmr agreed with you.
Now we can only wonder why no one else tore you to shreds on this before I did. You sure deserve it.