Date: | May 27, 2005 / year-entry #131 |
Tags: | code |
Orig Link: | https://blogs.msdn.microsoft.com/oldnewthing/20050527-25/?p=35523 |
Comments: | 25 |
Summary: | Everyone "knows" that the following pairs of expressions are equivalent: x*2 ≡ x<<1 x/2 ≡ x>>1 Too bad they aren't. In the C language standard, there is no requirement that the internal representation of signed integers be two's complement. All the permissible representations agree for positive numbers, but negative numbers can have different representations. If... |
Everyone "knows" that the following pairs of expressions are equivalent:

x*2 ≡ x<<1
x/2 ≡ x>>1
Too bad they aren't.

In the C language standard, there is no requirement that the internal representation of signed integers be two's complement. All the permissible representations agree for positive numbers, but negative numbers can have different representations. If x is negative, then x*2 and x<<1 are quite different on a sign/magnitude system.

However, Win32 requires a two's complement machine, in which case the first equivalence does hold. Of course, the compiler is free to recognize this and rewrite your multiplication or shift operation. In fact, it is very likely to do this, because the rewritten form is often faster on the target processor.

As for the second so-called equivalence, the C language specification originally did not specify whether division of a negative number by a positive number rounds towards or away from zero, but in 1999, the specification was revised to require rounding towards zero. Furthermore, the result of a right-shift of a negative value is unspecified, so the expression x>>1 has an unspecified result if x is negative. Even if you assume that the shift fills with the sign bit, the result of the shift and the divide are different if x is negative: (-1)/2 is zero, but (-1)>>1 (on a machine that shifts in the sign bit) is -1.
The moral of the story is to write what you mean. If you want to divide by two, then write "/2", not ">>1".
Comments (25)
Comments are closed.
"The moral of the story is to write what you mean. If you want to divide by two, then write "/2", not ">>1". "
Unless you are concerned about performance. I can’t recall the URL now, but I read an article on MSDN which states that bit shifting is always faster than division/multiplication on the .NET Framework, and backed it up with stats.
If you are going to be dealing only with unsigned types (for example, if you are writing a chess program using bitboards), I figure you would prefer to write ">>1" and not "/2".
KC, I’d like to see that URL. I’d imagine that in general, bitshifting would be faster, but I have difficulty believing that they are any different when multiplying or dividing by constant powers of 2, as this is child’s play for the optimizer.
I’d imagine that this would be easy for lightweight JITers like EconoJIT as well.
I’ve been telling people this for years. Especially back when VB.Net had just come out – people were swearing up and down that they /needed/ to have shift operators so they could divide quickly.
At the time, VB.Net had /only/ signed integer types. Explanations that shifting and dividing were not and could not be the same operation in that language fell largely on deaf ears.
KC – if you write a multiplication by power of two and aren’t inhibiting optimizations (for example, running under the debugger), the JIT can and does optimize the multiplication into a bit shift (or an add).
Turning division into a bitshift is a little trickier, but the optimizer does it – it’s basically a bitshift with some extra code to fix up the rounding if the dividend is negative. Not quite as fast as a raw bitshift, but you get the right answer, so that’s always good.
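The fixup described above can be sketched in C. This is an illustrative reimplementation, not the JIT's actual output; it assumes a 32-bit two's complement int and an arithmetic (sign-bit-filling) right shift of signed values, which is implementation-defined:

```c
#include <stdint.h>

/* Sketch of the optimizer's fixup for signed division by 2^shift.
   Negative dividends get a bias of (2^shift - 1) added before the
   shift, so the result rounds toward zero like C99 '/' does,
   instead of toward negative infinity like a raw arithmetic shift. */
int32_t div_pow2(int32_t x, int shift)
{
    /* x >> 31 is -1 (all ones) for negative x, 0 otherwise,
       assuming arithmetic shift. */
    int32_t bias = (x >> 31) & ((INT32_C(1) << shift) - 1);
    return (x + bias) >> shift;
}
```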
Nate, I found the URL. Here it is:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dndotnet/html/fastmanagedcode.asp
Have a gander at Table 1.
I didn’t know the compiler will optimize the code to a bit shift if you are dividing by powers of two. Well, you learn something new everyday.
KC – that table does show that shifting is faster than multiplication (though not by nearly as large a margin as many people think). However, that table is comparing multiplication that actually gets compiled as multiplication against shifting.
When the JIT recognizes a constant multiplication and compiles it as a shift, it would no longer fall under multiplication in that table.
Even the compiler I wrote for my compiler design class in college supported constant folding. It’s pretty easy to implement (although if you combine constant folding and common subexpression elimination it can get hairy).
I’d be surprised if any compiler didn’t support it these days (except for naive implementations of PCC (assuming that PCC’s still around)).
Oh, and KC, that article you’re quoting just says that shift is faster than divide.
You’re assuming that you’re smarter than the compiler when you make statements like that.
And you’re not. There are TIMES when a human can write better code than a compiler can. But these days, I’m not so sure about it – the rules for writing truly efficient code are sufficiently squirrelly that hand coding in assembly is almost ALWAYS a performance loss.
The days when mere mortals were able to keep all the ramifications of code in memory are long gone.
If you thought it was hard keeping track of what values were in 6 general purpose registers, consider what happens when you have 128…
One rule of optimization is that if you could possibly write a reasonable Perl script to make the optimization on the source code, chances are that the compiler already makes that optimization behind your back.
Changing ‘x *= 2’ to ‘x <<= 1’ certainly falls in that category.
MSVC also does a neat optimization trick when dividing by a constant. E.g. "a/10" compiles into the following ("a" is a signed int):
mov ecx, DWORD PTR _a$[esp+4]
mov eax, 66666667h
imul ecx
sar edx, 2
mov eax, edx
shr eax, 1Fh
add edx, eax
For unsigned case it’s a bit simpler:
mov eax, 0CCCCCCCDh
mul DWORD PTR _a$[esp+4]
shr edx, 3
The trick here is that a/b = (2^N/b) * (a/2^N). As b is constant, the first part can be folded at compile time, and the second part is replaced by shifts.
roxfan – the more general explanation for what they are doing there is they are basically doing a fixed point multiplication of "a" by "1/b". 2^N is "1" in M.N fixed point, so "2^N/b" is "1/b". Multiply by "a", and then the "/2^N" converts from fixed point back to integer by dropping the N fractional bits from the result (rounding down).
I have to wonder why they don’t use 32.32 fixed point, as they could completely eliminate the shift (the top 32 bits of a multiplication are just "edx") – there must be rounding effects that prevent 32.32 fixed point from being accurate in all cases.
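The unsigned case shown above can be checked directly from C. A sketch, assuming 32-bit unsigned arithmetic: 0xCCCCCCCD is ceil(2^35/10), and the right shift by 35 corresponds to keeping the high half of the 64-bit product (edx) and then doing "shr edx, 3":

```c
#include <stdint.h>

/* Reciprocal-multiplication division by 10, as in the compiler
   output quoted above: multiply by ceil(2^35/10) in 64-bit
   arithmetic, then drop the 35 fractional bits. Exact for
   every 32-bit unsigned value. */
uint32_t div10(uint32_t a)
{
    return (uint32_t)(((uint64_t)a * 0xCCCCCCCDull) >> 35);
}
```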
(If you are going to be dealing only with unsigned types and for eg you are writing a chess program using bitboards, I figure you would prefer to write ">>1" and not "/2")
Exactly. In that case, you *mean* to shift bits around, not divide. Whoever heard of dividing chess games?
KC: You should nevertheless "write what you mean" until you discover that THAT particular instruction really makes a measurable impact on the performance of your application, which is almost certainly *never* going to happen.
Actually, the standard says nothing about left shifting a signed integer (some machines trap under certain conditions) so you can argue whether left shifting a signed integer is actually undefined or not. Right shifting a signed integer is implementation defined, not unspecified (implementation defined is the same thing as unspecified except they’re required to document what happens) so a compiler writer may or may not decide to implement it using sign extension even on 2s complement. So, my suggestion is to never use bit operations on signed integers (and I’ll even extend this to requiring both operands being unsigned so you don’t trigger a promotion to a signed integer type under certain conditions).
err I meant right shifting a negative value is implementation defined.
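The promotion hazard mentioned at the end of that comment can be sketched as follows, with a hypothetical helper introduced purely for illustration: a uint16_t operand is promoted to (signed) int before a shift, so shifting a set bit into the sign position is undefined behavior in C99; casting to a wide unsigned type first keeps the operation unsigned throughout.

```c
#include <stdint.h>

/* Hypothetical helper for illustration: place a 16-bit value in the
   top half of a 32-bit word. Without the cast, 'h << 16' operates on
   h promoted to (signed) int, and for h >= 0x8000 that shifts a 1
   into the sign bit, which is undefined behavior in C99. The cast
   makes the shift an unsigned operation. */
uint32_t high_halfword(uint16_t h)
{
    return (uint32_t)h << 16;  /* cast BEFORE shifting */
}
```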
Let’s not go too far in disparaging the effectiveness of hand optimization in assembly.
In fact, speaking as someone who does this kind of thing, one of the reasons to avoid these premature optimizations is the possibility that the code will be assembly-optimized in the future.
Whoever does that work is going to want you to use the more descriptive option, because it will provide them with a better understanding of the code. This high level of understanding is what allows programmers to write better code than compilers.
I think the real moral of the story is: "Being shifty is not the same as being divisive." (Sorry to bring up the last election.)
I sense that the AutoCad programmers that didn’t even *check* the icon for their Add/Remove entry probably berated their junior programmers for not replacing small power-of-2 divides by shifts. My 3GHz CPU divides a lot better than it displays non-existent icons.
> If x is negative, then x*2 and x<<1 are
> quite different on a sign/magnitude system.
I don’t see it. The closest thing I can think of is that a broken compiler generates a machine instruction for logical (or unsigned) shift left instead of arithmetic (or signed) shift left, but that would break programs that had valid code to do arithmetic left shifting of positive values. The second closest thing I can think of is overflow, where the undefined results of one overflow don’t have to match the undefined results of a different overflow.
Did you mean that x*2 and x<<1 have different results on a one’s complement machine? In that case I’d have to study the standard for a while. I’m not quite inclined to waste time on that though, because the 1989 standard was already broken in its handling of one’s complement and last I saw the 1999 standard didn’t fix it. The %c format specifier writes or reads an object of any character type in a way where the library cannot figure out which character type it was, so the library cannot figure out whether or not the value was a one’s complement negative value that needs an adjustment to be applied.
Regarding comments on writing assembly code, there are essentially two situations where it should be done. One is when there isn’t a compiler that will produce the necessary code (e.g. when a human reads a manual and figures out which subset of the registers should be saved at the beginning of an interrupt handler). The other is when the compiler generated correct code but you can look at it and see a necessary improvement.
As an example of the latter case, for a garbage collection routine I had to do a quicksort before moving live objects, and it was much better to monopolize the CPU for 0.25 seconds instead of 1.00 seconds. This was on an industrial system which wasn’t going to respond to external events during that time. (Sure there were other ways of avoiding monopolizing the CPU, equivalent to a Visual Basic program calling DoEvents in a loop. Then the thing would have spread its 1.00 second of CPU time across an elapsed time of 20 minutes more or less, which also wasn’t acceptable.)
> If x is negative, then x*2 and x<<1 are
> quite different on a sign/magnitude system.
: I don’t see it.
Consider x = -1
x = 0x80000001
x << 1 = 0x00000002 = +2
but x * 2 = -2 = 0x80000002.
The C standard states (section 6.5.7) that the result of "x << y" is undefined if x < 0. The compiler is not broken.