Why do operations on “byte” result in “int”?

Date: March 10, 2004 / year-entry #92 Tags: history Orig Link: https://blogs.msdn.microsoft.com/oldnewthing/20040310-00/?p=40323 Comments: 38

(The following discussion applies equally to C/C++/C#, so I'll use C#, since I talk about it so rarely.)

People complain that the following code elicits an error:

```
byte b = 32;
byte c = ~b;
// error CS0029: Cannot implicitly convert type 'int' to 'byte'
```

"The result of an operation on 'byte' should be another 'byte', not an 'int'," they claim.

Be careful what you ask for. You might not like it.

Suppose we lived in a fantasy world where operations on 'byte' resulted in 'byte'.

```
byte b = 32;
byte c = 240;
int i = b + c; // what is i?
```

In this fantasy world, the value of i would be 16! Why? Because the two operands to the + operator are both bytes, so the sum "b+c" is computed as a byte: the true sum 272 does not fit in eight bits, and what remains after integer overflow is 272 - 256 = 16. (And, as I noted earlier, integer overflow is the new security attack vector.)

Similarly,

```
int j = -b;
```

would result in j having the value 224 and not -32, for the same reason.

Is that really what you want?

Consider the following more subtle scenario:

```
struct Results {
    public byte Wins;
    public byte Games;
};

bool WinningAverage(Results captain, Results cocaptain)
{
    return (captain.Wins + cocaptain.Wins) >=
           (captain.Games + cocaptain.Games) / 2;
}
```

In our imaginary world, this code would return incorrect results once the total number of games played exceeded 255. To fix it, you would have to insert annoying int casts.

```
    return ((int)captain.Wins + cocaptain.Wins) >=
           ((int)captain.Games + cocaptain.Games) / 2;
```

So no matter how you slice it, you're going to have to insert annoying casts. Better to have the language err on the side of safety (forcing you to insert the casts where you know that overflow is not an issue) than to err on the side of silence (where you may not notice the missing casts until your Payroll department asks you why their books don't add up at the end of the month).
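The numbers quoted above can be reproduced in C, which promotes in the same way. The following is only a sketch: the "fantasy world" is emulated by truncating each result back to 8 bits with an explicit cast, and every function name here (`sum_promoted`, `sum_as_byte`, `winning_average_as_byte`, and so on) is made up for illustration rather than taken from the article.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical C counterpart of the article's Results struct. */
typedef struct {
    uint8_t wins;
    uint8_t games;
} Results;

/* Real C: both operands are promoted to int before the addition. */
int sum_promoted(uint8_t b, uint8_t c) { return b + c; }

/* Emulated fantasy world: the sum is cut back to 8 bits before widening. */
int sum_as_byte(uint8_t b, uint8_t c) { return (uint8_t)(b + c); }

/* Real C: b is promoted to int, so the negation is an ordinary int negation. */
int negate_promoted(uint8_t b) { return -b; }

/* Emulated fantasy world: the negated value wraps around modulo 256. */
int negate_as_byte(uint8_t b) { return (uint8_t)-b; }

/* The article's comparison, computed in int thanks to promotion. */
bool winning_average(Results captain, Results cocaptain) {
    return (captain.wins + cocaptain.wins) >=
           (captain.games + cocaptain.games) / 2;
}

/* The same comparison with both totals truncated to 8 bits first. */
bool winning_average_as_byte(Results captain, Results cocaptain) {
    uint8_t wins = (uint8_t)(captain.wins + cocaptain.wins);
    uint8_t games = (uint8_t)(captain.games + cocaptain.games);
    return wins >= games / 2;
}

/* Self-check of the values quoted in the article; call it from main(). */
void check_article_values(void) {
    Results captain = {50, 200};
    Results cocaptain = {40, 200};

    assert(sum_promoted(32, 240) == 272);
    assert(sum_as_byte(32, 240) == 16);
    assert(negate_promoted(32) == -32);
    assert(negate_as_byte(32) == 224);

    /* 90 wins out of 400 games is not a winning average... */
    assert(!winning_average(captain, cocaptain));
    /* ...but 400 truncates to 144, so the byte-only version says it is. */
    assert(winning_average_as_byte(captain, cocaptain));
}
```

Calling `check_article_values()` from a `main` function exercises every case; the last assertion is the Payroll-style surprise, where 90 wins in 400 games passes for a winning record because the game count wrapped.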

Comments (38)

Cooney says:
> int j = -b;
>
> would result in j having the value 224 and not -32, for the same reason.
>
> Is that really what you want?

YES! Oh god yes! I so hate dealing with the abomination that is signed bytes in Java that I could spit. Having well-behaved bytes is a godsend.

> struct Results { byte Wins; byte Games; };

Bytes are for screwing around with bytes, not basketball scores. Use shorts if you care.

Raymond Chen says:
Okay, then how about this:

    struct Packet {
        byte Size;
        …
    };

    int SizeRequired(Packet packet1, Packet packet2, Packet packet3)
    {
        return packet1.Size + packet2.Size + packet3.Size;
    }

If you use this to compute the amount of memory required to store three packets, you just earned yourself a buffer overflow.

Matt says:
Um… if I declare my variables as bytes, then yes, I would expect overflow when performing arithmetic that uses the ninth (or sign) bit. I don’t expect the C compiler to protect me from my own dumb mistakes. This seems very "anti-C-ish" to me. But as always in Raymond’s posts, very interesting.

John Brown says:
I never expected, or even wanted, a compiler to protect me from my own dumb mistakes. What, me worry?! I thought I had it all figured out. Then I learned my lesson. Perhaps you can benefit from my experience. I made a dumb mistake. The C compiler didn’t complain. It figured I knew what I was doing, or more likely that if I was a fool, I deserved what I got. The Java compiler voiced its concerns: "She’s gonna turn into her mother, pal. You’ll live to regret this. You know what they say about buying a cow, don’t you? And speaking of cows, LOOK AT HER MOTHER!" But I didn’t listen. I just didn’t listen.

Serge Wautier says:
I completely second Matt.

> int j = -b;

If you end up negating an unsigned value, I’d say the error is more probably in the design than in the code.

> byte c = ~b;

OK, so what’s the problem with keeping a byte here?

> return packet1.Size + packet2.Size + packet3.Size;
>
> <…> you just earned yourself a buffer overflow.

So did you if Packet.Size is an int and the sizes are large enough. Less likely to happen? Maybe. But these rules were designed in the C language at a time when most ints were 16 bits. int overflow in arithmetic was way more usual at that time.

I discovered this rule (how is it named? integer promotion?) some day when I decided that I wanted a new piece of code to compile at warning level 4 with no warnings allowed (warnings = errors). Having to embed nasty casts in basic statements was horrible: code less readable, BIG comment required in each such statement to explain why the casts are there and must stay there,…

Eyal says:
Well, I live in the fantasy world of Delphi and it seems to work OK there:

    procedure Test;
    var
      a,b,c: Byte;
      I: Integer;
    begin
      b := 32;
      c := 240;
      i := b+c; // i= 272
      a := b+c; // a= 16
    end;

If there’s no explicit type then int will be used, i.e.:

    if (a+b)>200 // will return True

Raymond Chen says:
This appears to be a different fantasy world, one where int-to-byte truncation does not raise an error.

Jerry Pisk says:
Raymond, by your logic int should automatically be promoted to long (64-bit int) if the calculation would result in an overflow. I must absolutely agree with everybody, if I use a byte then I want the compiler to work with byte. Period.

Raymond Chen says:
Okay well it looks like I lose this round.

Mike Dimmick says:
C and C++ promote to int (the ‘natural’ size of the platform) because it’s more efficient to compute in integers. C++ has a slight difference from C in that the type of a character literal (e.g. ‘a’) is char – in C, it’s int. Microsoft’s 64-bit C/C++ compilers break this ‘natural size’ recommendation: long and int are only 32 bits. Frankly, using a byte rather than an int is pointless – you won’t save any memory due to the compiler’s packing anyway.
If you alter the packing, you may get masking and shifting operations under the covers – and you must remember to align your data appropriately, or you’ll get unaligned accesses (which is slower or raises exceptions on some processors).

B.Y. says:
I think the rule should be: promote integer values to the highest integer level in the expression. So:

    byte a,b;
    int i;
    a=~b;  //no warning
    i=a-b; //no warning
    b=i;   //warning

I thought this is what C++ does.

Alex Pavloff says:
I made myself a little integer promotion table some years back. MSDN has the complete set of rules: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vclang/html/_pluslang_arithmetic_conversions.asp

I’m a C++ programmer. I can’t tell you how annoyed I get when I’ve got to use VB for any operation that requires some precision in the size and behavior of integers. And while we’re here, can someone PLEASE tell me why VB _still_ doesn’t support Signed Bytes, Unsigned Words, and Unsigned Double Words?

Jonathan Payne says:
I was a little surprised by the warnings this code generated:

    typedef char BYTE;
    BYTE a = 0;
    BYTE b = 0;
    a += b;    // warning C4244: ‘+=’ : conversion from ‘int’ to ‘BYTE’
    a = a + b; // no warning

Why does ‘+=’ differ from ‘+’ and ‘=’ for simple types? If anything, I would expect the second statement to promote ‘a + b’ to an int and then generate a warning at the assign and the first statement to compile OK as a and b are the same type and the statement would resolve to the internal equivalent of ‘char operator+=(char)’.

Nathan Downey says:
I actually don’t mind this all that much. It doesn’t prevent you from doing anything, it just forces you to be explicit about your intentions. Now, should it be the job of the compiler or runtime to enforce explicitness when it encounters suspect code? I think it depends on the situation. C# has saved my ass a couple of times….
take this trivial example:

    int one = 1;
    if (one = 1) { return; }

    COMPILE ERROR: C:sandboxoutRefTestClass1.cs(18): Cannot implicitly convert type ‘int’ to ‘bool’

You could make the argument that a *good* developer would use something like this:

    if (1 == one) { return; }

All you’re doing here is making the expression explicit. Stating your intentions. Even good developers make mistakes, and mediocre developers make lots of them. I don’t see anything wrong with the compiler asking the developer, "Are you sure you want to commit this act of stupidity?"

Richard Tallent says:
Overloading the addition operator "+" with an append/concatenate operation is wrong. Addition operations are commutative, append is not. VB.NET solves this by having a proper concatenate operator ("&"), but the C* world is left with ambiguity.

Maks Verver says:
If implicit conversion is so great, why does it only work with operands of a smaller size than int? The following example is ‘broken’:

    int x = 2147483647, y = 2147483647;
    long long int z = x + y;

z becomes -2 instead of 4294967294 and a cast on x is required to make it work as intended. It’s nice that you can save yourself a few casts here and there with the implicit conversion to int and it probably does prevent a few bugs you wouldn’t have thought about, but the implicit conversion is just a partial fix to a much broader (potential) problem. After all, even something as simple and common as an integer addition possibly overflows. Surely, you can’t expect the compiler (and definitely not a C compiler) to ensure that all result types are large enough to completely avoid overflow, as this would incur a significant performance overhead on integral types that exceed the ‘native’ size of integers (usually 32 or 64 bits).

sd says:
"Even good developers make mistakes, and mediocre developer make lots of them. I don’t see anything wrong with the compiler asking the developer."

The problem is that it’s inconsistent with other examples in the language.
Serge Wautier pointed out that the same problems can occur if you sum two ints. The compiler doesn’t complain about that though, does it?

Sean Terry says:
I had posed a question about this behaviour to Eric Gunnerson a few weeks ago and I felt the same way many people here do… that I want a byte to act like a byte. http://blogs.msdn.com/ericgu/archive/2004/02/02/66345.aspx

I have "accepted" this kind of behaviour, and have moved all my bitflags to ints instead of bytes for the sake of not polluting the readability of my code with gobs of explicit conversions. Is there a happy medium here? As a developer who on occasion needs to perform some good-ole-fashioned binary manipulation and just drop the frigging 9th/17th/33rd bit… I would like a compiler option to allow such implicit conversions. I’ve had to explain this to co-workers four times in the last few weeks.

Them: Why are you using an int for that? Just use a byte.
Me: The C# compiler won’t let me.
Them: *boggle*

Raymond Chen says:
And other people would say, "Why are C#’s integer promotion rules different from C and C++ on such a fundamental issue?" Basically no matter what decision you make, somebody will be upset.

JCAB says:
I agree you have made a (obviously debatable) point with the arithmetic operations. Nevertheless, bitwise operations (like your initial example of bit-negation) are not causes of overflow, so I don’t see a reason at all, except to keep all operators working in exactly the same manner, no matter how unrelated to each other they are.

Raymond Chen says:
True, but imagine the confusion if you make bit operators have different promotion rules from arithmetic operations! You would have

    byte b;
    b & 0x1F; // result is "byte"?
    b % 0x20; // result is "int"?
    b << 1;   // result is "byte"?
    b * 2;    // result is "int"?
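For comparison, C gives all four of those expressions the same answer: each one has type int. The sketch below checks this with a C11 `_Generic` selection; the `TYPE_NAME` macro and the `check_operator_types` function are hypothetical helpers written for this illustration, and the sketch assumes a C11 (or later) compiler.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: names the type of an expression after promotion.
   _Generic does not evaluate the expression, it only inspects its type. */
#define TYPE_NAME(expr) _Generic((expr),   \
    unsigned char: "unsigned char",        \
    int: "int",                            \
    unsigned int: "unsigned int",          \
    default: "other")

/* Self-check: every operator below promotes its byte operand to int. */
void check_operator_types(void) {
    unsigned char b = 32;

    assert(strcmp(TYPE_NAME(b), "unsigned char") == 0);
    assert(strcmp(TYPE_NAME(b & 0x1F), "int") == 0);
    assert(strcmp(TYPE_NAME(b % 0x20), "int") == 0);
    assert(strcmp(TYPE_NAME(b << 1), "int") == 0);
    assert(strcmp(TYPE_NAME(b * 2), "int") == 0);
    assert(strcmp(TYPE_NAME(~b), "int") == 0);

    /* ~b is computed on the promoted int, so it is negative... */
    assert(~b == -33);
    /* ...and an explicit cast is what brings it back into byte range. */
    assert((unsigned char)~b == 223);
}
```

The last two assertions tie back to the article's opening example: the complement of a byte is an int with all the upper bits set, and the cast is where the programmer says that dropping those bits is intended.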
Dmitriy Zaslavskiy says:
Raymond, I think one of the original reasons for promotions was that not all hardware can perform ops on values smaller than the "natural size", and the designers didn’t want to hide that fact. Just a speculation.

Cooney says:
bookending the discussion ;)

Raymond sez:
> Okay, then how about this:
>
> struct Packet { byte Size; … };

If the packets are known structs, then I’ll use defines. If they’re datapackets, I’ll use shorts and write network pack/unpack routines for each packet. Bytes are bytes and ints are ints – they should be separate.

Factory says:
Hmm, I thought that C’s desire to use int as much as possible was from its parent language of BCPL, in which all variables were of the same size.

Wilhelm Svenselius says:
Actually, there is a language in the "C* world" (unless you interpret that to only mean languages that begin with C – I interpret it as languages in the same "family") which has a proper operator for appending strings – PHP with operator ‘.’ (period). Very nice and useful.

    $some_var = "Hello ".$name.", how are you today?";

Then again, in PHP you can just do

    $some_var = "Hello $name, how are you today?";

but I try to avoid this form, it makes code harder to read IMO, and it’s not practical for all cases.

project says:
> In this fantasy world, the value of i would be 16!

I’d like to be in this world! Nobody claims that in this code d equals to 1:

    int a=3;
    int b=2;
    double d=a/b;

> Similarly,
>
> int j = -b;
>
> would result in j having the value 224 and not -32, for the same reason.

If b is unsigned, 224 seems quite reasonable.

> Is that really what you want?

YES

projects says:
Sorry. I meant "nobody _complains_ that d equals to 1"

Andrew Shuttlewood says:
Not that I disagree with the design per se, but isn’t this rather an interesting demonstration of inconsistency?

    int a,b,c;
    a = Integer.MAX_VALUE;
    b = Integer.MAX_VALUE;
    c = a + b;

Why isn’t the result of a + b a long? And similarly upwards?
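Since promotion stops at int, adding two ints is the programmer's problem, as this question and Maks Verver's earlier example point out. In C the situation is stricter still, because signed int overflow is undefined behavior rather than a guaranteed wraparound. Here is a sketch of the two usual remedies, widening one operand before the addition or testing for overflow before performing it. The function names are hypothetical, and the sketch assumes that long long is wider than int, which the `static_assert` verifies at compile time.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

static_assert(sizeof(long long) > sizeof(int),
              "sketch assumes long long is wider than int");

/* Widen one operand first so the whole addition is done in long long. */
long long add_widened(int x, int y) {
    return (long long)x + y;
}

/* Test before adding: returns false and leaves *result untouched
   when x + y would not fit in an int. */
bool add_checked(int x, int y, int *result) {
    if ((y > 0 && x > INT_MAX - y) || (y < 0 && x < INT_MIN - y)) {
        return false;
    }
    *result = x + y;
    return true;
}

/* Self-check of both helpers at the edges of the int range. */
void check_int_addition(void) {
    int r = 0;

    assert(add_widened(INT_MAX, INT_MAX) == 2LL * INT_MAX);
    assert(add_checked(1, 2, &r) && r == 3);
    assert(!add_checked(INT_MAX, 1, &r));
    assert(!add_checked(INT_MIN, -1, &r));
    assert(r == 3); /* failed calls did not overwrite the result */
}
```

With a 32-bit int, `add_widened(2147483647, 2147483647)` yields 4294967294, the value Maks Verver expected, because the cast is applied before the addition rather than after it.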
Raymond Chen says:
Okay already – I get the point. You folks want C# to have different integer promotion rules from C and C++. I personally think that’s nuts – if you’re designing a language, you don’t want to make it gratuitously different from the language people are coming from. But then again, it looks like I’m in the minority.

Andrew Shuttlewood says:
Actually, I think it would be more interesting if C# fixed the consistency the other way maybe. As you say, integer overflow is the current attack vector of choice, yet in Java (not really played with C# yet, I don’t see how I check if an int has overflowed. Or a long, or ….). (obviously you could cludge a check). Maybe as a community we should bite the bullet and use more complex types to represent integers in order to prevent integer overflow issues.

Java made the decision to eliminate unsigned types, and although done for other reasons, it almost seems prescient – how many bugs or casts are there when people are casting to unsigned types. The flaw discovered in IE5 with BMPs was exactly the result of casting a signed integer to an unsigned one. By preventing the issue, it protects you somewhat.

Anyway, I dunno, I imagine it would be hard to guarantee the type of safety that people want and still get the sort of performance that they want, but it would be interesting to try.

MilesArcher says:
Raymond, then why did they make VBdotnet gratuitously different from VB 6?

Raymond Chen says:
You’ll have to ask them.

Florian says:
In my fantasy world, when writing

    byte b = 32;
    byte c = 240;
    int i = b + c;

then i would be 272, and with

    int j = -b;

j would be -32. But

    byte a = ~b;

would still not be an error. Why? Because

    byte a = b + c;

would result in a being 16 indeed. In my fantasy world the compiler would take the time to look at the type that the result is expected to be and do the computation accordingly. If I want a byte result when adding bytes I don’t need no integer promotion.
If the result is an int, then yes, I want the integer promotion. My choice as a programmer. And in the cases where the compiler can’t know what type I expected the result to be (as in: if ( a+b > 250 ) ) then I want to get a warning or an error which will save me from my possible mistake.

Raymond Chen says:
(Note that the style for C-family languages is that the type of an expression is not context-sensitive.)

Norman Diamond says:
Traditionally, if one wanted to get correct results without suffering from overflow, one used a language that had bignums. If one wanted to get performance at reasonable (such as it was then) speed or cost, then one used a language somewhat closer to assembly language and wrote additional code only when deemed necessary to check for overflows. Traditionally languages with bignums were unpopular because of their expense, both in the bignums themselves and in their dependence on frivolities such as garbage collection, and because of their ugly syntax.

When object orientation was invented, some people felt that its syntax should look like Smalltalk so people reading it could be sure that they weren’t reading Fortran or Lisp or C. But then lo and behold, someone gradually built up an almost object oriented language by bits and pieces, with a syntax based on C. (And someone else did the same to Lisp.)

C# has syntax less ugly than Lisp, while it does have garbage collection. Starting from this base, it would have been useful to make it a higher level language in other ways as well, such as avoiding overflow and using bignum as a standard type. By being the mixture that it is, C# is still a language, it just isn’t serving any really useful purpose.

Peter Evans says:
I have to agree that "byte" operations on "byte" declared operands should result in either "byte" type or throw typical number system errors such as overflow or underflow from C# or other CLS compliant compilers.
IMO C# was designed for making code express conversions explicitly and for having its type system enforce consistency. I find the arbitrary promotion rules in C/C++ cumbersome for today’s programmer who may never have to write device drivers or highly efficient bit codecs. Yes, new C# application programs can be blind to some of this low level understanding, but why force them to think in these low-level terms. I believe C# has an optimal performance with its strict type system for its intended audience. However, I am expecting the upcoming CLI/C++ mapping to fill the need Raymond desires for the traditional type promotion rules intermixed with a compliant compilation.

asdf says:
I think C# should have used saturated arithmetic with no promotion of types (unless both sides of the operator are different) but it’s way too late now.

Raymond Chen says:
Commenting on this entry has been closed.
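The saturated arithmetic suggested in the last comment can be sketched in C for the byte case. This is an illustration only, not something C or C# provides for its built-in integer types: the function names are hypothetical, and the helpers lean on the very promotion the thread argues about, computing in int first and then clamping to the 0..255 range.

```c
#include <assert.h>
#include <stdint.h>

/* Add two bytes, clamping at 255 instead of wrapping around. */
uint8_t add_saturated_u8(uint8_t a, uint8_t b) {
    int sum = a + b; /* promoted to int, so this cannot overflow */
    return (uint8_t)(sum > UINT8_MAX ? UINT8_MAX : sum);
}

/* Subtract two bytes, clamping at 0 instead of wrapping around. */
uint8_t sub_saturated_u8(uint8_t a, uint8_t b) {
    int diff = a - b; /* promoted to int, so this may be negative */
    return (uint8_t)(diff < 0 ? 0 : diff);
}

/* Self-check using the article's operands, 32 and 240. */
void check_saturation(void) {
    assert(add_saturated_u8(32, 240) == 255); /* wrapping would give 16 */
    assert(add_saturated_u8(100, 100) == 200);
    assert(sub_saturated_u8(32, 240) == 0);   /* wrapping would give 48 */
    assert(sub_saturated_u8(240, 32) == 208);
}
```

Saturation trades one surprise for another: 32 + 240 no longer collapses to 16, but it still is not 272, so a byte-only WinningAverage from the article would go wrong in a different way once the totals reached the ceiling.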

*DISCLAIMER: I DO NOT OWN THIS CONTENT. If you are the owner and would like it removed, please contact me. The content herein is an archived reproduction of entries from Raymond Chen's "Old New Thing" Blog (most recent link is here). It may have slight formatting modifications for consistency and to improve readability.
