Date: | August 10, 2009 / year-entry #249 |
Tags: | code |
Orig Link: | https://blogs.msdn.microsoft.com/oldnewthing/20090810-00/?p=17163 |
Comments: | 19 |
Summary: | Welcome to CLR Week 2009. As always, we start with a warm-up. The String.Format method doesn't throw a FormatException if you pass too many parameters, but it does if you pass too few. Why the asymmetry? Well, this is the type of asymmetry you see in the world a lot. You need a ticket for... |
Welcome to CLR Week 2009. As always, we start with a warm-up.
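The String.Format method doesn't throw a FormatException
if you pass too many parameters, but it does if you pass too few.
Why the asymmetry?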
Well,
this is the type of asymmetry you see in the world a lot.
You need a ticket for each person that attends a concert.
If you have too few tickets, they won't let you in.
If you have too many,
well, that's a bit wasteful, but you can still get in;
the extras are ignored.
If you create an array with 10 elements and use only the first five,
nobody is going to raise an exception.
Besides, you probably don't want this to be an error:
    if (verbose) {
        format = "{0} is not {1} (because of {2})";
    } else {
        format = "{0} not {1}";
    }
    String.Format(format, "Zero", "One", "Two");
Think of the format string as a |
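A minimal sketch of the asymmetry, assuming an ordinary console program. The names AsymmetryDemo and TryFormat are made up for illustration; only String.Format and FormatException come from the framework.

```csharp
using System;

public static class AsymmetryDemo
{
    // Hypothetical helper: returns the formatted string, or the text
    // "FormatException" if String.Format rejects the call.
    public static string TryFormat(string format, params object[] args)
    {
        try
        {
            return String.Format(format, args);
        }
        catch (FormatException)
        {
            return "FormatException";
        }
    }

    public static void Main()
    {
        // Too many arguments: "Two" is never referenced and is ignored.
        Console.WriteLine(TryFormat("{0} not {1}", "Zero", "One", "Two"));

        // Too few arguments: {2} has nothing to refer to.
        Console.WriteLine(TryFormat("{0} is not {1} (because of {2})", "Zero", "One"));
    }
}
```

The first call produces "Zero not One"; the second ends up in the catch block because the format string asks for an argument that was never supplied.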
Comments (19)
Can we supply any number of extra params to that function? Say 50 or 5k?
Theoretical, useless question :)
"Can we supply any number of extra params to that function? Say 50 or 5k?"
That would make for WTF-laden code, but I bet some "enterprisey" code somewhere does it, for something like making a huge string from an array of strings (without using StringBuilder):
Pass format string created in #1 and array to string.Format.
Disgusting indeed, but I had to think of it.
@Santosh, seems like there is a limit:
http://stackoverflow.com/questions/561020/string-format-parameters
I’m still testing this.
Arun
It definitely shouldn’t throw an exception, but since I’d guess that more than 90% of the time such a mismatch is an oversight by the programmer it might be nice if the C# compiler issued a warning if it happened to notice (maybe it already does – I should check).
Another place this might show up: localization
You might have a localizable format string which might not use all of the parameters in a certain language for whatever reason. So you might do something like:
String.format(getLocalizedFormat(), arg1, arg2, arg3);
Actually, the same reasoning applies to any format string generated at runtime.
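As a rough sketch of the localization case: the lookup table and its two entries below are hypothetical stand-ins for whatever getLocalizedFormat() would return. The caller always passes every argument, and each format string uses only the ones it wants.

```csharp
using System;
using System.Collections.Generic;

public static class LocalizedFormatDemo
{
    // Hypothetical table standing in for a real resource file.
    static readonly Dictionary<string, string> Formats = new Dictionary<string, string>
    {
        { "long",  "{0} copied {1} files to {2}" },
        { "short", "{1} files copied" },  // ignores {0} and {2}
    };

    public static string Describe(string style, string user, int count, string target)
    {
        // Pass everything; the format string decides which pieces appear.
        return String.Format(Formats[style], user, count, target);
    }

    public static void Main()
    {
        Console.WriteLine(Describe("long", "Alice", 3, "backup"));
        Console.WriteLine(Describe("short", "Alice", 3, "backup"));
    }
}
```

If surplus arguments were an error, the "short" entry would have to be special-cased at every call site.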
I would argue, though, that for any hard-coded format string, it should always be a compiler warning not to supply the exact number of arguments specified. Actually, it should be an error, but that adds some non-orthogonality to the language spec.
gcc does that for printf & co, at least by default.
The notion offends me that the compiler thinks it knows what some random function does with its arguments (though reading the fine print in the language spec probably gives it the right to do so, and in any case it’s not hard-wired). However, on the pragmatic side, it has saved me from silly mistakes now and then.
That’s why I love Resharper :) It will mark such unused parameters.
As for question about *any* number of parameters – I am pretty sure there is a limit. If you specify more than certain number of parameters they will not fit on the stack and format string will not fit in the memory ;)
"they will not fit on the stack"
string.Format’s arguments are passed as a single array, not as individual parameters on the stack. It’s just that C# has some syntactic sugar to make it LOOK LIKE they’re "normal" arguments. If you look at the definition of string.Format, it’s actually:
public static string Format(string format, params object[] args) { … }
(well, there are other overloads, but that’s the important one).
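To illustrate the params sugar, here is a small sketch. CountArgs is a made-up method with the same parameter shape as the signature above. It shows that a comma-separated argument list and an explicit array arrive the same way.

```csharp
using System;

public static class ParamsDemo
{
    // Hypothetical method: reports how many elements the callee actually sees.
    public static int CountArgs(params object[] args)
    {
        return args.Length;
    }

    public static string ViaParams()
    {
        // The compiler gathers the trailing arguments on the caller's behalf.
        return String.Format("{0}-{1}-{2}-{3}", "a", "b", "c", "d");
    }

    public static string ViaArray()
    {
        // Same call with the array built by hand.
        object[] args = { "a", "b", "c", "d" };
        return String.Format("{0}-{1}-{2}-{3}", args);
    }

    public static void Main()
    {
        Console.WriteLine(CountArgs("a", "b", "c"));                 // list form
        Console.WriteLine(CountArgs(new object[] { "a", "b", "c" })); // array form
        Console.WriteLine(ViaParams() == ViaArray());
    }
}
```

Both CountArgs calls see three elements, and the two String.Format calls produce the same string.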
Did someone actually ask this question? The answer should be self evident. It throws an exception if there are too few parameters (or more accurately, too few elements in the array) because it doesn’t know what to do if there’s no value for a particular element. If there are too many elements in the array, well, it knows what to do: Nothing. It doesn’t even check–why should it?
santosh: The limit should be a hardware limit rather than a software limit. However, it appears (IMPLEMENTATION DETAIL) that in order to prevent runaways the code stops at anything at or over 10,000,000. It’s actually using a StringBuilder internally, calling StringBuilder.AppendFormat(null, FormatString, ParamsArray).
When it hits an opening curly brace ("{") it checks to see if the next one is an opening curly brace and if so continues looking, but if not it stops and begins processing the number. It has a temporary number that starts at zero. First it multiplies the temporary by ten, then subtracts the character ‘0’ from the current character (if it is between ‘0’ and ‘9’, inclusive) and adds it to the temporary variable. It then moves on to the next char and checks to see if it is a number and the current number is less than 1,000,000. If it is, it repeats the previous pair of sentences. Otherwise it declares the number done and uses whatever it came up with.
In the case of a number that goes beyond 10,000,000, when it’s processing the formatting it realizes that there’s still a number where it was expecting a comma, colon, or closing curly brace and throws an exception. (END IMPLEMENTATION DETAILS).
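The digit loop described above can be sketched roughly as follows. This is a reconstruction from the description in this comment, not the framework's actual source. The names IndexParseSketch, ParseIndex and EndsCleanly are invented, and the sketch assumes the character after the opening brace is a digit.

```csharp
using System;

public static class IndexParseSketch
{
    // Accumulates digits starting at pos. Stops at the first non-digit,
    // or once the running value has reached 1,000,000.
    public static int ParseIndex(string format, ref int pos)
    {
        int index = 0;
        do
        {
            index = index * 10 + (format[pos] - '0');
            pos++;
        }
        while (pos < format.Length
               && format[pos] >= '0' && format[pos] <= '9'
               && index < 1000000);
        return index;
    }

    // True if the character after the parsed digits is one the
    // formatter expects: a comma, a colon, or a closing brace.
    public static bool EndsCleanly(string item)
    {
        int pos = 1; // skip the opening '{'
        ParseIndex(item, ref pos);
        return pos < item.Length
            && (item[pos] == ',' || item[pos] == ':' || item[pos] == '}');
    }

    public static void Main()
    {
        Console.WriteLine(EndsCleanly("{9999999}"));   // seven digits
        Console.WriteLine(EndsCleanly("{10000000}"));  // eight digits
    }
}
```

Under this sketch a seven-digit index is consumed completely, while an eight-digit index leaves a digit behind where a comma, colon, or closing brace was expected, which matches the 10,000,000 cutoff mentioned above.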
Keep in mind that you don’t have to actually pass params to String.Format (or StringBuilder.AppendFormat), you can instead pass an object (or any) array. You also don’t have to reference every single element. Just try:
object[] Temp = new object[100000000];
Temp[9999999] = 5;
Console.WriteLine(string.Format("{9999999}", Temp));
Worked lovely for me, but try adding 1 more…
The array is huge, by the way, so it takes a while to allocate, but it is possible to allocate it and therefore it is possible to pass parameters to string.Format beyond what it is programmed to handle.
I’ve used this (as well as skipping using some of the placeholders) intentionally quite a few times. It’s particularly handy when you want to look up a status/error message in a translation table using a common method, or if you have a bunch of alternate "slots" that info can be displayed.
Lets you just provide all the data that they might want to display and then whoever’s doing the UI/translation can pick and choose which bits they want where. :)
Just to add to the useless theoretical discussion, interpreting the ECMA 334 document for the C# specification… you could pass a maximum of approx. 1.84E+19 (2^64 – 1 to be exact) arguments.
In practice, the compiler would probably choke on it if you passed that many individual arguments to a parameter-array (as opposed to passing an array object.)
In theoretical physics there’s a rule of thumb saying (paraphrased) that if the answer to a question is infinity, then the question is probably invalid.
Applying this to C# programming, if you are passing over 18 quintillion† individual arguments to a function, you are probably doing something wrong. It may be legal in theory, but that doesn’t mean it’s right.
† US or short scale. Would be trillion on the long scale, or to most Europeans.
Dave,
You can always turn the thing off. You can actually configure gcc to check for printf-style arguments on *any* function by using the __attribute__((format(…))) construct.
"Did someone actually ask this question?"
That’s what I was wondering as well. "Why doesn’t my application crash on this perfectly valid call? Please make my application crash!"
Though perhaps someone had a formatting bug related to supplying a wrong set of arguments, and throwing on too many arguments would have caught the bug.
"But if you ask for DATE, then you have an error."
…sigh…so true in my life, happens to me constantly.
I don’t believe that passing too many arguments to printf should result in a compiler error. A warning is fine, even welcome, I would say, but it can still compile the function call, so it should just go ahead and do it. (I know, even if there are too few arguments with respect to the format string, it can still be compiled.)
Similarly, while trying to use an undeclared variable is an error, since the compiler has no storage or data type for it, declaring a variable but not using it may be an oversight on the programmer’s part, but there’s nothing to stop the compiler from generating code.
Problem is, how could it be a compile warning? The compiler doesn’t know what the formatting string does. The formatting string is handled by the function’s code. For all the compiler knows, the formatting string could look like "{1} {2}", or it could look like "%f %d", or it could look like "$1 $2", or it could look like "<variable pos="0"/> <variable pos="1"/>". The compiler doesn’t know, or really care. All it knows is that it’s received 3 parameters: a string, a float, and an integer. That fits the function signature of a string and an object array with a params attribute, so it uses that.
How’s the compiler going to emit a warning when you don’t happen to use all the parameters? It has no way of knowing if you did or not.
Only the runtime would know how the string is handled. How’s the runtime going to communicate a warning?
Erzengel – dave mentioned above that gcc does this for printf and similar functions, and I’ve seen it happen myself. It kind of surprised me when I first saw it.
I suppose the rules have been specified somewhere. And at compile time, they can only be applied with static strings. One way or another, it has been done.
Falcon, yes it HAS been done. But that doesn’t mean that it SHOULD be done.
The compiler’s job is to translate source code into some other format, which in the case of the .NET compilers is IL. What a compiler of a statically typed language like C# needs to do is to enforce strong typing.
The right tool for a job like this is static analysis via FxCop or CAT.NET.
BTW: FxCop has been integrated into the compiler in VS2008 and is extendible.
Don’t get me wrong – I agree that compilers shouldn’t delve too deeply into the semantics. For instance, detecting references to freed memory in C code is not the compiler’s job. There’s a risk of both false positives and negatives, and all the effort put into implementing these features could go to waste.
On the other hand, when you run a static analysis tool on the code, you are specifically asking for these kinds of checks to be done.