Date: | May 15, 2007 / year-entry #170 |
Tags: | other |
Orig Link: | https://blogs.msdn.microsoft.com/oldnewthing/20070515-00/?p=26863 |
Comments: | 27 |
Summary: | While it's true that one category of problems comes from failing to quote spaces in command lines, it is a false statement that "path names in the registry should have quotation marks around them for obvious reasons." In fact it's the opposite. Path names should not be quoted. Think about it: Quotation marks are a... |
While it's true that one category of problems comes from failing to quote spaces in command lines, it is a false statement that "path names in the registry should have quotation marks around them for obvious reasons." In fact it's the opposite. Path names should not be quoted. Think about it: Quotation marks are a parsing issue, not a file name issue. The quotation marks are not part of the file name. If you type Quotation marks indicate to code that is parsing a command line that a space should not be interpreted as the end of the command line token. Therefore, you need quotation marks when the string will be interpreted as a command line, such as in the In the example in the linked comment, the registry key in question is a list of paths to files, not a list of command lines. Since it's not a command line, quotation marks would result in an invalid file name, since quotation marks are not legal file name characters. The correct statement of the rule is that command lines in the registry should have quotation marks to protect spaces. Path names, on the other hand, not only do not require quotation marks, but in fact must not have quotation marks, because the quotation mark is not a part of the file name. Nitpicker's corner Observe that at no point did I claim that Microsoft employees are perfect. Microsoft employees make mistakes, too. |
Comments (27)
Comments are closed. |
This came up on the WiX mailing list recently, I think I may have said something similar.
Whether the string needs quoting if passed to CreateProcess depends on which argument the string represents! If it’s the first argument, lpApplicationName, it’s simply a path and shouldn’t be quoted; it should always be quoted if it’s the second argument, lpCommandLine. If passing both (recommended) you should still include the program in the command line (and quote it) so that argv[0] works.
It doesn’t do any harm to quote a path that doesn’t include spaces in a command line, so I always ensure that quote marks are used around any path.
Just thought I’d try and counter the recent nitpicking by saying thanks for keeping this blog going. I’ve found the programming related entries both interesting and very informative.
A win32 service I worked on and wrote the installer for had a customer reported issue about service start. The main part of the details were that the binary path in the serviceconfig->lpBinaryName (that corresponds to HKLMSystemCCSServices<service name>ImagePath) on win2000 server didn’t have quotes when the service was installed, and there were service start errors. Now I can’t recall the exact errors, but support reported "add quotes to the path and that fixes it".
I was investigating it, and saw in 2003 SR 1 quotes were added to the path when the service was installed (via CreateService() api) but 2003 base did not add quotes. So for a legacy fix, I added quotes as a post-install action (if quotes weren’t there already).
Alas, my search-fu in our bug tracker failed — I can’t find what the customer was doing to cause the problem (and dummy me didn’t reference the defect # in my check-in notes). Mighta been a command line "net start" command, unsure. But there was a problem with image paths not being quoted in the registry. (boy, I’m ashamed to be a nit-picker with shaky details..)
ImagePath entries are in fact command line, they’re getting passed to WinExec() (on NT4) or CreateProcess() as lpCommandLine (Win2K and beyond).
However, not adding the quotation marks does result in almost the correct behaviour. Almost as in: The service will start, but you’ll create some event entries and a potential security vulnerability.
Say your service is installed in "C:FooBar Bazmy service.exe", but you added it without quotation marks. Then CreateProcess() will first try to load C:FooBar.exe, then "C:FooBar Bazmy.exe", then "C:FooBar Bazmy service.exe". Since the user will not have access to C:FooBar Baz, but might be allowed to create Bar.exe in C:Foo…
Particularly VmWare’s products have been vulnerable to this issue, since the default installation name is "VmWare $ProductName" and even further inside "C:Program Files" and "C:Program FilesCommon Files", and the entry for the vmstor2 driver was typically added without quotation. On many installations, normal users were additionally given create-file&folder access in C:Program Files… now it’s trivial to notice that creating C:Program FilesCommon.exe will give you a wonderful privilege escalation…
That having said: Always use quotation marks for services.
Nathan: that’s not a path, it’s a command line. Many services are implemented inside an executable that, by default, does something else, and needs an extra argument (-ntservice for example) to act as a service. See also svchost
Anon and Hyperion. What Anon describes I think was the problem — if there was something in some directory with a similar name (like the example "c:program filescommon.exe") the service would fail to start. That rings a bell for my memory, and explains the problem.
Obviously it can’t be changed now, but this whole quoting issue wouldn’t be an issue if CreateProcess had taken an array of strings for the command line instead of a single string. (And CreateProcess was probably designed before spaces in paths were possible, too.)
Command lines in the registry could be handled with either REG_MULTI_SZ values (if args need to be added), or with a (set of) REG_EXPAND_SZ or REG_SZ values. (You’d use a set of values if you need to pass multiple args.)
But this is all just wishful thinking, since that design isn’t going to change anymore. :-)
Dear BryanK, please adhere to Raymond’s philosophy and don’t try fixing something that isn’t broken. It’s trivial to check whether such entries exist, correct them at once and send a bug reporter to the developers.
Dear Adam: Don’t think so complicated. Quoting is idempotent, thus quoting a quoted string isn’t bad at all. And you can normalize it as well.
.NET does exactly the same parsing as CreateProcess or earlier WinExec or ShellExecute.
Adam:
I am not sure, but I bet the framework uses the CommandLineToArgv API to split the command line string up. The rules for that API are documented in its reference, however there’s no guarantee that the framework actually uses it. You could probably find out for sure with any of ildasm, Reflector, or Rotor (if you want to accept the "shared source" license terms).
anonymous:
Huh? It obviously is broken: when a program needs to parse its command line, an OS needs to either (a) come up with rules for standard parsing and hope that every process follows them (and this fails on Windows, since not all processes handle quote-parsing the same way — although everything I can think of from Microsoft does), or (b) use an array of strings at both "ends" of the call. If the array had been used at both ends, then none of the quotes would have been needed.
As for "it’s trivial to check whether such entries exist": Check where? If in the registry, how am I supposed to know what in the registry is interpreted as a command line by being passed to CreateProcess (and therefore needs quotes) versus what’s interpreted as a path (and therefore can’t have quotes)? Especially when I don’t have source to any of the programs that use the registry?
And not all command lines come from the registry, either: some of them are hardcoded into dumb programs, and others are retrieved from the command line. If cmd.exe passes a quoted string to a child, but the child doesn’t interpret the quotes correctly, that’s impossible to fix (without the source for the child).
As for "file a bug report" — yeah, that works well when the company that wrote the program is long-since dead…
Anyway, as I said, my comment isn’t good for anything in reality, because CreateProcess is here, and it won’t (and shouldn’t!) change. I’m just saying that if the interface had been designed differently, none of this quoting stuff would have been an issue, because there would be no quotes. The boundaries of individual arguments would have been delimited by the end-of-string markers in the array.
Last anonymous: CreateProcess parses lpCommandLine only if lpApplicationName is NULL. ShellExecute[Ex] looks up the extension in the registry and expands the command line it finds there, then passes that to CreateProcess’s lpCommandLine argument; for the executable extensions .exe, .com the registry has "%1", so lpCommandLine is simply the path to the executable.
CreateProcess gives up command-line parsing once it has found an executable to run. The entire command line is passed to the executable – for a Windows application, in WinMain’s lpCmdLine parameter. There is no other entry point definition – wWinMain, main and _wmain are all figments of the C runtime’s imagination. It’s the CRT which parses the command line into argv/argc (which it actually does with GetCommandLine rather than using the supplied parameter). See parse_cmdline() in crtsrcstdargv.c.
Likewise, .NET is doing the parsing itself. Inspect a .NET executable with dumpbin: you’ll see that the entry point is actually in the import directory! It points to mscoree.dll’s _CorExeMain function. If you have the Shared Source CLI 2.0, you can find the parsing code in the SegmentCommandLine function in clrsrcutilcodeutil.cpp.
The specific parsing of its command line is up to the program, but if it doesn’t follow the CRT’s rules, users will get confused. There’s also a CommandLineToArgvW routine which appears to follow the same rules.
Didn’t see this when I started:
Don’t ask the program, thereby sidestepping the whole problem. :-P
Because both ends are .net in this example, the .net people could have come up with their own quoting rules (because they still use CreateProcess under the hood, so they need to quote), and they could have applied them in both Process.Start and (in reverse) in the framework code that calls Main and passes it the array of Strings.
Of course that fails when a non-.net program runs a .net program, or vice versa. Probably the only solution to that is to document the quoting rules, so people know what they’re getting either way.
And with a time machine, you wouldn’t have to ask the child process in the Win32 case either, because instead of passing one long string, you’d pass the whole array in the code behind CreateProcess. (You’d probably want to lay the strings out next to each other for efficiency, but you wouldn’t want to require this.) But of course any of those ideas are impossible now.
Raymond > Huh?
I want a function along the lines of:
string CreateCommandLine(IList<string> args);
that takes a list of args, and will convert them into a single string, such that when it is parsed and unquoted by the .NET runtime or the pre-main() argc/argv[] generator in the C runtime, they’ll get out what you stuffed in.
I also want an S.D.Process.Create() overload that takes an IList<string> argument for the parameters to pass on the commandline and calls CreateCommandLine() (or whatever) itself so I don’t have to.
Yes, you need to keep the single-string variant in place for executing legacy Win32 programs that do their own completely non-standard parsing of GetCommandLine(), but:
(1) you can’t remove functions anyway due to back-compat issues, and
(2) if a program parses the command line in a non-standard manner, how do they expect other programs to pass them arguments in a manner they can understand properly at the moment?
[But that assumes you “know” that the program you’re launching uses the .NET command line parser. It’s now reduced to a private agreement between two programs, not something the kernel should get involved in. If the .NET Framework wants to make it easy for one .NET-based program to talk to another .NET-based program, then more power to them. -Raymond]
I thought we were talking about command line parsing, not kernels. AFACT, all the kernel needs is the name of the binary to exec and a list of parameters.
Raymond,
All he wants is an API change in .NET. That .NET API would then take the array and quote it ITSELF, when passing it to the command line of any old app. The contract of the API would be that each item in the array as quotes wrapped around it; and that’s it.
I think this is reasonable.
Sometimes.
> Path names should not be quoted.
Sometimes.
> Many places in the registry allow you to
> specify a command line,
Right. And the first part of the command line might be intended to be a path name, in which case sometimes the path name does need to be quoted. The first time I had trouble with this was in [unnameable 2000], where a registry entry for Word had things right but one for [unnameable two words] was missing quotation marks. The resulting error message was really unhelpful, but you can sort of see how I also was unaware of the need for a “sometimes” in the solution.
I think most of your article makes things pretty much clear and I thank you for that explanation. Your blanket statements prevent that clarity until you add “sometimes”.
BryanK > Not for Win32. They could have fixed it for .NET, but no. It’s still stupid –
http://msdn2.microsoft.com/en-us/library/system.diagnostics.processstartinfo.arguments.aspx
At least for Win32, programs were supposed to use GetCommandLine() instead of argc/argv[]. So although the command line was sent to new processes as a single string, it was received by them that way too.
In .NET*, command lines are still sent as a single string, but they’re picked up as an array of strings. Is there any documentation as to how the one is transformed to the other? Neither the link above, or any books I’ve read, or any other MSDN pages I’ve been able to find shed any light on the quoting/escaping rules used by .NET. The ProcessStartInfo link above doesn’t help, and neither do these two on the other side of the Create() call – Main():
http://msdn2.microsoft.com/en-us/library/cb20e19t.aspx
http://msdn2.microsoft.com/en-us/library/acy3edy3.aspx
Is there any written down set of quoting rules? Better yet, is there a function somewhere that will turn an array of strings into a properly quoted command line? Or is it just "well, figure out something by yourself and MS will guarantee that whatever currently works will continue to do so."?
I especially like how none of the examples pass more than one argument to any new process, and none have spaces in the argument they do pass.
* Nitpickers corner**: Yes, I know this is whining about .NET in a not .NET blog. But the points I’m making/questions I’m asking are not directed at *you* Raymond. The whines are more towards MS in general, and the questions are to the other readers of your blog. I’m not expecting or asking *you* to do anything about it. But, it does relate to command lines, so I hope it’s on-topic enough to not be a complete waste of space.
** Hey, I kind of like this nitpicker’s corner thing. :)
anonymous> Quoting is not idempotent. Not sure where you get that idea. Quoting a quoted string does not give you the same string back. Quoting it again definitely doesn’t.
What do you mean, don’t think complicated? It *is* complicated. I’m asking for it to be *simplified*. Passing command line arguments as an array of strings, in the same format they’re accessed by the receiving program is *simpler* than trying to figure out how to squish them all together yourself.
Isn’t it a WTF that all other OS’s except Windows already figured out how to pass (and parse) command line about 10 years ago (see man exec())
> > Many places in the registry allow you to
> > specify a command line,
> Right. And the first part of the command line might be intended to be a path name
> does need to be quoted.
Er, that’s a kind of a tautology. "Command lines" need to be quoted. Doesn’t matter whether you’re talking about parts of the command line that represent paths, or parts of the command that represent anything else.
I suppose the "sometimes" could come in when you’re talking about command lines that don’t contain spaces. But the nitpickes corner would look a bit silly:
"Command lines need to be quoted*.
*except when they don’t."
Adam: in “10 years” I mean “long time ago”, not “in 1997”.
To the point, Unix does not have the same problem. All programs get
argc/argv[]; exec() receive separated list of parameters; and all this
have nothing to do how you put quotes (or escape chars) in shell.
[I agree that it is reasonable. But what does it have to do with knowing the difference between paths and command lines? -Raymond]
It’s to do with the fact paths should just be paths. Whether they’re arguments to CreateFile() or CreateProcess() or S.D.Process.Create().
You shouldn’t /have/ to quote paths when passing them as a command line to CreateProcess(), as you shouldn’t have to pass a command line to CreateProcess(). That you do is a historical artifact of Dos/Win32, but one that could have been fixed for .NET. You should be able to pass all arguments, including argument 0 (the program name) as distinct unquoted strings, and not care if they happen to be paths. Then this wouldn’t be a problem going further forward.
Note that this *doesn’t* require a change to NtCreateProcess()*. If the kernel uses a single parameter for the command line, fine. But the userspace API could be a bit smarter, providing quoting for you. As an advantage, if it quoted the part that the kernel needs (arg 0) then the kernel doesn’t have to guess so much. If you pass “c:program filesfoo.exe” as arg0, userspace should quote that for you so that NtCreateProcess() doesn’t even look for “c:program”
Yosi > Seeing as Unix exec() is documented as having passed arguments separately in the original programmer’s manual dated November 1971, you should have said “over 35 years ago”.
set dir=~/vim backups
set makeprg=”~/vim scripts/m” %
One of these settings uses quotation marks; the other doesn’t. I’m sorry I motivated the discussion with Windows. Now I understand that I should’ve used unix. -Raymond]
[All programs receive separated list of parameters, but somebody has to convert a command line into that list. -Raymond]
Yeah, and it’s a royal pain to do so, but somehow not as bad as you describe for windows. Also, it’s a small matter to change your shell if it bothers you enough.
[unix has the same problem. You don’t pass quotation marks to open(2); you do pass them to sh(1)]
open is an API call, sh is a command interpreter. You really don’t want your API calls deciding what you really meant, do you?
Nitpick:
You lie Raymond!
"I claim that Microsoft employees are perfect"
See you did say it!
:)
[Nitpick. s/sh(1)/system(3)/. Happy? I’m assuming my readers are smart enough to make these phenomenal leaps of logic. -Raymond]
system(3) is designed to run sh(1).
“system() executes a command specified in string by calling /bin/sh -c”
I’m assuming that Raymond is smart enough to make this phenomenal leaps of logic.
Yosi: What’s your point anyway? If it’s just to prove how smart you are, we don’t care. If it’s just to prove how much better Unix is than Windows, we don’t care. If it’s to annoy Raymond to the point that he stops blogging, then please stop.
Well, system() in perl takes a list which then gets passed straight through to the exec – this is considered reccomended over the single argument form for all the reasons already discussed.
To be honest I’m kinda surprised the windows equivalents don’t work the same way … wonder how win32 perl handles it.
Sometimes you don’t know if the specified path is a command line or a simple path.
And often the application and/or apis can handle an extra pair of quotation marks anyway.
Sadly, I’ve seen many examples where apps are constructing command lines with unquoted paths, and if those paths has spaces they must be quoted by the user to compensate for the buggy app. Not a good solution I admit.