The first word on the command line is the program name only by convention

Date:May 15, 2006 / year-entry #166
Tags:other
Orig Link:https://blogs.msdn.microsoft.com/oldnewthing/20060515-07/?p=31203
Comments:    28

The format of the command line returned by GetCommandLine is "program args", but this is only a convention. If you pass NULL for the lpApplicationName to the CreateProcess function, then the CreateProcess function will treat the first word of the lpCommandLine as the program name. However, if you pass a value for lpApplicationName, then that string determines the program that is run, and the string passed as the lpCommandLine is not used for that purpose.

This means that if somebody runs your program with the following parameters to the CreateProcess function

lpApplicationName  =  "C:\Path\To\Program.exe"
lpCommandLine  =  "slithy toves"

then when your program calls the GetCommandLine function, it will get the string "slithy toves", which doesn't give your program much help at all in determining its own name or location.

If your program needs to determine its own name and location, use the GetModuleFileName function, as I noted some time ago.

What is the point of letting a program specify something different as the first word on the command line from the actual program being run? There isn't much point to it in Windows, although it is used to greater effect in unix, where you can run a program under various "alias" names, executing one program but lying to it and putting a different name at the start of the command line. Some programs are specially designed to be run this way and alter their behavior depending on the "alias" name they were given. For example, the visual editor runs in screen mode if its name is given as "vi" but in line mode if its name is given as "ex".
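As a rough illustration of the "alias" technique, here is a minimal sketch of a program choosing its mode from the name it was launched under. The enum, the helper names, and the rule that anything other than "ex" means screen mode are assumptions made for this sketch; it is not taken from any real editor's source.

```c
#include <string.h>

/* Hypothetical mode names for this sketch. */
enum editor_mode { MODE_SCREEN, MODE_LINE };

/* Return the part of argv0 after the last '/' or '\\', or argv0 itself
   if it contains no path separator. */
static const char *base_name(const char *argv0)
{
    const char *base = argv0;
    for (const char *p = argv0; *p != '\0'; ++p) {
        if (*p == '/' || *p == '\\') {
            base = p + 1;
        }
    }
    return base;
}

/* Pick the mode from whatever name the launcher put in argv[0].
   The launcher controls this value, so it is only a hint. */
enum editor_mode mode_from_argv0(const char *argv0)
{
    return strcmp(base_name(argv0), "ex") == 0 ? MODE_LINE : MODE_SCREEN;
}
```

A main function would call mode_from_argv0(argv[0]) once at startup. Since the caller decides what goes into argv[0], this is suitable for convenience features only.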

Although extremely few Windows programs use this quirk (I am not aware of any myself), the behavior nevertheless is supported, and you need to be aware of it when writing your own program, even if you don't intend to use it.

For example, if you forget to repeat the program name on the command line and create the process like this

lpApplicationName  =  "C:\Path\To\Program.exe"
lpCommandLine  =  "arg1 arg2"

then when that program runs, you will most likely see it ignore the arg1 because it thinks that arg1 is just the program name. If that program is a console program that uses the C runtime startup code, then it will receive its parameters as

argv[0]  =  "arg1"
argv[1]  =  "arg2"

As I noted earlier, most console programs merely ignore their argv[0] since that slot is just the program name. (In this case, it's the alias program name, but the program being run doesn't know that.)

Similarly, if the program is a Windows program that uses the C runtime startup code, then the C runtime startup code will merely skip over the first word on the command line, passing "arg2" to the WinMain function as its lpCmdLine.
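To see why the first argument goes missing, here is a simplified sketch of the kind of "skip the first word" step that startup code performs. It is not the actual C runtime code. It assumes that double quotes group characters into one token and that spaces and tabs separate tokens, and the function name skip_first_token is made up for this sketch.

```c
/* Return a pointer to the text that follows the first token of cmdline.
   Simplified rules (an assumption of this sketch): a double quote toggles
   "inside quotes", and an unquoted space or tab ends the token. */
const char *skip_first_token(const char *cmdline)
{
    const char *p = cmdline;
    int in_quotes = 0;

    /* Walk over the first token, whatever it happens to contain. */
    while (*p != '\0' && (in_quotes || (*p != ' ' && *p != '\t'))) {
        if (*p == '"') {
            in_quotes = !in_quotes;
        }
        ++p;
    }

    /* Skip the whitespace that separates the token from the rest. */
    while (*p == ' ' || *p == '\t') {
        ++p;
    }
    return p;
}
```

Given "arg1 arg2", the function returns a pointer to "arg2". The code has no way to know that "arg1" was meant to be an argument and not a program name.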

What was the point of all this discussion? Two things. First, that if you are launching other programs and passing an explicit lpApplicationName, then it behooves you to format the command line in a compatible manner. Otherwise, the results may not be what you expect. Second, that you as a program should not use the first token on the command line to control any security decisions since the value is controlled by the launcher and need not have any connection to reality.
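For the first point, one way a launcher might format a compatible command line is to repeat the program path, in quotes, ahead of the real arguments. The sketch below only builds the string. The function name build_command_line and its return convention are assumptions for illustration, and it assumes the path itself contains no quotation marks.

```c
#include <stdio.h>

/* Write "\"app_path\" args" into out, which holds cap bytes.
   Returns 0 on success, or -1 if the result did not fit. */
int build_command_line(char *out, size_t cap,
                       const char *app_path, const char *args)
{
    /* Quote the path so that embedded spaces stay in the first token. */
    int n = snprintf(out, cap, "\"%s\" %s", app_path, args);
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}
```

The resulting buffer is what you would hand over as lpCommandLine alongside the explicit lpApplicationName, so that the launched program sees its own name in the first slot and arg1 in the second.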


Comments (28)
  1. KJK::Hyperion says:

    Actually, the string returned by GetModuleFileName can be spoofed, too. Not quite as easily as changing the command line, but it is still a writable string in user-mode memory provided by the caller.

  2. Gary Niger says:

    Correct me if I’m wrong, but don’t those issues apply to the exec* family of functions as well?

  3. Michael says:

    Yes, but in order to spoof the string returned by GetModuleFileName the attacker needs write access to the memory in your process.

    In which case, you’ve already lost.

  4. Jonathan Payne says:

    Perforce uses this to decide if the Perforce server should run as a command line application (p4d.exe) or a service (p4s.exe).

  5. Cooney says:

    Actually, in Unix, files are distinct from their names, so the file for vi can be the same one as that for ‘ex’. There isn’t any sleight of hand or aliasing occurring – vi just has 2 names.

    I still think it’s a bad idea to allow this sort of inconsistency. What does it buy?

  6. Andrew Taylor says:

    This trick is possible in Unix, but it is not usually used for the vi/ex example you give.  The screen mode is determined by the value of argv[0], but vi and ex are generally launched with argv[0] equal to the binary name.

    It is the binary name itself that is changed: usually ex is a symbolic link to vi.  This would still be possible even if the first argument wasn’t independent of the binary name.

  7. Anonymous says:

    Try this:  type in the command line

    c:\"program files"\resetofpath.exe

    Quote only parts of the path that contain blanks like "program files" above.  

    Guess what will be the value of argv[0] and argv[1]?

  8. bash says:

    > What does it buy?

    Disk space ?

    My router with embedded linux does not have 20GB of disk.

  9. microbe says:

    To make it very clear, the rule should be "the first parameter in lpCommandLine is always the program name" no matter what lpApplicationName is.

    It is not a convention. It is a rule.

  10. Alex Lambert says:

    Release notes for sendmail 8.8:

    "8.8.3/8.8.3 96/11/17

    SECURITY: it was possible to get a root shell by lying to sendmail about argv[0] and then sending it a signal.  Problem noted by Leshka Zakharoff <leshka@leshka.chuvashia.su> on the best-of-security list."

  11. antonio vargas says:

    the most popular example for using argv[0] for switching the operating mode is the package busybox, where a whole lot of 100 programs are packed inside one executable. on installation you just make 100 symlinks to the real executable and there you go. the programs are just hand-optimised-for-size versions of the usual unix tools, and very popular for space-constrained distros

  12. Dean Harding says:

    In my last job, we used to work on a windows port of a unix server application (until we rewrote it to be more "windows-like" using thread pooling and stuff, anyway).

    The unix version had used that symlink trick to get one executable to perform many tasks (I think there ended up being around 7 or 8 symlinks to the one file). So in the port, they just copied the actual file 7 times to the 7 different names. Not exactly efficient, but the actual implementation of the functions was in a separate DLL so the .exe was actually only a couple of KB anyway.

  13. Norman Diamond says:

    I thought Unix installations of vi and ex made them hard links to the same file, not using symlinks.

    Meanwhile I read that plans for Vista include letting some users create symlinks.  If I recall correctly some users won’t have privileges to create symlinks even within their own directories.  When symlinks exist their behaviour appears to be guided by Unix symlinks though of course Win32 APIs differ from Unix APIs so they still can’t be made identical.

  14. meo says:

    You know, you just quoted the guy wrong and with the most annoying word: noone.

  15. Ulric says:

    >> What does it buy?

    >Disk space ?

    >My router with embedded linux does not have 20GB of disk.

    Ridiculous geek talk…  The reason why some unix programmers put everything in one executable and used different names to invoke it in different ways is because … they could, they thought it was cool, and it saved them some typing at the command line when calling these tools.

    The content of a unix distribution isn’t about saving disk space, it’s literally a dump of hundreds of executables and legacy tools, and there are a lot of other things taking disk space, like allocation size, config files, etc. It was never made to be compact. If they were trying to save space, they would have *dropped*, for example, the old editors that no one uses, and shipped just their new editor.

    The only thing using symlinks buys you is saving some typing at the command line, and accommodating legacy reflexes or batch files.

  16. ... says:

    > Ridiculous geek talk…  The reason why some unix programmers put everything in one executable and used different names to invoke it in different ways is because … they could, they thought it was cool, and it saved them some typing at the command line when calling these tools.

    It’s not geek talk. You know a router with less EPROM space is a cheaper router to produce. A cheaper router can be sold.. uh cheaper. A cheaper product sells more. More sales mean more money.

  17. x says:

    > And the use of symbolic link is usually related to changing "old practice" to "new practice" in a graceful way. (For example source code of  "/usr/src/kernel/kernel-<version>" is linked to "/usr/src/linux" to conserve backward compatibility with other linux packages and *nix distros)

    Which reminds me of why we have two copies of notepad ..

  18. cheong00 says:

    > If they were trying to save space, they would have *dropped*, for example, the old editors that no one uses, and shipped just their new editor.

    Not exactly always, as some programs may still be dependent on the "old editors that noone uses".

    Just consider how many people still used "edlin" when "edit" (qbasic) was available in the days of DOS.

    And the use of symbolic link is usually related to changing "old practice" to "new practice" in a graceful way. (For example source code of  "/usr/src/kernel/kernel-<version>" is linked to "/usr/src/linux" to conserve backward compatibility with other linux packages and *nix distros)

  19. Geek says:

    "Ridiculous geek talk…"

    Is it hell.

    At the time that ex/vi were introduced, disk usage was a big deal.

    The design remains the same because to change it now would achieve nothing.

    And UNIX systems retain all those esoteric binaries, because at this stage it "costs" virtually nothing, and each one is used by somebody somewhere for their shell script.

  20. Archangel says:

    > Ridiculous geek talk…  The reason why some unix programmers put everything in one executable and used different names to invoke it in different ways is because … they could, they thought it was cool, and it saved them some typing at the command line when calling these tools.

    One assumes you’re thinking of Busybox, which ties up many many functions into one binary. It’s got nothing to do with saving typing (in fact it doesn’t save any, because they’re the same names as the individual binaries) but it has an awful lot to do with saving space. You can sniff at that, but when you need to fit a semi-functional system into an initial ramdisk image that’s only a few MB, it’s quite helpful.

    >>The only thing using symlinks buys you is saving some typing at the command line, and accommodating legacy reflexes or batch files.

    Or an awful lot of convenience – as someone suggested before, symlinking /usr/src/linux to /usr/src/linux-version, to facilitate multiple source trees at once. Or backwards compatibility if you move libraries to a new location. Or lots of other things I can’t think of at present.

    Just because you can’t see a use for something doesn’t mean it doesn’t have one.

  21. Centaur says:

    +1 Interesting. I never thought about it this way.

    Some tools ported from Un*x behave this way in Windows. E.g. bzip2 compresses files by default, bunzip2 decompresses, and bz2cat decompresses them to stdout, and all that is in a single binary which one can copy or hardlink.

    I assume the historical reason for this behavior is that one binary means one instance of the C runtime, one instance of the command line parser, and one instance of the decompressor, and that people are more comfortable calling bunzip2 instead of bzip2 -d.

  22. BryanK says:

    Archangel — one more thing that symlinks let you do is get rid of DLL Hell.

    /sbin/ldconfig creates (or replaces) a symlink whose name is the "soname" of each library, pointing at the full name of the library.  So any program that links to the soname (which is everything, since that’s how /usr/bin/ld works) actually indirects through the symlink.  And upgrading the library because of (e.g.) a security vulnerability means that all you have to do is copy the new library over, then update the symlink, then restart the affected programs.  (Whether you remove the old library or not doesn’t make a difference; you can remove in-use files, but if you don’t, it won’t be used anyway.)

    Any upgrade to the library that preserves binary compatibility should retain the soname.

    There’s also a symlink named just libwhatever.so — that’s for compile time, and it should point at the newest library available (regardless of compatibility with old versions).  So you can link against plain old -lwhatever, and it’ll pick up the newest library on the system.  It’ll record the soname in the output file, though (not the unversioned name), so that future upgrades will stay compatible.

  23. Adam says:

    Brian> "Any upgrade to the library that preserves binary compatibility should retain the soname."

    Conversely, and more importantly, any upgrade to the library that breaks binary compatibility must change the soname.

    And under windows, any upgrade to a dll that breaks binary compatibility should change the dll filename.

    In my experience, the main thing that causes dll hell on windows is not the lack of sonames, it’s that many windows developers appear to be incapable of maintaining binary compatibility between "minor" releases of their own dlls.

    (The fact that some developers often fail to make sure that they don’t overwrite newer versions of a dll with an older one on application installation doesn’t help. But if it’s a 3rd party dll, the dll author can’t really be blamed for that one.)

  24. Richard says:

    Interestingly, if I have a C# app which outputs the contents of its args array, and execute it with something like the following:

    ::CreateProcess("theapp.exe", "argument1 argument2", NULL, NULL, FALSE, 0, NULL, NULL, &si, &pi);

    the .NET framework assumes that arg[0] is the app name and outputs only "argument2".

  25. [] says:

    It’s correct that arg[0] is not an argument but it’s either the name of the program or of the alias used to launch it.

    arg[0] is not a parameter, the whole post was about not making assumptions which can compromise security on the value of arg[0], not on the right and wrong of thinking of it as the program name.

    BTW “argv[0] = Program Name” is sanctioned by the ANSI C standard.

  26. Markus says:

    >> Ridiculous geek talk…

    I well remember the time i started to learn C programming under Unix on a 68020-machine with 8 terminals and 20 MB hard disk. When working on a larger project, i often had to ask fellow students to delete their a.out (compiled programs) because the disk didn’t have enough free space left.

    (at the time, vi was a killer editor – so much better than the line-editors available under CP/CMS ;-)

    don’t forget, unix is *ancient*. in those times, every kB of disk space counted.



*DISCLAIMER: I DO NOT OWN THIS CONTENT. If you are the owner and would like it removed, please contact me. The content herein is an archived reproduction of entries from Raymond Chen's "Old New Thing" Blog (most recent link is here). It may have slight formatting modifications for consistency and to improve readability.
