Wrapping some other scripting language inside a batch file

Date:August 4, 2017 / year-entry #177
Tags:code
Orig Link:https://blogs.msdn.microsoft.com/oldnewthing/20170804-00/?p=96765
Comments:    33
Summary:Polyglot to the rescue.

Nobody actually enjoys batch programming, but sometimes you can get away with writing in a language you like better while retaining the .cmd extension. Still, that leaves you having to get the extension for that language registered on your target machines, which can be tricky for xcopy-style deployment scenarios. The solution then is to use a polyglot header that is valid both as a batch file and in your target language. The header re-invokes the target language interpreter with the batch file itself as input.

Note that this trick isn't necessary if you can associate the file extension with the scripting engine. So you don't need to do this polyglot nonsense with, say, PowerShell scripts, because the .ps1 extension is already associated with powershell.exe (where available).

The general shape of a polyglot header is

@rem prefix stuff
@⟨interpreter⟩.exe ⟨interpreter options⟩ "%~f0" %*
@goto :eof
⟨suffix stuff⟩
⟨the script itself⟩
⟨trail stuff⟩

Prefixing each line with an at-sign prevents it from being echoed. The first line is a comment as far as the batch interpreter is concerned, which lets you put arbitrary goop on it: whatever the target language needs in order to swallow up the @rem and make the rest of the header invisible to the interpreter.

The "%~f0" %* sequence looks like line noise, but it's actually a batch file idiom for "A quoted, fully-qualified path to the batch file, followed by the original arguments." The %~f0 part uses the tilde operator to build up a full path to the %0 (which is the batch file itself). And %* is a batch variable that expands to the arguments passed to the batch file.
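To make the expansion concrete, here is a minimal Python sketch that mimics what the command processor substitutes into the header line. It is an illustration only, not how cmd.exe is implemented: the function name expand_header_line, the C:\tools directory, and the hello.cmd file name are invented for the example, and it assumes the batch file was invoked by name with its extension.

```python
import ntpath  # Windows path rules, usable on any host OS


def expand_header_line(template, cwd, invoked_as, raw_args):
    """Mimic the expansion of %~f0 and %* in one header line.

    cwd        -- directory the batch file was run from (assumed for the example)
    invoked_as -- how the batch file was named on the command line (%0)
    raw_args   -- everything typed after the batch file name (%*)
    """
    # %~f0: turn %0 into a fully-qualified path.
    full_path = ntpath.normpath(ntpath.join(cwd, invoked_as))
    # %*: the original arguments, passed through unchanged.
    return template.replace("%~f0", full_path).replace("%*", raw_args)


line = expand_header_line('@perl.exe -x "%~f0" %*', r"C:\tools", "hello.cmd", "one two")
print(line)  # @perl.exe -x "C:\tools\hello.cmd" one two
```

The quotation marks around %~f0 are what keep a path with spaces in it together as a single argument.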

Anything after the @goto :eof is ignored by the batch interpreter, so you can add language-specific suffix stuff to finish up the "start ignoring this" goop you set up on the first line.
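The batch interpreter's view of the file can be modeled in a few lines of Python. This is a deliberately simplified sketch: the function name lines_cmd_executes is made up, and it ignores labels, parenthesized blocks, and line continuation, none of which appear in the header.

```python
def lines_cmd_executes(text):
    """Return the lines a batch interpreter would run: everything up to
    and including the first 'goto :eof' (simplified model)."""
    executed = []
    for line in text.splitlines():
        executed.append(line)
        # Ignore the echo-suppressing at-sign when looking for the command.
        command = line.lstrip("@ ").lower()
        if command.startswith("goto :eof"):
            break  # the batch interpreter never looks past this line
    return executed


sample = (
    "@rem prefix stuff\n"
    '@perl.exe -x "%~f0" %*\n'
    "@goto :eof\n"
    "#!perl\n"
    'print "hi\\n";\n'
)
for line in lines_cmd_executes(sample):
    print(line)
```

Only the three header lines are printed; the Perl code after the goto never reaches the batch interpreter.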

Finally, in rare cases, you might need to add trail stuff at the end of the script to balance out anything you set up in the header, like closing an open set of braces. This is rare because you usually close them up in the ⟨suffix stuff⟩ part.
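Putting the pieces together, the general shape can be assembled mechanically. The following Python sketch is one hypothetical way to do that; build_polyglot and its parameter names are invented for illustration and simply follow the slots of the template above.

```python
def build_polyglot(interpreter, options, prefix, suffix, script, trail=""):
    """Assemble a polyglot batch file from the slots of the general shape."""
    words = [interpreter] + ([options] if options else []) + ['"%~f0"', "%*"]
    parts = [
        "@rem " + prefix,        # batch comment carrying the prefix stuff
        "@" + " ".join(words),   # re-invoke the interpreter on this very file
        "@goto :eof",            # batch stops reading here
        suffix,                  # ends the "start ignoring this" goop
        script,                  # the script itself
    ]
    if trail:
        parts.append(trail)      # rarely needed balancing stuff
    return "\n".join(parts) + "\n"


text = build_polyglot("perl.exe", "-x", "--*-Perl-*--", "#!perl", 'print "Hello\\n";')
print(text, end="")
```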

Okay, now that we see the general shape of a polyglot header, let's look at some examples.

Perl

@rem --*-Perl-*--
@perl.exe -x "%~f0" %*
@goto :eof
#!perl
⟨perl script⟩

This isn't a proper polyglot because we're running perl in a special mode which is not the default (-x). But hey, we're trying to get things done, not solve some theoretical puzzle, so running perl in a special mode is just fine if it gets the job done.
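For readers unfamiliar with the -x switch: per the perlrun documentation, it makes perl discard leading text until the first line that starts with #! and contains the string "perl". A rough Python model of that rule (the function name text_perl_x_sees is made up, and the empty-list fallback is a choice made for this sketch) looks like this:

```python
def text_perl_x_sees(text):
    """Approximate perl -x: drop leading lines until the first line
    that starts with '#!' and contains 'perl'."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("#!") and "perl" in line:
            return lines[i:]
    return []  # sketch only; real perl reports an error instead


sample = (
    "@rem --*-Perl-*--\n"
    '@perl.exe -x "%~f0" %*\n'
    "@goto :eof\n"
    "#!perl\n"
    'print "hi\\n";\n'
)
print(text_perl_x_sees(sample))  # ['#!perl', 'print "hi\\n";']
```

This is why the #!perl line serves as the suffix stuff: it marks where the batch header ends and the Perl program begins.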

Note that if you want other special command line options to be passed to perl, you can sneak them in with the -x. For example, you might ask for -sx to get poor-man's command line switch auto-parsing.

The leading comment --*-Perl-*-- is not used by either perl or the command processor. It's there by tradition, so that when emacs users load the script into the editor, it will be detected as a perl script, and perl-specific editing commands will be enabled.

JavaScript

@if (1 == 0) @end /*
@cscript.exe /E:jscript /nologo "%~f0" %*
@goto :eof
*/
⟨JScript script⟩

Instead of using @rem, the JScript polyglot header uses an @if conditional that is never true. This was chosen so that the opening syntax of the file matches that of JScript conditional compilation, and the entire header gets gobbled up as a false conditional followed by a big comment. Note that JScript conditional compilation is a Microsoft extension, but since cscript runs the Microsoft JScript engine when you specify /e:jscript, it's okay to use it anyway.
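The two readings of that first line can be sketched in Python. Both functions are hypothetical models with made-up names. The batch half rests on the usual explanation of this trick: the batch if command compares the string "(1" with the string "0)", finds them unequal, and so never runs the @end command. The JScript half handles only this exact header, not conditional compilation in general.

```python
import re


def if_line_as_batch(line):
    """Split '@if (1 == 0) @end /*' the way a batch 'if' reads it:
    left string, right string, whether they match, and the guarded command."""
    tokens = line.lstrip("@").split()
    left, right = tokens[1], tokens[3]
    return left, right, left == right, " ".join(tokens[4:])


def text_jscript_runs(text):
    """Approximate what the JScript engine executes: the false @if ... @end
    conditional contributes nothing and the /* ... */ comment is skipped."""
    text = re.sub(r"@if\s*\(1 == 0\)\s*@end", "", text, count=1)
    text = re.sub(r"/\*.*?\*/", "", text, flags=re.DOTALL)
    return text.strip()


sample = (
    "@if (1 == 0) @end /*\n"
    '@cscript.exe /E:jscript /nologo "%~f0" %*\n'
    "@goto :eof\n"
    "*/\n"
    'WScript.Echo("Hello");\n'
)
print(if_line_as_batch("@if (1 == 0) @end /*"))  # ('(1', '0)', False, '@end /*')
print(text_jscript_runs(sample))                 # WScript.Echo("Hello");
```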

Bonus chatter: Sometimes I miss the EXTPROC directive from OS/2's command interpreter, then I realize that it really only solves half of the problem (getting the command interpreter to hand control to another scripting engine), and doesn't solve the other half (getting the scripting engine to ignore the start of the batch file). The additional restriction that EXTPROC appear on the first line of the batch file makes it harder to work the first line into valid code in your target language.

Bonus chatter 2: JScript is probably the most convenient alternative scripting language because, while it may be the world's most misunderstood programming language, it's nevertheless immeasurably better than batch. And it has come preinstalled since Windows 2000, so your script will work on pretty much any Windows computer of modern interest. The downside is that the version of JScript used by cscript is ancient.

PowerShell is very nice, but it wasn't standard-issue until Windows 7. With the retirement of Windows Vista, we are finally in a situation where all supported versions of Windows come with PowerShell. It took eight years, but we made it. (Note that you can't run PowerShell scripts by default. You have to go in and change an administrative setting first.)

So maybe, if you're lucky, you may be able to declare the end of the era of suffering with batch files. I can more confidently say that the suffering of Batch File Week is now over, at least for now.

Bonus content: Here's a Web page which demonstrates various batch file string manipulation operations.


Comments (33)
  1. Bartosz Dziewoński says:

    Fascinating and terrible.

    For science, I threw together a Ruby version. The `@rem = '…'` sets an instance variable called ‘rem’, conveniently gobbling up the polyglot header.

    @rem = '
    @ruby.exe "%~f0" %*
    @goto :eof
    '
    puts 'Hello from Ruby!'

  2. Peter Doubleday says:

    Finishing the week on a high note!

    I genuinely didn’t realise that batch files have an equivalent mechanism for the *nix shebang. And although I suspect the general reaction to “it has an emacs mode hint” would either be total indifference, or a demand for the same thing in vim … it works for me.

    Thanks for a very entertaining week.

  3. George Byrkit says:

    Looks like the ‘change an administrative setting first’ webpage regarding PowerShell has been deprecated. Nothing there except a suggested link to the ‘new’ documentation. A lot of the content has been removed and/or reorganized, with a number of auto-redirects, and no obvious description of the setting that needs to be changed. Or else I’m missing something…

  4. rahuldottech says:

    Uhm. I think I might just be the only human who actually enjoys scripting in Batch. I’ve always found it challenging and fun :P

    While this is certainly very cool (and who actually liked Vista anyway?), I’m always going to remember all those days spent in the school computer lab, trying to get huge complicated scripts to work. I’m 16 years old and batch was one of the first languages that I started with, so it has a special place in my heart.

    I’ve done stupid stuff like made a renderfarm, a percent-encoded data decoder, an HTTP server, and a remote command deployment system

    Also, this WiFi Rickrolling system is something that I’m extremely proud of.

    Yeah, you’re not going to be the first one to call me a monster. But honestly, getting stuff done in batch gives me a weird sense of satisfaction.

  5. pc says:

    Why would you need to do a batch-polyglot for JScript? Doesn’t .js default to running it with cscript or wscript? Is this in case you need it to use one or the other?

    While these are pretty nifty tricks, I’m trying to imagine “xcopy-style deployment scenarios” where you were limited to one file. Even for the most limited deployment scenarios, having one file in your actual language and then a separate batch file to run it with your desired arguments and switches and environment and such would work fine, right? What am I missing?

    1. Rick C says:

      This is for some case where you need to run your script as a .cmd, perhaps because the script is started off by something you can’t change. You can rewrite the script (in this example) in Perl or JS, but still fire it off as if it were a batch job.

    2. .js files default to wscript, but if you’re running it from the command prompt, you probably want cscript.

      And the single-file scenario is the “Copy this script to your machine and run it and send me the output.” As opposed to “Copy these two files to your machine, make sure they are in the same directory, and run the first file, not the second one.”

  6. Pierre B. says:

    “Nobody actually enjoys batch programming, but sometimes you can get away with writing in a language you like better while retaining the .cmd extension. Still, that leaves you having to get the extension for that language registered on your target machines, which can be tricky for xcopy-style deployment scenarios.”

    I don’t understand how you do that and some search on the web did not come up with anything of value.

    What do you need to register where to get a script in another language while retaining the .cmd extension?

    (The only alternative I see is that the quoted sentences are actually independent of each other even though they sound related the way they’re written?)

    1. pc says:

      The point is that you don’t need to register a separate extension, you can leave your file as a .cmd file, which Windows will run as a batch file, and when interpreted as a batch file it runs the interpreter of the language you actually want, which loads up that same file in that language.

      Which is a neat trick, and you can find on the web impressive polyglots where one file does something meaningful in a bazillion languages. Raymond here is trying to describe why such a thing may be practically useful in some particular circumstances.

    2. Antonio Rodríguez says:

      What Raymond is talking about is registering extensions for *other* script languages. For example, you can copy-deploy Perl by copying perl.exe to System32 (or anywhere within the path). But if you want to be able to run (“open”) *.pl files with Perl, you have to register the .pl extension and associate it with perl.exe.

  7. Stuart says:

    So does that mean that in future we’ll get a PowerShell Week instead of Batch File Week? Or would that get folded into CLR Week?

    1. I don’t know enough about PowerShell to write intelligently about it. I look forward to somebody else’s PowerShell week.

  8. Antonio Rodríguez says:

    Instead of using a polyglot header, I’d prefer to split the alternate language section into its own file with its correct file extension (what an idea!) and just use a simple batch file to call it. For example, if I wanted to use Python, I’d put all the Python code in a file called script.py, and would create a script.bat file with this single line:

    @python "%~d0%~p0%~n0.py" %*

    Advantages: leaving the Python section in its own file with the correct extension, and using a generic file to bootstrap any Python script (which can then be used as a template). Disadvantages: it uses two files instead of one, and the string of tilde operators (drive-path-filename without extension-.py extension) could be posted to the yearly Obfuscated C Contest if it were valid C O:-) . But, hey, I don’t think there would be much need to debug this, right? Right? (Famous last words…).

    1. Ramón Sola says:

      "%~dpn0.py" is shorter and cleaner, and I think it achieves the same result.

      1. Antonio Rodríguez says:

        Right. It is documented, and works both in Windows XP and Windows 7. It is also shorter than my solution, but not more readable. So it is equal or better in any respect :-) . Thanks.

    2. ErikF says:

      But with Raymond’s method, you only need to copy/download one file. That’s an awfully useful feature for a bootstrapping script.

      1. Antonio Rodríguez says:

        Remember that you are writing code in a foreign script language. Even in the case of Raymond’s method, you also need to copy the language interpreter (for example, python.exe) and the required libraries/extensions.

        The exception is JScript. But, as somebody else mentioned, every Windows version which includes JScript also comes with the relevant file associations, so you can run a .js script directly, without polyglot headers or launcher scripts.

        1. Andrew Cook says:

          You’re not limited to the shebang line here; you can add other commands in Batch before the call to your interpreter of choice. Including using BitsAdmin or FTP to download the interpreter, or NET USE at a WebDAV server and let Windows handle downloads for you.

          1. Ben Voigt says:

            Naturally, those tricks also work for fetching a separated script file.

  9. DWalker07 says:

    Ah, so “prefix stuff” in the template means “an indicator for the programming language or interpreter of choice to start ignoring stuff”. It took me a while to get that.

  10. James says:

    The normal Python interpreter also has a `-x` command-line option for this, but it skips only a single line of the file, so you have to jam everything into one line:

    python.exe -x "%~f0" %* & goto :EOF

  11. alegr1 says:

    :# This file contains BASH and CMD scripts
    :<<"::CMDLITERAL"
    @echo off
    goto :batch_file
    ::CMDLITERAL

    echo Bash invoked!

    exit 0

    :batch_file
    echo CMD invoked

  12. voo says:

    “Note that you can’t run PowerShell scripts by default. You have to go in and change an administrative setting first.”
    Indeed, rather annoying and not really a security feature since you can just write a batch script that invokes the PowerShell script and tells it to ignore the execution policy.

    Basically just a simple batch file that contains:
    powershell.exe -NoProfile -ExecutionPolicy Bypass -file realScript.ps1 %*

    Which is how Microsoft itself ships a lot of scripts these days (e.g. the TFS build agent start script). But now I have to see how to create a Polyglot PowerShell script – I’m sure there are already solutions for this on the web but it seems like a fun challenge!

    1. ender9 says:

      > But now I have to see how to create a Polyglot PowerShell script

      Not easy, because PowerShell refuses to run scripts that don’t end in .ps1 extension.

  13. mihailik says:

    I present to you THE DANGEROUS CLUBMAN
    — because of its leading character

    0</* :
    @cscript/nologo /E:jscript %~f0 %*&@goto:EOF&:*/0

    1. laonianren says:

      Bravo!

  14. Marvy says:

    At first I was thinking “there is no hope of this parsing correctly in any sane language”.

    Then I saw your Perl version, and I thought “fine, but that’s cheating, the -x is specially designed for such things, but most languages don’t have an equivalent”.

    Then I saw JScript, and I thought “fine, but you got lucky there that the @ character does what you want in both cases”.

    Then I saw Ruby in the comments and I thought “this is getting crazy, but you can’t keep getting lucky forever; I bet no one will manage to get it to work for Python”.

    Then I kept reading the comments and saw someone got it to work for Python. Maybe this is more versatile than I thought.

  15. Neil says:

    I use the `for` command for case-sensitive key-value lookup, something like this:

    for %%a in (Tue.s Wed.nes Thu.rs Sat.ur) do if %v%==%%~na set v=%v%%%~xa
    echo %v:.=%day

    (less readably you can switch the name and extension and add the . to the comparison instead to avoid having to remove it from the result)

  16. Martin Ba. _ says:

    “So maybe, if you’re lucky, you may be able to declare the end of the era of suffering with batch files.” – if only PowerShell wasn’t such a POOR replacement for batch files. PS is a nice shell+scripting environment. But man, the way it interfaces with the outside world (that is, executables and other scripts or batch files) is so arcane, I’d rather write batch files half the time.

    IMHO, MS has managed to replace batch hell with PS hell. You decide where you’d rather be.

  17. Stefan Kanthak says:

    Why don’t you use the more appropriate “exit /b” instead of “goto :eof” to exit the batch script?

      1. Erkin Alp Güney says:

        He might have forgotten that he was asking Microsoft’s backward compatibility specialist.

  18. Erkin Alp Güney says:

    WSL would make this more complicated.



*DISCLAIMER: I DO NOT OWN THIS CONTENT. If you are the owner and would like it removed, please contact me. The content herein is an archived reproduction of entries from Raymond Chen's "Old New Thing" Blog (most recent link is here). It may have slight formatting modifications for consistency and to improve readability.
