Date: | January 4, 2007 / year-entry #4 |
Tags: | other |
Orig Link: | https://blogs.msdn.microsoft.com/oldnewthing/20070104-00/?p=28513 |
Comments: | 20 |
Summary: | Bangs or bells. |
Here's a minor mystery:

    echo •

That last character is U+2022. Select that line with the mouse, right-click, and select Copy to copy it to the clipboard. Now go to a command prompt and paste it and hit Enter. You'd expect a • to be printed, but instead you get a beep. What happened?

Here's another clue. Run this program.

    class Mystery {
     public static void Main() {
      System.Console.WriteLine("\x2022");
     }
    }

Hm, there's that beep again. How about this program:

    #include <stdio.h>
    #include <windows.h>

    int __cdecl main(int argc, char **argv)
    {
     char ch;
     if (WideCharToMultiByte(CP_OEMCP, 0, L"\x2022", 1,
                             &ch, 1, NULL, NULL) == 1) {
      printf("%d\n", ch);
     }
     return 0;
    }

Run this program and it prints "7". By now you should have figured out what's going on. In the OEM code page, the bullet character is being converted to a beep. But why is that?
What you're seeing is
"But converting a bullet to 0x07 is clearly wrong. I mean, who expects a printable character to turn into a control character?" Well, you're assuming that the code that does the conversion is going to interpret it as a control character. The code might treat it as a glyph character, like this:

    // starting with the scratch program
    void PaintContent(HWND hwnd, PAINTSTRUCT *pps)
    {
     HFONT hfPrev = SelectFont(pps->hdc, GetStockFont(OEM_FIXED_FONT));
     TextOut(pps->hdc, 0, 0, "\x07", 1);
     SelectFont(pps->hdc, hfPrev);
    }
Run this program and you get a happy bullet in the corner of the
window.
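The two readings of byte 0x07 can be sketched without any Windows APIs. The table below lists the glyphs that code page 437 assigns to bytes 0x01 through 0x1F. The function names are hypothetical, and the sketch assumes the OEM code page is 437; the conversion that CP_OEMCP actually performs depends on the system's configured OEM code page.

```c
#include <stddef.h>

/* Code page 437 glyphs for bytes 0x01..0x1F, as Unicode code points.
 * Index 0 is a placeholder: this sketch assigns no glyph to byte 0x00. */
static const unsigned short cp437_low_glyphs[32] = {
    0x0000, 0x263A, 0x263B, 0x2665, 0x2666, 0x2663, 0x2660, 0x2022,
    0x25D8, 0x25CB, 0x25D9, 0x2642, 0x2640, 0x266A, 0x266B, 0x263C,
    0x25BA, 0x25C4, 0x2195, 0x203C, 0x00B6, 0x00A7, 0x25AC, 0x21A8,
    0x2191, 0x2193, 0x2192, 0x2190, 0x221F, 0x2194, 0x25B2, 0x25BC
};

/* Hypothetical helper: map a Unicode code point to the CP437 byte
 * (1..31) whose glyph it is, or return -1 if it is not in the table. */
int glyph_to_oem_byte(unsigned int code_point)
{
    size_t i;
    for (i = 1; i < 32; i++) {
        if (cp437_low_glyphs[i] == code_point) {
            return (int)i;
        }
    }
    return -1;
}

/* Hypothetical helper: read a byte as a glyph, the way code that
 * draws with an OEM font would. Returns 0 for bytes outside 1..31. */
unsigned int oem_byte_as_glyph(unsigned char b)
{
    return (b >= 1 && b < 32) ? cp437_low_glyphs[b] : 0;
}

/* Hypothetical helper: read the same byte as a control character,
 * the way a teletype-style console would. 0x07 is BEL. */
int oem_byte_is_bell(unsigned char b)
{
    return b == 0x07;
}
```

Under these assumptions, glyph_to_oem_byte(0x2022) yields 7, and that one byte is a bullet to oem_byte_as_glyph and a bell to oem_byte_is_bell. Which one you get depends only on who consumes the byte.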
The
The
(Postscript: You can see this happening in reverse from the
command prompt.
Then again, since this problem is itself a reversal, I guess
you could say the behavior is happening in the forward direction now...
Type |
Comments (20)
Ctrl+G=BEL
Can’t reproduce on XPSP2. I’m copying it into CMD.EXE console window, and echo actually copies the bullet character.
Just tried "echo Ctrl+G", and it beeps nicely, so the internal speaker is OK.
Whether this issue is reproducible depends on the settings for the command line window. Since I set my default cmd to Lucida Console some time ago it can actually display Unicode characters directly on the console. If, however, the window is set to use a raster font it only has the OEM codepage and will convert the bullet character into BEL which, on echo, will beep.
Raymond,
Isn’t that behavior due to the fact that ASCII 1 is ☺ ?
Interesting. A little experiment with a command prompt (XPSP2 again) reveals the following behaviour:
Most control characters display as ^char and echo as glyphs. ^C aborts. ^G "echos" as a beep only. ^H and ^I are interpreted by the line editor as backspace and tab, respectively. ^J seems to be ignored completely. ^K and ^L echo as glyphs only amongst other input; on their own they just give the "ECHO is on." message. ^M is interpreted as ENTER. ^S suppresses the next character. ^Z is the end-of-input character; anything after it on a line appears in the line editor but is not echoed.
Bored? Moi?
jim missed out ^@, which seems to act as an end-of-input-on-the-current-line character, then prompts for more input on the next line. e.g.:
C:\>echo hello^@world
More? dolly
hellodolly
jim’s comment about ^Z reminded me of the article "Using the echo command to remember what you were doing." (http://blogs.msdn.com/oldnewthing/archive/2004/04/29/123012.aspx).
Instead of pressing the "home"-button and then typing "echo " you can just press the "home"-button and type ^Z (i.e. press CTRL+Z) and get the same effect. 4 keys less to type :)
Nish: ASCII 1 is not ☺, it is Start Of Heading (SOH). It may be ☺ in IBM codepage 437, but that’s a different thing entirely. Whether the ‘C0 Controls’ displayed as those symbols or performed their control function depended on which API you were using to display text. If you use the raw ‘display a character’ BIOS API or write directly into the display buffer, you get the display character; if you use the ‘display a string’ API the character is interpreted.
I wrote a library to help port from Symbol Series 3000 (DOS-based with an extended largely-IBM-compatible BIOS) to Windows CE. My implementation of the ‘display a string’ API currently doesn’t emulate a teletype, so character code 7 produces a bullet rather than a beep. The C0 controls weren’t actually used much in Series 3000 programs so we tend to fix the program rather than add teletype support to the library.
Mike Dimmick beat me to explaining ☺, but to Carlos: CTRL-@ is equal to character code 0 (zero), which is ASCII NUL. The use of zero in C as a string terminator might have something to do with the behaviour you’re seeing… or it might not :-)
If you look at an ASCII table, you can see that @ is character 64, and the action of the CTRL key is (nominally) to reset the sixth bit, so that CTRL-@ -> 0 == NUL, CTRL-A -> 1 == SOH, CTRL-G -> 7 == BEL and so forth:
http://www.asciitable.com/
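The nominal Ctrl rule described above can be sketched in a few lines of C. The function name is hypothetical, and the sketch assumes the rule applies only to the ASCII range '@' through '_', where clearing the 0x40 bit (the bit the comment calls the sixth, counting from zero) lands on a control code.

```c
/* Hypothetical helper: return the control code that Ctrl+key nominally
 * produces for keys '@'..'_' (0x40..0x5F), or -1 for any other key. */
int ctrl_code(char key)
{
    if (key < '@' || key > '_') {
        return -1;
    }
    /* Clearing bit 0x40 maps 0x40..0x5F onto 0x00..0x1F. */
    return key & ~0x40;
}
```

With this rule, ctrl_code('@') is 0 (NUL), ctrl_code('A') is 1 (SOH), ctrl_code('G') is 7 (BEL), and ctrl_code('M') is 13 (CR). Real keyboards and consoles also accept lowercase letters for the same codes, which this sketch deliberately leaves out.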
Somehow I had completely forgotten that there were glyphs in those control characters in the olden days. You had me scratching my head for a few minutes thinking "why would a BEL be a bullet?!".
I’m sure lots of people of a suitable age remember making silly little demos/games involving those smiley face characters, the card symbols and the musical note. Those arrow characters were quite useful for scroll bars, too. I guess it would have been a waste not to use those characters for glyphs too, since those <32 values could be written to video memory just fine.
(jim) >>"^J seems to be ignored completely …^M is interpreted as ENTER."
That is interesting. The ^J should be a Linefeed, and ^M should be a Carriage Return. Of course, with Windows files, a line ending is noted with the CR LF combination. I guess the Enter key in cmd.exe only sends a CR?
So ‘echo ^D’ prints a diamond. D for diamond, it all makes sense now!
Regarding andy’s comment, if I want to save what I’ve typed at a prompt, I usually don’t press HOME at all. What I do (that works with Bash and CMD) is just press CTRL-C. This cancels and gives me a new prompt, but leaves whatever I had on the previous line intact. Very handy.
On the other hand, if I try, also on XPSP2, it copies the bullet *and* beeps.
[But it doesn’t go into the command history, which is mighty inconvenient. -Raymond]
Ah, true enough. I didn’t think about that.
An alternative that does require use of HOME is to type a colon at the start of the line. It gets treated the same as a REM and does stay in the command history.
> GetStockFont(OEM_FIXED_FONT)
I think you need to go into Control Panel and set your system’s default language for non-Unicode programs. In fact even if your program IS Unicode I think you have to do that setting. I should read and experiment to see if AppLocale will take care of it. Anyway just getting the default code page changed doesn’t get the default font changed.
> Type echo ^A where you actually type Ctrl+A
> where I wrote ^A. The result:
The result is a quotation mark. I think in a command prompt window the command “mode con cp select=” some number will adjust the font together with the code page.
In the command window properties, if the font is set to Lucida it prints a bullet; if raster fonts is chosen it sounds a bell, as Raymond noted.
As pointed out by Johannes Rössel in the 5th reply.
Way back near the beginning of development of TFS version control, which was called Hatteras back then,
Happy New Year to Raymond. Perhaps associating bullet to beep is intentional ;)