Why is the default console codepage called “OEM”?

Date: August 29, 2005 / year-entry #243
Tags: history
Orig Link: https://blogs.msdn.microsoft.com/oldnewthing/20050829-00/?p=34403
Comments: 7
Summary: Because it once was, though no longer is.

Last year, we learned that the ANSI code page isn't actually ANSI. Indeed, the OEM code page isn't actually OEM either.

Back in the days of MS-DOS, there was only one code page, namely, the code page that was provided by the original equipment manufacturer in the form of glyphs embedded in the character generator on the video card. When Windows came along, the so-called ANSI code page was introduced and the name "OEM" was used to refer to the MS-DOS code page. Michael Kaplan went into more detail earlier this year on the ANSI/OEM split.

Over the years, Windows has relied less and less on the character generator embedded in the video card, to the point where the term "OEM character set" no longer has anything to do with the original equipment manufacturer. It is just a convenient term to refer to "the character set used by MS-DOS and console programs." Indeed, if you take a machine running US-English Windows (OEM code page 437) and install, say, Japanese Windows, then when you boot into Japanese Windows, you'll find that you now have an OEM code page of 932.
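As a rough illustration that is not part of the original post, the Python sketch below decodes the same two bytes under three code pages. It assumes Python's built-in `cp437`, `cp1252`, and `cp932` codecs are close enough stand-ins for the corresponding Windows code pages; the names `RAW`, `CODE_PAGES`, and `decode_under` are hypothetical and exist only for this example.

```python
# Hypothetical illustration: the same bytes mean different things
# depending on which code page is used to interpret them.

RAW = b"\x82\xa0"  # two arbitrary bytes chosen for the demonstration

# Python codec name -> the role that code page plays on Windows.
CODE_PAGES = {
    "cp437": "OEM, US-English",
    "cp1252": "ANSI, US-English",
    "cp932": "OEM, Japanese",
}


def decode_under(raw: bytes, code_page: str) -> str:
    """Decode raw bytes using the named Python codec."""
    return raw.decode(code_page)


if __name__ == "__main__":
    for name, role in CODE_PAGES.items():
        text = decode_under(RAW, name)
        # repr() makes non-printing characters visible.
        print(f"{name} ({role}): {text!r}")
```

Under code page 437 the two bytes are two accented Latin letters, under code page 932 they form a single hiragana character, and under code page 1252 they are a low quotation mark followed by a non-breaking space. This is why a console program and a GUI program on the same machine can disagree about what a byte string says.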


Comments (7)
  1. Great! I’ve always wondered why Microsoft uses the terms "ANSI" and "OEM", seemingly ignoring the fine registry at http://www.iana.org/assignments/character-sets

    Could you also please explain where the term "code page" came from? I find it a bit confusing, and prefer "character set" and "character encoding" as defined by the W3C at http://www.w3.org/International/resource-index

  2. Code Page is a term that IBM adopted for PC-DOS 3.3 when they ported their NLS system from their mainframes to PC-DOS (IBM did all the development work for PC-DOS 3.3 and PC-DOS 4.0).

    I’m not sure where IBM got the phrase from.

  3. Mihai says:

    For Christoffer:

    >> prefer "character set" and "character encoding"

    "Code page" does not mean the same thing as "character set" or "character encoding" (which are also not the same as each other).

    Code page matches "coded character set" in the UNIX world.

    You can talk about the "Latin script" or "Cyrillic character set". This is the collection of characters, no numbers associated. Mac and Windows can use different char-to-number mappings (code pages/coded character sets) for the same character set (charset).

    This is why for a font you select the charset.

    Once you map characters to numeric values, it becomes a "coded character set" (or code page).

    Then you might have several ways to represent the same code page. These are "encodings". UTF-7, UTF-8, UTF-16, UTF-32, "Java escaped (\u3213)", MIME, and Base-64 are different encodings of the same coded character set, just as decimal, hex, binary, and octal are different representations of the same number.

    For a long time the Unicode Consortium confused "coded character set" with "encoding", now they are starting to fix it.

  4. stu says:

    "Back in the days of MS-DOS, there was only one code page, namely, the code page that was provided by the original equipment manufacturer in the form of glyphs embedded in the character generator on the video card."

    Maybe in the really early days of MS-DOS, but by DOS 5 at least and probably before, there were loadable code pages.

  5. Cheong says:

    [quote]

    Maybe in the really early days of MS-DOS, but by DOS 5 at least and probably before, there were loadable code pages.

    [/quote]

    Yes. It seems that "country.sys" and the "nlsfunc" command are available only after MS-DOS 3.3.

  6. Matthew Lock says:

    Does anyone know why Japanese DOS uses the yen sign as the path separator rather than the backslash?



*DISCLAIMER: I DO NOT OWN THIS CONTENT. If you are the owner and would like it removed, please contact me. The content herein is an archived reproduction of entries from Raymond Chen's "Old New Thing" Blog (most recent link is here). It may have slight formatting modifications for consistency and to improve readability.