Solving this next problem should be a snap with your
nascent psychic powers:
I'm trying to use FormatMessage
to load a resource string with one insertion in it,
and this doesn't work for some reason.
The string is
"Blah blah blah %1. Blah blah blah."
The call to FormatMessage fails,
and GetLastError() returns ERROR_RESOURCE_TYPE_NOT_FOUND.
What am I doing wrong?
LPTSTR pszInsertion = TEXT("Sample");
LPTSTR pszResult;
FormatMessage(
    FORMAT_MESSAGE_ALLOCATE_BUFFER |
    FORMAT_MESSAGE_FROM_HMODULE |
    FORMAT_MESSAGE_ARGUMENT_ARRAY,
    // I also tried an instance handle and NULL.
    GetModuleHandle(NULL),
    IDS_MY_CUSTOM_MESSAGE,
    MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT), // default language
    (LPTSTR) &pszResult,
    0,
    (va_list*) &pszInsertion);
Hint: Take a closer look at the parameter IDS_MY_CUSTOM_MESSAGE.
Hint 2: What does "IDS_" tell you?
Resource identifiers that begin with "IDS_"
are typically string resource identifiers, not message resource
identifiers.
There is no strong consensus on the naming convention for
message resource identifiers,
although I've seen "MSG_".
Part of the reason why there is no strong consensus on the naming
convention for message resource identifiers is that almost nobody
uses message resources!
I don't understand why they were added to Win32, since there
was already a way of embedding strings in resources,
namely, string resources.
That's why you're getting ERROR_RESOURCE_TYPE_NOT_FOUND.
There is no message resource in your module.
If you're not going to use a message resource, you'll have to
use the FORMAT_MESSAGE_FROM_STRING flag and
pass the format string explicitly.
DWORD_PTR rgdwInsertions[1] = { (DWORD_PTR)TEXT("Sample") };
TCHAR szFormat[256];
LoadString(hInstance, IDS_MY_CUSTOM_MESSAGE, szFormat, 256);
LPTSTR pszResult;
FormatMessage(
    FORMAT_MESSAGE_ALLOCATE_BUFFER |
    FORMAT_MESSAGE_FROM_STRING |
    FORMAT_MESSAGE_ARGUMENT_ARRAY,
    szFormat,
    0,
    0,
    (LPTSTR) &pszResult,
    0,
    (va_list*) &rgdwInsertions);
I also made a slight change to the final parameter.
When you use FORMAT_MESSAGE_ARGUMENT_ARRAY,
the last parameter must be an array of DWORD_PTRs.
(The parameter must be cast to va_list* to keep
the compiler happy.)
It so happens that the original code got away with this mistake
since sizeof(DWORD_PTR) == sizeof(LPTSTR) and they
both have the same alignment requirements.
On the other hand, if the insertion were a DWORD,
passing (va_list*)&dwValue is definitely wrong
and can crash if you're sufficiently unlucky.
(Determining the conditions under which your luck runs out
is left as an exercise.)
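To make the argument-array convention concrete, here is a toy expander in portable C. It is only a sketch of the idea, not how FormatMessage is implemented: the name expand_insertions is made up for illustration, uintptr_t stands in for DWORD_PTR, and it handles only plain %1 through %9 string insertions (no "%%", no format specifiers). The point it illustrates is that every insertion is read out of a pointer-sized slot, so anything narrower has to be widened into its own slot first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy illustration (hypothetical helper, not a Win32 API).
   Copies fmt into out, replacing %1..%9 with the string whose address is
   stored in args[n-1]. Each slot is pointer-sized, like a DWORD_PTR.
   Returns the number of characters written (excluding the terminator),
   or (size_t)-1 if out is too small or an insertion has no argument. */
size_t expand_insertions(char *out, size_t cap, const char *fmt,
                         const uintptr_t *args, size_t nargs)
{
    size_t used = 0;
    assert(out != NULL && fmt != NULL);

    while (*fmt != '\0') {
        const char *piece = fmt;   /* default: copy one literal character */
        size_t len = 1;

        if (fmt[0] == '%' && fmt[1] >= '1' && fmt[1] <= '9') {
            size_t n = (size_t)(fmt[1] - '1');
            if (n >= nargs)
                return (size_t)-1;
            piece = (const char *)args[n];  /* slot holds a string pointer */
            len = strlen(piece);
            fmt += 2;
        } else {
            fmt += 1;
        }

        if (used + len >= cap)      /* always leave room for the terminator */
            return (size_t)-1;
        memcpy(out + used, piece, len);
        used += len;
    }

    if (used >= cap)                /* only reachable when cap is 0 */
        return (size_t)-1;
    out[used] = '\0';
    return used;
}
```

A caller would fill the array the same way the corrected code above does, for example uintptr_t args[1] = { (uintptr_t)"Sample" };, and pass it together with the format string.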
The kernel guys use structured 32-bit values for all status codes, and there is a strong need for a (localizable) message for every status code.
"Win32 resources" are not an exact fit for this requirement, for a number of reasons. A more natural (IMO) approach is to have a single source file that defines the numeric codes and the message strings associated with same.
I suspect that the best reasons for this, though, are
1) The kernel guys came from the VMS culture where use of a message source file/message compiler was the system norm.
2) The kernel, and subsystem, use of same was in existence long before there was a Win32, i.e. back when the OS was "NT OS/2".
This doesn’t of course give any direct reason why Win32 had to adopt the same convention as the kernel.
Yeah…What dave said.
Why is the last arg explicitly typed versus the old “…” indicator for a variable argument list?
Is it a code-safety related change (so you don’t pass garbage, or forget a few params that cause garbage to be read off the stack, giving a potential attack vector), a MS preferred implementation manner, a language standard change or __ ?
Strings in message tables are assumed to be in the "current ANSI codepage", whatever that means, and for me that alone is almost incentive enough to never use them.
Strings in string tables are UTF-16, but they are, curiously, packed in bundles of 16 per resource, they are not NUL-terminated and have, instead, a prefix USHORT with the character count (basically making them wide-character Pascal strings). Whoever was responsible for that design sure had to be very proud of it.
Win32 doesn’t have a standard counted string type, so they cannot be used directly from memory and need to be copied instead. Word on the street is LoadString lets you access the original pointer+size in the resource section as an undocumented feature – figure it out by yourself. Personally, I prefer a combination of FindResource/LoadResource/LockResource and manually parsing the string bundle.
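For readers who want to try the manual route, the bundle layout described above can be walked with a few lines of portable C. This is a sketch under stated assumptions: the function names are made up, the bundle is assumed to be already in memory (for instance, the pointer returned by LockResource), and the mapping from string ID to bundle number, (id / 16) + 1, is the commonly described convention rather than something stated in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers: which bundle holds a string ID, and at which slot. */
unsigned bundle_number(unsigned string_id) { return (string_id >> 4) + 1; }
unsigned bundle_index(unsigned string_id)  { return string_id & 15; }

/* Return a pointer to the characters of entry `index` (0..15) in a bundle
   of 16 length-prefixed UTF-16 strings, and store the character count in
   *len. The characters are NOT NUL-terminated; empty slots have length 0. */
const uint16_t *bundle_entry(const uint16_t *bundle, unsigned index, size_t *len)
{
    assert(bundle != NULL && len != NULL && index < 16);

    for (unsigned i = 0; i < index; i++)
        bundle += 1 + bundle[0];   /* skip the length prefix and the characters */

    *len = bundle[0];
    return bundle + 1;
}
```

Because the result is a pointer plus a length rather than a NUL-terminated string, the caller still has to copy it before handing it to most Win32 functions, which is the inconvenience the comment above complains about.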
For some other curious reason, LoadString and wvsprintf are implemented in user32.dll rather than kernel32.dll, pulling in user32.dll (and gdi32.dll, and related kernel mode overhead) unnecessarily from many otherwise minimalistic libraries.
Raymond: I believe DWORD_PTRs are supposed to be equivalent to pointers in everything (size, alignment) but semantics.
<<Strings in message tables are assumed to be in the "current ANSI codepage">>
No, they are not.
Just compile a .mc file (using mc.exe) and take a look inside the resulting .bin file.
> Better inconvenient than impossible. -Raymond
The “standard” solution to that is to provide two functions, one that takes a … and another that takes a va_list. See, for instance, fprintf and vfprintf.
(But since the … version merely forwards to the va_list version, with some va_start/va_end wrapping, it’s not like it’d be hard for users to write their own wrapper for FormatMessage. It might be nice if there was a standard wrapper, but oh well.)
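The two-function pattern mentioned here looks roughly like the following in portable C. It is a sketch using vsnprintf as the underlying formatter, not a wrapper around FormatMessage itself; the names format_message_v and format_message are invented for the example.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* The "v" version takes an already-started va_list, like vfprintf. */
int format_message_v(char *buf, size_t cap, const char *fmt, va_list ap)
{
    assert(buf != NULL && fmt != NULL);
    return vsnprintf(buf, cap, fmt, ap);
}

/* The "..." version only wraps va_start/va_end around the "v" version,
   like fprintf forwarding to vfprintf. */
int format_message(char *buf, size_t cap, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = format_message_v(buf, cap, fmt, ap);
    va_end(ap);
    return n;
}
```

Exposing only the va_list form, as FormatMessage does, means callers can always build the variadic convenience wrapper themselves, whereas the reverse is not possible.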
Re:
Maybe he meant that, by default, the .mc source file is assumed to be in the current ANSI code page. Indeed it is.
But, actually, mc gives you "Unicode or ANSI" control over both input (-a/-u) and output (-A/-U), with -a -U being the default.
Possibly such control is a relatively new feature; I dunno.
I think the question is valid.
Given that the function name is FormatMessage, you kind of expect the word MESSAGE to be in some of the flags that are passed to it. In particular, FORMAT_MESSAGE_FROM_HMODULE sounds like a request to load the "message to be formatted" from a resource. The person who asked the question probably didn’t know about the existence of "message resources". Similarly, FORMAT_MESSAGE_FROM_STRING, when compared to FORMAT_MESSAGE_FROM_HMODULE, sounds like the function takes a pointer instead of a resource ID.
Better names would be
FORMAT_MESSAGE_FROM_STRING_RESOURCE and
FORMAT_MESSAGE_FROM_MESSAGE_RESOURCE
The bigger problem with FormatMessage is that it takes too many options. In the number of lines the guy spent calling FormatMessage, he could open a text file, scan down to the line containing his message ID, and rip out the string in question. Localizing the file in question would then be a job for the installer, which seems like the right level at which to solve the problem. Such text files can also be worked on by the less technically inclined localization guys (as compared to .rc files, which can break the build).
And again, in order to make life easier for the programmer, why deal with these IDs at all? String IDs just tend to repeat, in a mangled form, the contents of the message. A solution I’ve seen successfully used involved simply wrapping each string in the source that’s supposed to be localized with a function call:
LocalizeString("some text")
A command-line tool then scanned the entire source tree for occurrences of the pattern LocalizeString(c++ string). All these strings were saved to the messages file, to which the localization engineer would add translations:
en: some text
ru: рыба
LocalizeString then looked things up in this file. The engineers got readability out of this, and saved one identifier per message.
Orphan messages were also eliminated. Isn’t this a big problem? Any given Windows program that uses string resources probably has 20% of "garbage" strings (strings that are no longer used).
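A minimal sketch of the lookup side of such a scheme might look like this. Everything here is an assumption made for illustration: the function is named localize_string and takes an explicit language code, and the messages file is replaced by a small in-memory table, since the comment does not describe how the real tool stored or selected translations.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One entry of the (hypothetical) parsed messages file. The English source
   text doubles as the lookup key, so no separate identifier is needed. */
struct translation {
    const char *en;
    const char *ru;
};

static const struct translation g_messages[] = {
    { "some text", "рыба" },
};

/* Return the translation of `text` for `lang`, or `text` itself when the
   language is not "ru" or no translation has been recorded. */
const char *localize_string(const char *lang, const char *text)
{
    assert(lang != NULL && text != NULL);

    if (strcmp(lang, "ru") == 0) {
        for (size_t i = 0; i < sizeof g_messages / sizeof g_messages[0]; i++) {
            if (strcmp(g_messages[i].en, text) == 0)
                return g_messages[i].ru;
        }
    }
    return text;
}
```

Falling back to the source text keeps untranslated strings readable, at the cost of silently mixing languages when the table is incomplete.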
"I think the question is valid."
Who said it wasn’t?
“Who said it wasn’t?”
The sarcastic mention of psychic powers required to solve the problem is another way of saying “it doesn’t take a wizard to figure this out”. So the person asking the question must be a step below plebeian.
If you asked me a question and I offered you to engage your “nascent psychic powers”, wouldn’t you feel put down?
"As indeed it does"
Right… So basically FormatMessage can’t access string resources, for which LoadString should be used instead.
Mihai: turns out we are both right. Some tables contain ANSI strings (see: ntoskrnl.exe) but most have UTF-16 strings. It makes sense that the symbolic names for bugcheck codes (which is what ntoskrnl.exe’s message table contains) would be ANSI, since the kernel debugging API doesn’t support Unicode. I guess there’s a flag for it in the resource format.
>Similarly, FORMAT_MESSAGE_FROM_STRING,
>when compared to FORMAT_MESSAGE_FROM_HMODULE,
>sounds like the function takes a pointer
>instead of a resource ID.
As indeed it does.
The doc:
The lpSource parameter is a pointer to a null-terminated message definition. The message definition may contain insert sequences, just as the message text in a message table resource may.
/* Here, "message definition" means "array of characters". */
and:
dwMessageId
[in] Message identifier for the requested message. This parameter is ignored if dwFlags includes FORMAT_MESSAGE_FROM_STRING.
Alexei: in my experience psychic debugging involves someone making a mistake which you are able to guess because it’s a common mistake, and maybe a mistake you’ve made yourself in the past.
It doesn’t mean that the person asking the question is an idiot for not knowing the answer.
Well, there’s always "try it and see". But that’s perhaps too difficult?
I don’t understand why you think this is such a big complicated deal. It’s obvious to me that "FROM_STRING" means "from a string". Perhaps I’m way too literal-minded.
> If you’re not going to use a message resource,
> you’ll have to use the FORMAT_MESSAGE_FROM_STRING
> flag and pass the format string explicitly.
I believe you, but look at this:
* FORMAT_MESSAGE_FROM_STRING
* The lpSource parameter is a pointer to a
* null-terminated message definition.
That sure looks like lpSource should point to a message resource. Even though a message is a string, MSDN calls for a message definition. If someone didn’t guess the difference between FORMAT_MESSAGE_FROM_HMODULE and FORMAT_MESSAGE_FROM_STRING, your blog is the only way they can find out.
Tuesday, May 29, 2007 3:36 PM by Alexei Lebedev
> In the number of lines the guy spent calling
> FormatMessage, he could open a text file, scan
> down to the line containing his message ID,
> and rip out the string in question.
Thereby making execution very inefficient on every target machine every time it gets executed, instead of once while trying to figure out how to code the function call. If all the lines of parameters were understandable then it would be better to accept the number of parameters.
> String IDs just tend to repeat, in a mangled
> form, the contents of the message.
Not always. Thank you for helping provide a counterexample.
Contents of the message: рыба
ID: Not ID_рыба
> LocalizeString then looked things up in this file.
And that’s why, for example, a vendor’s web page shows a list of products, in which the left hand column (except for the top row) is an integer starting at 1 and counting up, and the left hand column’s header (top row) is a word which means the opposite of "Yes". The vendor started with a word that means the opposite of "Yes" in one or two languages, and localization took that meaning, instead of finding localizations of a different meaning of that word.
Norman Diamond: it’s called a "message definition" and not a "message" because you pass a formatting string with argument placeholders. The input string is a "message definition", the "message" is the final output
At KJK::Hyperion:
You wrote:
"Win32 doesn’t have a standard counted string type,"
Yes, that is right. But the kernel does have it, as ANSI_STRING as well as UNICODE_STRING. In fact, these are most often used for strings there.
BSTRs are a part of Win32, and they are counted.
ac: see what JustMe said. BSTRs have their own Very Special allocator (requiring, in this case, to copy the string anyway), you can’t use them to refer to a string in an arbitrary range of memory
> almost nobody uses message resources
Except for those who write to the event log.
Thursday, May 31, 2007 8:17 AM by Jonathan
Either that, or including those who write to the event log ^_^
For a few years I was confused by a ton of event log messages talking about not having resources for remote computers. Then one day I wanted to add some debugging traces to a program, but didn’t want to spend a few days figuring out how to obey MSDN’s rules just to record debugging traces, so I recorded strings the same way I used to do with printk. Oh, so that’s where all those log messages came from, talking about not having resources for remote computers. When a very small software company made <deleted> around 7 years ago, they were as lazy as I was.