Date: | December 29, 2004 / year-entry #438 |
Tags: | code |
Orig Link: | https://blogs.msdn.microsoft.com/oldnewthing/20041229-00/?p=36853 |
Comments: | 28 |
Summary: | The COM model for enumeration (enumeration objects) is biased towards making life easy for the consumer and hard for the producer. The enumeration object (producer) needs to be structured as a state machine, which can be quite onerous for complicated enumerators, for example, tree walking or composite enumeration. On the other hand, the callback model... |
The COM model for enumeration (enumeration objects) is biased towards making life easy for the consumer and hard for the producer. The enumeration object (producer) needs to be structured as a state machine, which can be quite onerous for complicated enumerators, for example, tree walking or composite enumeration. On the other hand, the callback model for producer (used by most Win32 functions) is biased towards making life easy for the enumerator and hard for the consumer. This time, it is the consumer that needs to be structured as a state machine, which is more work if the consumer is doing something complicated with each callback. (And even if not, you have to create a context structure to pass state from the caller, through the enumerator, to the callback.) For example, suppose we want to write a routine that walks a directory structure, allowing the caller to specify what to do at each decision point. Let's design this first using the callback paradigm: #include <windows.h> #include <shlwapi.h> #include <stdio.h> enum FERESULT { FER_CONTINUE, // continue enumerating // (if directory: recurse into it) FER_SKIP, // skip this file/directory FER_STOP, // stop enumerating }; enum FEOPERATION { FEO_FILE, // found a file FEO_DIR, // found a directory FEO_LEAVEDIR, // leaving a directory }; typedef FERESULT (CALLBACK *FILEENUMCALLBACK) (FEOPERATION feo, LPCTSTR pszDir, LPCTSTR pszPath, const WIN32_FIND_DATA* pwfd, void *pvContext); FERESULT EnumDirectoryTree(LPCTSTR pszDir, FILEENUMCALLBACK pfnCB, void* pvContext);
The design here is that the caller calls
Designing this as a callback makes life much simpler
for the implementation of FERESULT EnumDirectoryTree( LPCTSTR pszDir, FILEENUMCALLBACK pfnCB, void *pvContext) { FERESULT fer = FER_CONTINUE; TCHAR szPath[MAX_PATH]; if (PathCombine(szPath, pszDir, TEXT("*.*"))) { WIN32_FIND_DATA wfd; HANDLE hfind = FindFirstFile(szPath, &wfd); if (hfind != INVALID_HANDLE_VALUE) { do { if (lstrcmp(wfd.cFileName, TEXT(".")) != 0 && lstrcmp(wfd.cFileName, TEXT("..")) != 0 && PathCombine(szPath, pszDir, wfd.cFileName)) { FEOPERATION feo = (wfd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) ? FEO_DIR : FEO_FILE; fer = pfnCB(feo, pszDir, szPath, &wfd, pvContext); if (fer == FER_CONTINUE) { if (feo == FEO_DIR) { fer = EnumDirectoryTree(szPath, pfnCB, pvContext); if (fer == FER_CONTINUE) { fer = pfnCB(FEO_LEAVEDIR, pszDir, szPath, &wfd, pvContext); } } } else if (fer == FER_SKIP) { fer = FER_CONTINUE; } } } while (FindNextFile(hfind, &wfd)); FindClose(hfind); } } return fer; }
Note: I made no attempt to make this function at all efficient
since that's not my point here.
It's highly wasteful of stack space (which can cause problems
when walking deep directory trees).
This function also doesn't like paths deeper than Well, that wasn't so hard to write. But that's because we made life hard for the consumer. The consumer needs to maintain state across each callback. For example, suppose you wanted to build a list of directories and their sizes (both including and excluding subdirectories). class EnumState { public: EnumState() : m_pdirCur(new Directory(NULL)) { } ~EnumState() { Dispose(); } FERESULT Callback(FEOPERATION feo, LPCTSTR pszDir, LPCTSTR pszPath, const WIN32_FIND_DATA* pwfd); void FinishDir(LPCTSTR pszDir); private: struct Directory { Directory(Directory* pdirParent) : m_pdirParent(pdirParent) , m_ullSizeSelf(0) , m_ullSizeAll(0) { } Directory* m_pdirParent; ULONGLONG m_ullSizeSelf; ULONGLONG m_ullSizeAll; }; Directory* Push(); void Pop(); void Dispose(); Directory* m_pdirCur; }; EnumState::Directory* EnumState::Push() { Directory* pdir = new Directory(m_pdirCur); if (pdir) { m_pdirCur = pdir; } return pdir; } void EnumState::Pop() { Directory* pdir = m_pdirCur->m_pdirParent; delete m_pdirCur; m_pdirCur = pdir; } void EnumState::Dispose() { while (m_pdirCur) { Pop(); } } void EnumState::FinishDir(LPCTSTR pszDir) { m_pdirCur->m_ullSizeAll += m_pdirCur->m_ullSizeSelf; printf("Size of %s is %I64d (%I64d)\n", pszDir, m_pdirCur->m_ullSizeSelf, m_pdirCur->m_ullSizeAll); } ULONGLONG FileSize(const WIN32_FIND_DATA *pwfd) { return ((ULONGLONG)pwfd->nFileSizeHigh << 32) + pwfd->nFileSizeLow; } FERESULT EnumState::Callback(FEOPERATION feo, LPCTSTR pszDir, LPCTSTR pszPath, const WIN32_FIND_DATA* pwfd) { if (!m_pdirCur) return FER_STOP; switch (feo) { case FEO_FILE: m_pdirCur->m_ullSizeSelf += FileSize(pwfd); return FER_CONTINUE; case FEO_DIR: if (Push()) { return FER_CONTINUE; } else { return FER_SKIP; } case FEO_LEAVEDIR: FinishDir(pszPath); /* Propagate size into parent */ m_pdirCur->m_pdirParent->m_ullSizeAll += m_pdirCur->m_ullSizeAll; Pop(); return FER_CONTINUE; default: return FER_CONTINUE; } /* notreached */ } FERESULT CALLBACK EnumState_Callback( FEOPERATION feo, LPCTSTR pszDir, LPCTSTR pszPath, const WIN32_FIND_DATA* pwfd, void* pvContext) { EnumState* pstate = reinterpret_cast<EnumState*>(pvContext); return pstate->Callback(feo, pszDir, pszPath, pwfd); } int __cdecl main(int argc, char **argv) { EnumState state; if (EnumDirectoryTree(TEXT("."), EnumState_Callback, &state) == FER_CONTINUE) { state.FinishDir(TEXT(".")); } return 0; }
Boy that sure was an awful lot of typing, and what's
worse, the whole structure of the program has
been obscured by the explicit state management.
It sure is hard to tell at a glance what this
chunk of code is trying to do.
Instead, you have to stare at the (Yes, I could have simplified this code a little by using a built-in stack class, but as I have already noted in the context of smart pointers, I try to present these articles in "pure" C++ so people won't get into arguments about which class library is best.)
Tomorrow, we'll look at how the world would be
if the function |
Comments (28)
Comments are closed. |
It may be the part of the standard library, but it’s still a library. Some people may like std::string, others might prefer CString, still others may prefer their own string library. (If everybody agreed on std::string then why are there so many string libraries?)
Perhaps if more people treated the standard library as a standard, more people would accept it as such. In any case, I don’t think too many people would lampoon you for using it.
I always find this sort of mix of c-like code and c++ code a bit jarring, but then I’m not really a windows programmer. Looking forward to tomorrow’s installment!
True, but if I chose a specific library (even the standard one), I would lose people who weren’t familiar with (or actively disliked) that library. But everybody who programs Windows knows what TCHAR is.
Maybe it’s because I’ve been reading a lot of LISP code lately, but I looked at this article and my first thought was "Why is Raymond posting randomly-indented assembler code on his blog?"
C++: Just Say AAAARRGHH OH GOD STOP AAAAARRRGGGHHH HELP PLEASE PLEASE PLEASE UUURKKK!
What does fibers have to do with this?
Drazen,
There are lots of reasons not to use std::string, and CString/CAtlString often provide an alternative to it.
Just because STL’s a part of the C++ standard library doesn’t mean that it is a worthy solution for everyone – the code size bloat you get from using STL is enough to preclude its use in many scenarios (like most of the Windows core). Also the fact that STL’s localization support is mediochre (it has to be because it’s platform neutral, and localization basically requires a platform specific infrastructure (how you specify your localizable resources depends on the platform)) precludes its use in many places.
Noob, this is a multi-part lecture. I’m sure Rand will get to that. Part 1 is setting a stage.
Noob,
Because state is easily held on the CPU stack. A fiber is a bit like a thread in that it has its own stack but the context switching is driven by the application rather than the scheduler.
So the idea here is to have a tasking module without the tasking overhead (or part of it, stack space is overhead): two "fibers", one a consumer, one a producer.
Think of fibers as setjmp()/longjmp() with separate stacks.
IIRC, there is a "fiber like" construct in later versions of VMS. I wonder if they are a Cutler decree or just a common good idea.
Speaking as the owner of a technology (namely, linguistic collation) and a colleague to the owner of a technology (namely encoding / codepages) and a former owner of a technology (namely, locale based formatting) which is superior in almost every way to the correponding CRT analogue, I can tell you that there are indeed times that the CRT is not always the best solution. And there do exist many times that it is not the answer….
This is OT and for that I apologise . . .
Does anyone know where I can find information about the next standard version of C++? I remember hearing about this somewhere but have no idea what it is codenamed etcetera.
Nice article Raymond (as per usual).
std::auto_ptr is a perfect example of something that just because it is the standard doesn’t mean it is good.
Kent:
You can find some links about the next C++ standard here: http://www.gotw.ca/iso/
And also check out Herb Sutter’s blog at http://www.pluralsight.com/blogs/hsutter/default.aspx
Happy new year everyone :)
Perhaps true, but when I see std:: stuff I tend to get more confused rather than less.
This time, we’ll watch the enumerator lose.
The std:: stuff assumes you are comfortable with exceptions. I personally believe that there are unsolvable quality problems associated with use of exceptions. Thus, even though the library seems very useful, I stay as far away from the STL libraries as possible.
C and C++ (pre-STL) were systems programming languages which provided you with a syntax for using the underlying machine. The libraries provided were really just tokens to portability.
With the introduction of STL, C++ is moving/has moved out of the systems programming space and is trying to seem like an application programming language like BASIC. I personally don’t see the point. If I wanted to use BASIC, with all its plusses and minuses, I’d use BASIC.
"an application programming language like BASIC"
I see Microsoft doesn’t do any drug testing…
Raymond: You need to cast the ULONGLONGs to unsigned __int64 when you pass it in to printf and you need the format to be %I64u because %I64d is the signed version.
is part of the software industry. if you do not learn to learn (ya ya ya) you will be out of a job as fast as you can say "hello world!"
CSci 101 Part I
A while back there was an article in MSDN magazine about wrapping up the unmaged fibers API to implement…
PingBack from http://www.labraaten.com/wpblog/2007/11/29/mimic-the-c-yield-instruction-in-vc/