Using fibers to simplify enumerators, part 5: Composition

Date:January 4, 2005 / year-entry #2
Tags:code
Orig Link:https://blogs.msdn.microsoft.com/oldnewthing/20050104-00/?p=36813
Comments:    11
Summary:Another type of higher-order enumeration is composition, where one enumerator actually combines the results of multiple enumerators. (Everybody knows about derivation, but composition is another powerful concept in object-oriented programming. We've seen it before when building context menus.) In a producer-driven enumerator, you would implement composition by calling the two enumeration functions one after the...

Another type of higher-order enumeration is composition, where one enumerator actually combines the results of multiple enumerators. (Everybody knows about derivation, but composition is another powerful concept in object-oriented programming. We've seen it before when building context menus.)

In a producer-driven enumerator, you would implement composition by calling the two enumeration functions one after the other. In a consumer-driven enumerator, you would implement composition by wrapping the two enumerators inside a large enumerator which then chooses between the two based on which enumerator was currently active.

A fiber-based enumerator behaves more like a consumer-driven enumerator, again, with easier state management.

Let's write a composite enumerator that enumerates everything in the root of your C: drive (no subdirectories), plus everything in the current directory (including subdirectories).

class CompositeEnumerator : public FiberEnumerator {
public:
 CompositeEnumerator()
   : m_eFiltered(TEXT("C:\\"))
   , m_eCd(TEXT(".")) { }

 LPCTSTR GetCurDir()
    { return m_peCur->GetCurDir(); }
 LPCTSTR GetCurPath()
    { return m_peCur->GetCurPath(); }
 const WIN32_FIND_DATA* GetCurFindData()
    { return m_peCur->GetCurFindData(); }

private:
 void FiberProc();

private:
 FiberEnumerator* m_peCur;
 FilteredEnumerator m_eFiltered;
 DirectoryTreeEnumerator m_eCd;
};

void CompositeEnumerator::FiberProc()
{
 FEFOUND fef;

 m_peCur = &m_eFiltered;
 while ((fef = m_peCur->Next()) != FEF_DONE &&
        fef != FEF_LEAVEDIR) {
  m_peCur->SetResult(Produce(fef));
 }

 m_peCur = &m_eCd;
 while ((fef = m_peCur->Next()) != FEF_DONE) {
  m_peCur->SetResult(Produce(fef));
 }
}

Sidebar: Our composite enumeration is complicated by the fact that our FilteredEnumerator spits out a FEF_LEAVEDIR at the end, but which we want to suppress, so we have to check for it and eat it.

In the more common case where the enumerator is generating a flat list, it would be a simple matter of just forwarding the two enumerators one after the other. Something like this:

void CompositeEnumerator2::FiberProc()
{
 Enum(&m_eFiltered);
 Enum(&m_eCd);
}

void CompositeEnumerator2::Enum(FiberEnumerator *pe)
{
 m_peCur = pe;
 FEFOUND fef;
 while ((fef = m_peCur->Next()) != FEF_DONE) {
  m_peCur->SetResult(Produce(fef));
 }
}

End sidebar.

You can try out this CompositeEnumerator with the program you've been playing with for the past few days. Just change the line in main that creates the enumerator to the following:

 CompositeEnumerator e;

Exercise: Gosh, why is the total so unusually large?

Exercise: How many fibers are there in the program?

Exercise: Draw a diagram showing how control flows among the various fibers in this program.

Before you get all excited about fibers, consider the following:

  • Converting a thread to a fiber needs to be coordinated among all the components in the process so that it is converted only once and stays converted until everybody is finished. This means that if you are writing a plug-in that will go into some other process, you probably should avoid fibers, since you don't know what the other components in the process are going to do with fibers.

  • Fibers do not completely solve the one-thread-per-connection problem. They do reduce the context switching, but the memory footprint will still limit you to 2000 fibers per process (assuming a 2GB user-mode address space) since each fiber has a stack, which defaults to 1MB.

I think that's enough about fibers for now.


Comments (11)
  1. M Knight says:

    CreateFiberEx is what you would use if you dont want to give the fiber the default stack size. However, it doesnt look to be able to set the stack size for the 1st initial fiber you create.

  2. happy :-) says:

    Finally this boring series is over.

  3. rebuttal says:

    happy :-) you don’t have to read Raymond’s blog if you don’t want to, besides this boring series has some very good info in it that could be useful to many, many people that are happy to get it not happy to see it over with.

  4. mschaef says:

    This is one of the more useful series of posts lately. I’m thinking of implementing threads in a lisp interpreter and need precise control over the execution of threads (primarily to avoid having threads stomp all over my GC). It seems like fibers might be a good fit.

  5. Dave says:

    raymond, thanks for the info on fibers, this was a really informative series.

  6. Vorn says:

    What exactly do you mean by the "one-thread-per-connection problem"? Connection to what?

    Vorn

  7. Raymond Chen says:

    A quick google for <one thread per connection> yields many hits, among them http://www.cim.mcgill.ca/~franco/OpSys-304-427/lecture-notes/node68.html

  8. Vorn says:

    Oh that. Yes, I’ve run into that problem.

    (one of my first big projects was an IRCd, and I wrote a threadpool thingy for it. Threadpools are easy like whoa in Python, in particular because it’s got queue implementations that come with it that are designed specifically to be nice to threadpools)

    The link there doesn’t seem to like me, though.

    Vorn

  9. Cooney says:

    Isn’t the standard solution (when this one hits its limit) to use a fixed size pool of worker threads and have another thread dispatch work to them as available? This usually requires that you treat each connection like a state machine and also works best when the individual activities are fairly fine grained, but I haven’t seen anything that works better yet.

    As far as fibers are concerned, they’d probably work great with CreateFiberEx – 1MB is entirely more stack than you need most of the time, and 64k would easily allow 10k connections.

  10. Vorn says:

    It’s not necessarily another thread dispatching. It’s more often a data structure that makes threads wait until there’s something to do. Check out Python’s PriorityQueue module for an example.

    Vorn

  11. A while back there was an article in MSDN magazine about wrapping up the unmaged fibers API to implement…

Comments are closed.


*DISCLAIMER: I DO NOT OWN THIS CONTENT. If you are the owner and would like it removed, please contact me. The content herein is an archived reproduction of entries from Raymond Chen's "Old New Thing" Blog (most recent link is here). It may have slight formatting modifications for consistency and to improve readability.

WHY DID I DUPLICATE THIS CONTENT HERE? Let me first say this site has never had anything to sell and has never shown ads of any kind. I have nothing monetarily to gain by duplicating content here. Because I had made my own local copy of this content throughout the years, for ease of using tools like grep, I decided to put it online after I discovered some of the original content previously and publicly available, had disappeared approximately early to mid 2019. At the same time, I present the content in an easily accessible theme-agnostic way.

The information provided by Raymond's blog is, for all practical purposes, more authoritative on Windows Development than Microsoft's own MSDN documentation and should be considered supplemental reading to that documentation. The wealth of missing details provided by this blog that Microsoft could not or did not document about Windows over the years is vital enough, many would agree an online "backup" of these details is a necessary endeavor. Specifics include:

<-- Back to Old New Thing Archive Index