The
180 in front of each symbol name indicates that the symbol (for
instance, _DumpCAP@0) can be found in an OBJ file beginning 0x180 bytes
into the library. As you can see, PENTER.LIB only has one OBJ in it.
More complicated LIB files will have multiple OBJs, so the offsets
preceding the symbol names will be different.
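To make the symbol-to-offset mapping concrete, here is a minimal Python sketch that builds a tiny in-memory library directory and reads it back. It assumes the standard COFF archive layout (an "!<arch>" signature, 60-byte ASCII member headers, and a first member named "/" holding a big-endian symbol count, an array of member offsets, and NUL-terminated names). The helper names and the second symbol are hypothetical, and the synthetic library contains only the directory, not a real OBJ at offset 0x180.

```python
import struct

ARCHIVE_SIGNATURE = b"!<arch>\n"
MEMBER_HEADER_SIZE = 60  # fixed-size ASCII header in front of every member


def make_member(name: bytes, payload: bytes) -> bytes:
    """Build one archive member: a 60-byte ASCII header plus the payload."""
    header = (
        name.ljust(16)                            # member name
        + b"0".ljust(12)                          # date
        + b"".ljust(6)                            # user ID
        + b"".ljust(6)                            # group ID
        + b"0".ljust(8)                           # mode
        + str(len(payload)).encode().ljust(10)    # payload size, decimal
        + b"`\n"                                  # end-of-header marker
    )
    assert len(header) == MEMBER_HEADER_SIZE
    # Members start on even offsets, so odd-sized payloads get one pad byte.
    return header + payload + (b"\n" if len(payload) % 2 else b"")


def make_symbol_directory(entries):
    """entries: list of (symbol_name, member_offset) pairs."""
    payload = struct.pack(">I", len(entries))
    payload += b"".join(struct.pack(">I", off) for _, off in entries)
    payload += b"".join(name.encode() + b"\0" for name, _ in entries)
    return make_member(b"/", payload)


def read_symbol_directory(lib: bytes):
    """Return [(member_offset, symbol_name), ...] from the first member."""
    if not lib.startswith(ARCHIVE_SIGNATURE):
        raise ValueError("not a COFF archive")
    start = len(ARCHIVE_SIGNATURE)
    header = lib[start:start + MEMBER_HEADER_SIZE]
    if header[:16].rstrip() != b"/":
        raise ValueError("first member is not a symbol directory")
    size = int(header[48:58].decode().strip())
    body_start = start + MEMBER_HEADER_SIZE
    body = lib[body_start:body_start + size]
    (count,) = struct.unpack_from(">I", body, 0)
    offsets = struct.unpack_from(f">{count}I", body, 4)
    names = body[4 + 4 * count:].split(b"\0")[:count]
    return [(off, name.decode()) for off, name in zip(offsets, names)]


# Both symbols point at the same OBJ, as in a one-OBJ library.
demo_lib = ARCHIVE_SIGNATURE + make_symbol_directory(
    [("_DumpCAP@0", 0x180), ("_HypotheticalSymbol@4", 0x180)]
)
for offset, symbol in read_symbol_directory(demo_lib):
    print(f"{offset:X} {symbol}")
```

Every entry pairs a symbol with the file offset of the member that defines it, which is why all the symbols from a single OBJ share one number.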
Unlike
OBJs passed on the command line, the linker does not have to include
every OBJ in a library into the final executable. Quite the opposite, in
fact. The linker won't include any code or data from a library OBJ
unless there's a reference to at least one symbol from that OBJ. Put
another way, explicitly named OBJs on the linker command line fly first
class, and are always included in the executable. OBJs from LIB files
fly standby, and are only included in the executable if referenced.
A
symbol in a library can be referenced (and hence, its OBJ included) in
two ways. First, there can be a direct reference to a symbol from one
of the explicit OBJ files. For example, if I were to call the C printf
function from a source file I wrote, there would be a reference (and a
fixup) generated for it in my OBJ file. When creating the executable,
the linker would search its LIB files for the OBJ containing the printf
code, and include the OBJ it finds.
Second,
there can be an indirect reference. Indirect means an OBJ included via
the first method contains references to symbols in yet another OBJ file
in the library. This second OBJ may in turn reference symbols in a third
OBJ file in the library. One of the linker's toughest jobs is to track
down and include every OBJ that has a referenced symbol, even if that
symbol is located via 49 levels of indirection.
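The direct and indirect cases together amount to a worklist walk over unresolved references. The following Python sketch is only a model of that idea, not the linker's actual implementation; the OBJ names and the symbols other than printf are hypothetical, and each OBJ is reduced to a pair of sets (symbols it defines, symbols it references).

```python
from collections import deque


def select_objs(explicit, library):
    """Decide which OBJs end up in the executable.

    explicit, library: {obj_name: (defined_symbols, referenced_symbols)}
    Returns (included OBJ names in inclusion order, unresolved symbols).
    """
    included = dict(explicit)  # explicit OBJs are always included
    defined = set()
    for defs, _ in explicit.values():
        defined |= defs

    # Map each library symbol to the OBJ that defines it.
    provider = {sym: name
                for name, (defs, _) in library.items()
                for sym in defs}

    pending = deque(sym
                    for _, refs in explicit.values()
                    for sym in sorted(refs))
    unresolved = set()
    while pending:
        sym = pending.popleft()
        if sym in defined:
            continue  # already satisfied by an included OBJ
        name = provider.get(sym)
        if name is None:
            unresolved.add(sym)
            continue
        defs, refs = library[name]
        included[name] = library[name]  # pull the whole OBJ in
        defined |= defs
        pending.extend(sorted(refs))    # its references may pull in more
    return list(included), sorted(unresolved)


explicit = {"main.obj": ({"_main"}, {"_printf"})}
library = {
    "printf.obj": ({"_printf"}, {"__output"}),
    "output.obj": ({"__output"}, {"__write"}),
    "write.obj": ({"__write"}, set()),
    "unused.obj": ({"_never_called"}, set()),
}
objs, missing = select_objs(explicit, library)
print(objs)     # unused.obj is absent: nothing referenced it
print(missing)
```

Here main.obj references printf directly, while output.obj and write.obj come along only through the chain of indirect references.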
When
looking for a symbol, the linker searches the LIB files in the order it
encountered them on the command line. However, once a symbol is found
in a library, that library becomes the preferred library, and is given
first crack at all future symbols. The library loses its favored status
once a symbol isn't found in the library. In this case, the next library
in the linker list is searched. (For a more technically detailed
description, see the Microsoft Knowledge Base article Q31998.)
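The preferred-library rule is easier to see in a small model. This Python sketch is a simplification built only from the description above: the library names and symbols are hypothetical, and the wrap-around back to the start of the list after the last library is my assumption, not something the description spells out.

```python
def make_resolver(libraries):
    """libraries: list of (lib_name, set_of_symbols) in command-line order.

    Returns a function that maps a symbol to the library that supplies it,
    or None if no library defines the symbol.
    """
    state = {"preferred": 0}  # index of the library that gets first crack

    def resolve(symbol):
        count = len(libraries)
        for step in range(count):
            # Start at the preferred library, then try the following ones
            # (wrapping around is an assumption of this sketch).
            index = (state["preferred"] + step) % count
            name, symbols = libraries[index]
            if symbol in symbols:
                state["preferred"] = index  # this library is now favored
                return name
        return None

    return resolve


resolve = make_resolver([
    ("A.LIB", {"_foo", "_dup"}),
    ("B.LIB", {"_bar", "_dup"}),
])
print(resolve("_dup"))  # A.LIB is searched first and has it
print(resolve("_bar"))  # A.LIB misses, so B.LIB becomes preferred
print(resolve("_dup"))  # now B.LIB gets first crack at the same symbol
```

The same symbol can resolve to different libraries depending on what was looked up before it, which is why library order on the command line matters when two libraries define the same name.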
Let's
now address the issue of import libraries. Structurally, import
libraries are no different than regular libraries. When resolving
symbols, the linker doesn't know the difference between an import
library and a regular library. The key difference is that there's no
compilation unit (for example, source file) that corresponds to each OBJ
in the import library. Instead, the linker itself produces the import
library, based upon the symbols that are exported from an executable
being built. Put another way, when the linker creates the exports table
in an executable, it also creates the corresponding import library to
reference those symbols. This point leads nicely to my next topic, the
imports table.
Creating the Imports Table
One
of the most fundamental features that Win32 rests upon is the ability
to import functions from other executables. All of the information about
the imported DLLs and functions resides in a table in the executable
known as the imports table. When it's in a section all by itself, this
section is named .idata.
Since
imports are so vital to Win32 executables, it may seem strange that the
linker doesn't have any special knowledge of import tables. Put another
way, the linker doesn't know or care whether a function you've called
resides in another DLL, or within the same executable. The way that this
is accomplished is all very clever. By simply following the section
combining and symbol resolution rules described above, the linker
creates the imports table, seemingly unaware of the special significance
of the table.
Let's look at some fragments from an import library to see how the linker accomplishes this feat. Figure 2
shows portions of the DUMPBIN output for the USER32.LIB import library.
Pretend that you've called the ActivateKeyboardLayout API. A fixup record
for _ActivateKeyboardLayout@8 can be found in your OBJ file. From the
USER32.LIB header, the linker determines that this function can be found
in the OBJ at offset 0xEA14 in the file. At this point, the linker is
committed to including the contents of this OBJ in the finished
executable (see Figure 3).
From Figure 3,
you can see that a variety of sections from the OBJ will be brought in,
including .text, .idata$5, .idata$4, and .idata$6. In the text section
is a JMP instruction (the 0xFF 0x25 opcode). From the COFF symbol table
at the end of Figure 3,
you can see that _ActivateKeyboardLayout@8 resolves to this JMP
instruction in the .text section. Thus, the linker hooks up your CALL to
ActivateKeyboardLayout to the JMP instruction in the .text section of
the import library's OBJ.
The
linker combines the .idata$XXX sections into a single .idata section in
the executable. Now recall that the linker has to follow the rule for
combining sections with a $ in their name. If there are other imported
functions brought in from USER32.LIB, their .idata$4, .idata$5 and
.idata$6 sections will also be thrown into the mix. The net result is
that all the .idata$4 sections create one array, while all the .idata$5
sections create another array. If you're familiar with the term "import
address table," this process is how that table is created.
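As a rough illustration of how the $ rule produces those arrays, here is a Python sketch that groups section contributions by the name before the $ and orders them by the part after it. The byte strings are placeholders rather than real import data, and MessageBoxA is simply a second USER32 function chosen for the example.

```python
def combine_sections(contributions):
    """contributions: list of (section_name, raw_bytes) in link order.

    Sections named base$suffix are merged into one section called base,
    ordered by suffix; ties keep their original link order.
    """
    groups = {}
    for order, (name, data) in enumerate(contributions):
        base, _, suffix = name.partition("$")
        groups.setdefault(base, []).append((suffix, order, data))
    return {
        base: b"".join(data for _, _, data in
                       sorted(parts, key=lambda part: part[:2]))
        for base, parts in groups.items()
    }


# Each imported function's OBJ contributes one piece to each subsection.
contributions = [
    (".idata$5", b"[5:AKL]"),
    (".idata$4", b"[4:AKL]"),
    (".idata$6", b"ActivateKeyboardLayout\0"),
    (".idata$5", b"[5:MB]"),
    (".idata$4", b"[4:MB]"),
    (".idata$6", b"MessageBoxA\0"),
]
print(combine_sections(contributions)[".idata"])
```

In the output, the two .idata$4 pieces sit next to each other, followed by the two .idata$5 pieces and then the names, so each subsection ends up as one contiguous array without the linker knowing what the arrays mean.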
Finally,
notice that the raw data for the .idata$6 section contains the string
ActivateKeyboardLayout. This is how the names of imported functions make
it into the imports table. The important point is that creating
the import table isn't a big deal for the linker. It's just doing its
job, following the rules I described earlier.
Creating the Exports Table
Besides
creating an import table for executables, a linker is also responsible
for creating the opposite: the exports table. Here, the linker's job is
both harder and easier. In pass one, the linker collects information
about all the exported symbols, builds the export table from it, and
writes that table to a section called .edata in an OBJ file.
This OBJ file is standard in all respects, except that it uses an
extension of .EXP rather than .OBJ. That's right, you can use DUMPBIN to
examine the contents of those EXP files that seem to accumulate in the
presence of DLLs that you build.
During
its second pass, the linker's job is almost trivial. It simply treats
the EXP as a regular OBJ file. This in turn means that the .edata in the
OBJ will be included in the executable. Sure enough, if you see an
.edata section in an executable, it's the export table. These days,
though, finding an .edata section is increasingly rare. It seems that if
the executable uses the Win32 console or GUI subsystems, the linker
automatically merges the .edata section with the .rdata section, if one
is present.
Wrap Up
Obviously,
a linker has many more jobs than I've described here. For example,
producing certain types of debug information (such as CodeView info) is a
major piece of a linker's total work. However, creating debug
information isn't an absolutely mandatory job for the linker, so I
haven't spent any time describing it. Likewise, a linker should be able
to create a MAP file listing the public symbols that were included in
the executable, but again it's not a mandatory function of a linker.
While
I've covered a lot of complex ground, at its heart a linker is simply a
tool for combining multiple compilation units into a functioning
executable. The first cornerstone is in combining sections; the second
is in resolving references (fixups) between the combined sections. Throw
in a dash of knowledge about system-specific data structures such as
the exports table, and you've covered the basics of this powerful and
essential tool.
Have a question about programming in Windows? Send it to Matt at mpietrek@tiac.com