At a recent lunch troika of MSJ
columnists (Paul DiLascia, John Robbins, and me), we were commenting on
how so few of today's programmers are skilled in what was essential
knowledge just a few years ago. For instance, we all agreed that many
programmers lack even a basic understanding of assembly language. In the
idealized world presented by most language vendors, coding is so easy
that there are no bugs to speak of. And if there ever was a bug, you'd
certainly be able to find it easily. No need to resort to messy
instruction-by-instruction code slogging, no sir.
Contrast
that utopian vision with your own experience. How many times have you
been in your debugger stepping through somebody else's code in assembly
language because there's no source available? This is especially
annoying when some third-party component blows up and you're assigned to
track down the problem. Even when debugging your own code, knowing a
little assembly language can help you figure out why your high-level
language code isn't working the way you think it should. Just put the
debugger into mixed source/assembly mode and observe how the compiler
translated your code into machine instructions.
Paul
DiLascia observed that there's a big difference between programming in
assembler and knowing just enough to get by in a pinch while debugging.
He jokingly suggested an "assembly language survival guide" that would
cover just enough to debug the most common situations. Sounds like a
darn good idea to me, so this column presents "Matt's Just Enough
Assembly Language to Get By." Think of it as a cram course in Intel x86
assembly language, with all of the esoteric stuff omitted. Afterward,
I'll show the assembler code for a typical procedure, and show how its
operations can be inferred from the instructions I've covered.
Before
jumping into the various instructions and instruction sequences, let me
add a couple of prefaces and warnings. First, I'm going to describe
only 32-bit Intel code. If you're still stuck programming in 16-bit
land, my sympathies. Second, different compilers from different vendors
generate different code. However, what I describe here should apply to
all compilers (including Visual Basic® 5.0 when generating native code).
Third,
don't be surprised if you encounter instructions and instruction
sequences that aren't mentioned below. Most compilers use only a small
fraction of the instruction set available to them (at least on the Intel
platform). But many compilers support inlining of raw assembly
language. This allows assembly language gurus to use CPU instructions
that the compiler isn't aware of. An inline assembler may be used to
optimize a particular sequence, or it may be used to get at CPU-specific
instructions such as the timers available on Pentium-class CPUs. In
addition to inline assembly code, don't forget that programmers
sometimes write entire source modules in assembly language—hard to
believe, isn't it?
Just
as most 32-bit compilers use only a small fraction of the available
instructions, they also use only a subset of the registers of the CPU.
Since so much of what I'll describe depends on the registers, a quick
review of the commonly used Intel x86 register set is in order. In Figure 1,
all registers are 32 bits except where noted. "Multipurpose" means the
register can hold any arbitrary 32-bit value (for example, literal
values, addresses, and bit flags).
In
addition to being familiar with the registers, it's essential to
understand how instruction arguments are used. With the exception of a
few obscure cases, all instructions take zero, one, or two arguments.
Instructions that take zero or one arguments don't require explanation.
For instructions that take two arguments, the first argument is usually
the destination, while the second is the source. For example, the "ADD
EAX,ESI" instruction adds the contents of ESI (the source) to EAX.
The result is stored in EAX (the destination). Put another way, the
first argument is the one that's modified as a result of the
instruction.
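The destination-first convention can be sketched in C++ like this. The Registers struct and the add/mov helpers are hypothetical stand-ins for the CPU, made up purely for illustration; they are not part of any real API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of two x86 registers, for illustration only.
struct Registers {
    std::uint32_t eax = 0;
    std::uint32_t esi = 0;
};

// Models "ADD dst,src": the first operand is both an input and the result.
inline void add(std::uint32_t& dst, std::uint32_t src) { dst += src; }

// Models "MOV dst,src": the first operand receives a copy of the second.
inline void mov(std::uint32_t& dst, std::uint32_t src) { dst = src; }

// Register state after MOV EAX,10 / MOV ESI,32 / ADD EAX,ESI (decimal values).
inline Registers addExample() {
    Registers r;
    mov(r.eax, 10);
    mov(r.esi, 32);
    add(r.eax, r.esi);   // EAX is modified; ESI is only read
    return r;
}

inline void checkAddExample() {
    Registers r = addExample();
    assert(r.eax == 42);   // destination holds the sum
    assert(r.esi == 32);   // source is unchanged
}
```

Passing the destination by reference and the source by value mirrors the rule of thumb above: only the first argument changes.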
A
basic knowledge of how instructions reference memory is also vital. Some
instructions implicitly reference memory. For example, PUSH EAX pushes
the current value of the EAX register onto the stack. Where's the stack?
It's whatever the ESP register is currently pointing to. Likewise,
string instructions like SCASB and MOVSB require that the ESI and/or EDI registers
contain the address of the memory location you want to use.
Other
instructions use arguments to explicitly state the address to be used.
You can usually tell this by the presence of square brackets in the
instruction. For example, "MOV EBX,[00401234]" reads from the address
0x00401234. Another form of addressing uses registers and possibly
offsets. For example, in "MOV EBX,[ECX]", the ECX register contains an
address (also known as a pointer by C++ users). The instruction "MOV
EBX,[EBP+8]" reads from the address calculated by adding 8 to the
contents of the EBP register.
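To make the three addressing forms concrete, here is a minimal C++ sketch. The Memory struct is a hypothetical toy address space of 64 bytes, and the addresses 16 and 40 are arbitrary values chosen for the example; real addresses look like 00401234.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical flat memory, standing in for the process address space.
struct Memory {
    std::uint8_t bytes[64] = {};

    // Reads a 32-bit value at 'address', like the [..] part of an operand.
    std::uint32_t load32(std::uint32_t address) const {
        std::uint32_t value;
        std::memcpy(&value, &bytes[address], sizeof value);
        return value;
    }

    // Writes a 32-bit value at 'address'.
    void store32(std::uint32_t address, std::uint32_t value) {
        std::memcpy(&bytes[address], &value, sizeof value);
    }
};

inline void addressingDemo() {
    Memory mem;
    mem.store32(16, 111);    // pretend address 16 holds a global variable
    mem.store32(40, 222);    // pretend address 40 is what [EBP+8] refers to

    std::uint32_t ecx = 16;  // ECX holds an address (a pointer)
    std::uint32_t ebp = 32;  // EBP holds the frame base

    std::uint32_t ebx = mem.load32(16);   // like MOV EBX,[00000010]: absolute address
    assert(ebx == 111);
    ebx = mem.load32(ecx);                // like MOV EBX,[ECX]: register holds the address
    assert(ebx == 111);
    ebx = mem.load32(ebp + 8);            // like MOV EBX,[EBP+8]: register plus offset
    assert(ebx == 222);
}
```

In all three cases the square brackets mean the same thing: compute an address, then read the DWORD stored there.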
Intel
CPUs have a very formal definition for allowable forms of instruction
addresses. It's complex enough to make most people's heads swim. If you
know what a modR/M byte is, or know how S-I-B addressing works, then you
already know more than this column can teach you. In the "Just Enough
to Get By" guide, the preceding paragraph should be enough.
With
the theory part over with, let's now look at the most common
instructions and instruction sequences. I've grouped them into several
categories rather than sorting them alphabetically. As you'll see, some
instructions are used in multiple categories.
Procedure Entry and Exit
These
instructions are automatically inserted by the compiler to create a
standard method for accessing parameters and local variables. This
method is called a stack frame, as in "frame of reference." In fact, the
Intel CPU dedicates the EBP register to maintaining a stack frame. For
this group of instructions, it's especially important to note that not
every procedure will use exactly the same sequence, and that certain
things may be omitted entirely.
Sequence PUSH EBP / MOV EBP,ESP / SUB ESP,XX
Purpose Sets up the EBP stack frame for a new procedure
Examples
PUSH EBP
MOV EBP, ESP
SUB ESP, 24
Description
"PUSH EBP" saves the previous frame pointer on the stack. "MOV EBP,ESP"
sets the EBP register to the same value as the stack pointer (ESP).
"SUB ESP,XX" creates space for local variables below the EBP frame.
In
optimized code, you may see this sequence interspersed with other
instructions (for example, "PUSH ESI"). Since "PUSH EBP" and "MOV
EBP,ESP" both use the EBP register, a processor with multiple pipelines
would ordinarily need to stall one of the pipelines. By interspersing
other instructions that don't use the EBP register, the processor can do
more work in the same amount of time.
Instruction ENTER
Purpose Sets up the EBP stack frame for a new procedure
Examples
ENTER 8, 0 ; Sets up stack frame with
; 8 bytes of local variables
Description
The ENTER instruction first became available on the 80186 processor. It
was intended to replace the "PUSH EBP / MOV EBP,ESP / SUB ESP,XX"
sequence with a single, smaller instruction. On current processors the
ENTER instruction is slower than the three-instruction sequence, so
ENTER is rarely used.
Sequence MOV ESP,EBP / POP EBP
Purpose Removes the EBP stack frame before leaving a procedure
Description
The "MOV ESP,EBP" instruction bumps up the stack pointer past any space
allocated for local variables on the stack. "POP EBP" restores the
stack frame pointer to point at the previous EBP frame. This sequence is
normally followed by a return instruction to return control to the
calling procedure.
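If it helps to see the frame setup and teardown as bookkeeping, the following C++ sketch simulates both sequences. MiniCpu is a hypothetical model with a 256-byte stack; the starting values (ESP of 256, a caller EBP of 200, 24 bytes of locals) are arbitrary assumptions for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical miniature CPU: a small stack plus the ESP and EBP registers.
struct MiniCpu {
    std::uint8_t  stack[256] = {};
    std::uint32_t esp = 256;   // the stack grows toward lower addresses
    std::uint32_t ebp = 0;

    void push(std::uint32_t value) {           // PUSH value
        esp -= 4;
        std::memcpy(&stack[esp], &value, 4);
    }
    std::uint32_t pop() {                      // POP into a register
        std::uint32_t value;
        std::memcpy(&value, &stack[esp], 4);
        esp += 4;
        return value;
    }
    void prologue(std::uint32_t localBytes) {
        push(ebp);           // PUSH EBP
        ebp = esp;           // MOV EBP,ESP
        esp -= localBytes;   // SUB ESP,XX
    }
    void epilogue() {
        esp = ebp;           // MOV ESP,EBP
        ebp = pop();         // POP EBP
    }
};

inline void frameDemo() {
    MiniCpu cpu;
    cpu.ebp = 200;                        // the caller's frame pointer
    std::uint32_t espOnEntry = cpu.esp;

    cpu.prologue(24);
    assert(cpu.ebp == espOnEntry - 4);    // EBP points at the saved EBP
    assert(cpu.esp == cpu.ebp - 24);      // locals live below EBP

    cpu.epilogue();
    assert(cpu.esp == espOnEntry);        // stack pointer is back where it started
    assert(cpu.ebp == 200);               // caller's frame pointer is restored
}
```

Notice that the epilogue never needs to know how many bytes of locals were allocated; copying EBP back into ESP discards them all at once.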
Instruction LEAVE
Purpose Removes the EBP stack frame before leaving a procedure
Description
The LEAVE instruction is the inverse of the ENTER instruction. It can
also be used to remove a frame set up by the "PUSH EBP / MOV EBP,ESP"
sequence. The LEAVE instruction is only 1 byte long, which is shorter
than the "MOV ESP,EBP / POP EBP" sequence. Unlike the ENTER
instruction, there's no performance penalty for using it, so some
compilers use LEAVE.
Instruction PUSH register
Purpose Saves the previous values of register variables
Examples
PUSH EBX
PUSH ESI
PUSH EDI
Description
Sometimes compilers use a general-purpose register to hold the value of
parameters or local variables. This can be more efficient than storing
the same value in memory. These are commonly known as register
variables. The EBX, ESI, and EDI registers are most often used as
register variables.
The
convention most compilers use is that register variable values are
preserved across procedure calls. If the compiler decides to use
register variables in a procedure, it is responsible for preserving the
value of the registers that it alters (typically, EBX, ESI, and EDI).
Typically, compilers preserve these register values on the stack as part
of setting up the procedure's stack frame. If the compiler uses only
one or two of the aforementioned registers, it needs to preserve only
those registers.
Instruction POP register
Purpose Restores the previous values of register variables
Examples
POP EDI
POP ESI
POP EBX
Description
In preparing to return from a procedure, the register variable
registers need to be restored to their previous values. These
instructions remove a value from the stack and place it into the
designated register.
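The PUSH and POP examples above are listed in opposite orders for a reason: the stack is last-in, first-out. A small C++ sketch of that pairing follows; SavedRegs and saveAndRestore are hypothetical names, and a std::vector stands in for the thread's stack.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical snapshot of the three usual register variables.
struct SavedRegs {
    std::uint32_t ebx, esi, edi;
};

// Saves the registers, lets the "procedure body" overwrite them,
// then restores them in the reverse order of the pushes.
inline SavedRegs saveAndRestore(SavedRegs regs) {
    std::vector<std::uint32_t> stack;
    stack.push_back(regs.ebx);   // PUSH EBX
    stack.push_back(regs.esi);   // PUSH ESI
    stack.push_back(regs.edi);   // PUSH EDI

    regs = {0, 0, 0};            // the procedure body uses the registers freely

    regs.edi = stack.back(); stack.pop_back();   // POP EDI
    regs.esi = stack.back(); stack.pop_back();   // POP ESI
    regs.ebx = stack.back(); stack.pop_back();   // POP EBX
    return regs;
}

inline void saveRestoreDemo() {
    SavedRegs r = saveAndRestore({1, 2, 3});
    assert(r.ebx == 1 && r.esi == 2 && r.edi == 3);
}
```

If the POPs were issued in the same order as the PUSHes, EBX and EDI would come back swapped, which is a handy thing to check when hand-written assembly misbehaves.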
Accessing Variables
The
Intel CPU has many instructions that work with variables, which are just
locations in memory. For example, you can add or subtract from a
variable representing a counter. Likewise, a variable may contain a
pointer to something. There are just too many instructions to describe
here, and in most cases the instruction name gives a good clue about
what the instruction is doing. However, I will show how variables of
different storage classes appear in assembly language.
Instruction instruction [global]
Purpose Global/static variables
Examples
MOV EAX,[00401234]
MOV [00401238],ESI
PUSH [77852432]
ADD [00620428],00001000
Description
When you see an instruction that includes an actual machine address
inside the square brackets, it's accessing memory that was declared as
either a global or static variable. These addresses are known at program
load time, so the instruction contains the actual memory address to
read or write.
Instruction instruction [parameter]
Purpose Procedure parameters and this pointers
Examples
MOV ESI,[EBP+14]
MOV [ESP+30],EAX
ADD [EBP+0C],2
OR [ESP+20],00000010
Description
Parameters to procedures are usually passed on the thread's stack.
Since these values are pushed before the procedure call and before the
called procedure sets up its stack frame, the parameters appear at
positive offsets from the stack frame base pointer (EBP). Just about any
instruction that makes reference to memory above EBP (for example,
"[EBP+8]") is making use of a procedure parameter. The advantage of
using EBP for accessing parameters is that EBP doesn't change throughout
the lifetime of a procedure. This makes it easier to keep track of the
procedure's parameters.
Prior
to the 80386, the only effective way to access parameters was with the
base pointer register. The 386 added the ability to access memory just
as easily with displacements from the stack pointer (ESP) register.
Thus, optimized code can dispense with setting up an EBP frame and still
reference parameters by using positive offsets from ESP. For example,
"ADD [ESP+20],4" adds four to whatever DWORD is at [ESP+20]. From a
debugging standpoint, using ESP to access parameters is inconvenient.
Since ESP can change during a procedure, a given parameter may be at
different offsets from ESP at different points in a procedure's code.
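The moving-target problem is just arithmetic, as this C++ sketch shows. The espOffset helper is hypothetical, and the sketch assumes a frameless procedure whose first DWORD parameter sits at [ESP+4] on entry (right above the return address).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: where a parameter sits relative to ESP once
// 'bytesPushedSinceEntry' more bytes have gone onto the stack.
constexpr std::uint32_t espOffset(std::uint32_t offsetAtEntry,
                                  std::uint32_t bytesPushedSinceEntry) {
    return offsetAtEntry + bytesPushedSinceEntry;
}

inline void espOffsetDemo() {
    // On entry, [ESP] holds the return address, so the first
    // DWORD parameter is at [ESP+4].
    assert(espOffset(4, 0) == 4);
    // After PUSH ESI (4 bytes), the same parameter is at [ESP+8].
    assert(espOffset(4, 4) == 8);
    // After a further SUB ESP,10 (hex, so 16 bytes), it is at [ESP+18] hex.
    assert(espOffset(4, 4 + 0x10) == 0x18);
}
```

When reading ESP-based code, it pays to keep a running total of pushes and pops since the procedure's entry point so you can normalize each offset back to its entry value.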
One
last word on parameters. In C++, the this pointer of a member function
is really a hidden parameter. Usually the this pointer is the last
parameter pushed on the stack before the call. In Visual Basic, the
self-referential me is the same thing as the C++ this pointer.
Instruction instruction [local]
Purpose Local Variables
Examples
MOV ESI,[EBP-14]
MOV [EBP-30],EAX
SUB [ESP],2
AND [ESP+4],00000010
Description
From the vantage point of an assembly instruction, local variables
aren't much different than parameters when an EBP frame is used. The
only distinction is that local variables are at negative offsets from
the EBP stack frame. You can get an idea of how big the sandbox for
local variables will be by examining the "SUB ESP,XX" instruction near
the beginning of the procedure.
Things
do get messy when the compiler decides to omit an EBP frame. When this
happens, the compiler addresses both local variables and parameters as
positive offsets from the ESP register. There's no good way to tell a
local apart from a parameter in this situation except to find out how
much space the procedure has allocated for locals (see above). If the
offset is less than the space allocated, it's a local. Otherwise, it's
probably a parameter.
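Here's how the three storage classes look from the C++ side. The names g_counter and bump are hypothetical, and the operands in the comments show only the kind of addressing an unoptimized 32-bit build with an EBP frame might use; the actual offsets and addresses depend on the compiler.

```cpp
#include <cassert>

int g_counter = 0;             // global: addressed as [absolute address]

// Adds twice 'amount' to the global and returns how often it was called.
int bump(int amount)           // parameter: positive offset, e.g. [EBP+8]
{
    int doubled = amount * 2;  // local: negative offset, e.g. [EBP-4]
    static int calls = 0;      // static: absolute address, just like a global
    ++calls;
    g_counter += doubled;
    return calls;
}

inline void storageDemo() {
    int before = g_counter;
    bump(3);
    assert(g_counter == before + 6);
}
```

An optimizing compiler may keep doubled in a register and never give it a stack slot at all, so don't expect every local in the source to show up as an [EBP-XX] operand.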
Instruction LEA variable
Purpose Load Effective Address
Examples
LEA EAX,[ESP+14]
LEA EDX,[EBP-24]
Description
Despite the square brackets, LEA doesn't actually read memory or
dereference a pointer. Instead, it loads the first operand with an
address specified by the second operand. For example, "LEA
EAX,[ESP+14]" takes the current value of the ESP register, adds 14 to
it, and puts the result in EAX.
LEA's
primary use is to obtain the address of local variables and parameters.
For example, in C++, if you use the & operator on a local variable
or parameter, the compiler will likely generate an LEA instruction. As
another example, "LEA EAX,[EBP-8]" loads EAX with the address of the
local variable at EBP-8.
A less
obvious use of LEA is as a fast multiplication. For example,
multiplying a value by 5 is relatively expensive. Using "LEA
EAX,[EAX*4+EAX]" turns out to be faster than the MUL instruction. The
LEA instruction uses hardwired address generation tables that make
multiplying by a select set of numbers very fast (for example,
multiplying by 3, 5, and 9). Twisted, but true.
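Both uses of LEA can be illustrated in C++. The helper names below are hypothetical, and the instructions in the comments are what a compiler might plausibly emit, not guaranteed output.

```cpp
#include <cassert>
#include <cstdint>

// Multiplications written the way LEA computes them: base + index * scale,
// where the scale can only be 1, 2, 4, or 8.
constexpr std::uint32_t times3(std::uint32_t x) { return x + x * 2; }  // LEA EAX,[EAX*2+EAX]
constexpr std::uint32_t times5(std::uint32_t x) { return x + x * 4; }  // LEA EAX,[EAX*4+EAX]
constexpr std::uint32_t times9(std::uint32_t x) { return x + x * 8; }  // LEA EAX,[EAX*8+EAX]

// Taking the address of a local is the classic source of an LEA.
inline bool addressOfLocal() {
    int local = 7;
    int* p = &local;   // e.g. LEA EAX,[EBP-4]: computes the address, reads nothing
    *p = 8;            // a separate MOV through the pointer touches memory
    return local == 8;
}

inline void leaDemo() {
    assert(times3(7) == 21);
    assert(times5(7) == 35);
    assert(times9(7) == 63);
    assert(addressOfLocal());
}
```

The restriction on scale values is why 3, 5, and 9 are the magic multipliers: each is one of the allowed scales plus one.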
Calling Procedures
Instruction CALL location
Purpose Transfer control to another procedure
Examples
CALL 00682568
CALL [00401234]
CALL ESI
CALL [EAX+24]
Description
The CALL instruction doesn't need much explanation in itself. It pushes
the address of the instruction following it onto the stack, then
transfers control to the address given by the argument. The various ways
of specifying a target address are worth mentioning, however.
The
simplest form of the CALL instruction is when the argument contains the
destination address as an immediate value (for example, "CALL
00682568"). This type of call is almost always to another location
within the same module (EXE or DLL). Slightly more complicated is when
the CALL instruction indirects through an address (for example, "CALL
[00401234]"). You'll see this form of CALL instruction when calling a
function imported from another module. It's also seen when calling
through a function pointer stored in a global variable.
Two
other forms of CALL instruction use registers as part of their address.
If just a register name is specified (for example, "CALL ESI"), the CPU
transfers to whatever address is in the register. If a register is used
within brackets, perhaps with an additional displacement ("CALL
[EAX+24]"), the instruction is calling through a table of function
addresses. Where would these come from? You may know these tables by the
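more familiar name of vtables. In the preceding instruction example,
the table entry at index 6 is being called, which is the seventh member
function if you count from one. (24 divided by the size of a DWORD is 6.
If your debugger displays displacements in hex, 0x24 divided by 4 gives
index 9 instead.)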
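The four CALL forms map onto familiar C++ constructs. In this sketch, square, g_callback, Shape, and Triangle are hypothetical names, and the instructions in the comments are the forms an unoptimized build might produce; an optimizer is free to inline or devirtualize any of these calls.

```cpp
#include <cassert>

int square(int x) { return x * x; }

int (*g_callback)(int) = square;   // function pointer stored in a global

struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const { return 0; }
};

struct Triangle : Shape {
    int sides() const override { return 3; }
};

inline int callForms(const Shape& shape) {
    int a = square(3);             // direct call: CALL <address of square>
    int b = g_callback(4);         // through a global: CALL [address of g_callback]
    int (*local)(int) = square;
    int c = local(5);              // pointer held in a register: CALL ESI
    int d = shape.sides();         // virtual call: CALL [EAX+offset] via the vtable
    return a + b + c + d;
}

inline void callDemo() {
    Triangle t;
    assert(callForms(t) == 9 + 16 + 25 + 3);
}
```

For the virtual call, the compiler first loads the object's vtable pointer into a register, which is why the bracketed form uses a register plus a displacement rather than a fixed address.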
Instruction PUSH value
Purpose Places a parameter onto the stack in preparation for calling a procedure
Examples
PUSH [00405234] ; Push a global variable
PUSH [EBP+C] ; Push a parameter
PUSH [EBP-14] ; Push a local variable
PUSH EAX ; Push whatever is in EAX
PUSH 12345678 ; Push an immediate value.
Description
When it comes to passing parameters, all variations of the PUSH
instruction are used by the compiler. Global variables, local variables,
parameters, the results of a calculation, and immediate values can all
be passed with a single instruction. When you see a sequence of PUSH
instructions prior to a CALL instruction, the odds are good that the
PUSHes are putting the parameters onto the stack.
As
mentioned earlier, if a member function or method is being called, the
this or me pointer is usually passed last. In some cases, the this
pointer is passed in the ECX register instead. You can identify when
this occurs by looking for code that initializes the ECX register and
then does nothing with it before the CALL instruction.
Instruction RET
Purpose Return from a procedure call
Examples
RET
RET 8
Description
The RET instruction returns from a procedure call. It simply pops
whatever value is currently at [ESP] into the EIP (instruction pointer)
register. The "RET XX" form does the same thing, and then adds XX to the
ESP value. This is how __stdcall procedures clear parameters off the
stack before returning to their caller. (Most Win32®
APIs are __stdcall based.) By dividing the number of cleared bytes by
four (the size of a DWORD), you can usually figure out how many
parameters a procedure takes. For instance, a procedure that returns
with a "RET 8" instruction takes two parameters.
Functions
that return an integer or pointer value usually return the value in the
EAX register. By examining what's in EAX before executing the RET
instruction, you can see the function's return value.
Instruction ADD ESP, value
Purpose Removes parameters off the stack
Examples
ADD ESP,24
Description
When calling procedures that don't remove parameters before returning,
it's up to the calling function to remove its parameters. This is the
case with __cdecl functions; __cdecl is the default calling convention
for C and C++ code. The "ADD ESP,XX" instruction bumps up the stack
pointer so that any passed parameters are below the resulting ESP.
If the
function doesn't take a variable number of parameters, the "ADD ESP,XX"
instruction gives insight into how many parameters the called procedure
accepts. (See the description above for "RET XX".) If the called
procedure takes a variable number of parameters (like printf and
wsprintf do), the "ADD ESP,XX" instruction tells you how many parameters
were passed for that particular CALL.
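The byte-count-to-parameter-count rule applies equally to "RET XX" and "ADD ESP,XX". Here it is as a C++ sketch; dwordSlots is a hypothetical helper, and it assumes every parameter occupies exactly one DWORD, which isn't true for things like doubles or structures passed by value.

```cpp
#include <cassert>

// Hypothetical helper: number of DWORD-sized stack slots removed by
// "RET XX" (callee cleanup) or "ADD ESP,XX" (caller cleanup).
constexpr unsigned dwordSlots(unsigned cleanupBytes) {
    return cleanupBytes / 4;
}

inline void cleanupDemo() {
    assert(dwordSlots(8) == 2);      // RET 8: two parameters
    assert(dwordSlots(0x24) == 9);   // ADD ESP,24 if the debugger shows hex
    assert(dwordSlots(24) == 6);     // ADD ESP,24 if the value is decimal
}
```

Check which radix your debugger uses before doing the division; reading a hex 24 as decimal gives you six parameters instead of nine.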
Flow Control
In the
context of this column, flow control means code that affects which
portions of a program's code are subsequently executed. At the simplest
level, this means conditional execution (colloquially known as if
statements). More complex flow control sequences such as while loops and
for statements are usually built from the lower-level if statement
constructs. In one case though (the LOOP instruction), the processor has
built-in knowledge of these higher-level language constructs.
Before
I get to these instruction sequences, let me highlight two things that
can easily trip you up. For starters, the term "Jcc" is used as a
stand-in for any of the 16 conditional jump instructions. The cc means
condition code.
More
insidiously, there are several sets of Jcc instructions that are aliases
for one another. For example, JZ (Jump if Zero flag set) is the same
instruction as JE (Jump if Equal). Likewise, JNZ (Jump if Zero flag NOT
set) is the same instruction as JNE (Jump if Not Equal). Unfortunately,
some disassemblers use the JZ/JNZ form, while others use the JE/JNE
form. Is this confusing? Yes! The moral of the story: be prepared to
mentally substitute an aliased form of the instruction if it makes the
code easier to understand.
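One way to do that mental substitution consistently is with a lookup table. The C++ sketch below maps each alias to a single spelling; canonicalJcc is a hypothetical helper, and which name counts as "canonical" is an arbitrary choice made for this example (your disassembler may prefer the other one).

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Maps an alias mnemonic to one chosen spelling for the same instruction.
// Mnemonics that aren't in the table are returned unchanged.
inline std::string canonicalJcc(const std::string& mnemonic) {
    static const std::unordered_map<std::string, std::string> aliases = {
        {"JZ",   "JE"},  {"JNZ",  "JNE"},   // zero flag
        {"JNAE", "JB"},  {"JC",   "JB"},    // unsigned below / carry set
        {"JNB",  "JAE"}, {"JNC",  "JAE"},   // unsigned above-or-equal / carry clear
        {"JNA",  "JBE"}, {"JNBE", "JA"},    // unsigned below-or-equal, above
        {"JNGE", "JL"},  {"JNL",  "JGE"},   // signed less, greater-or-equal
        {"JNG",  "JLE"}, {"JNLE", "JG"},    // signed less-or-equal, greater
        {"JPE",  "JP"},  {"JPO",  "JNP"},   // parity
    };
    auto it = aliases.find(mnemonic);
    return it == aliases.end() ? mnemonic : it->second;
}

inline void aliasDemo() {
    assert(canonicalJcc("JZ") == "JE");
    assert(canonicalJcc("JNZ") == "JNE");
    assert(canonicalJcc("JE") == "JE");
}
```

As a rule of thumb, the "above" and "below" names refer to unsigned comparisons, while "greater" and "less" refer to signed ones.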
Sequence CMP value, value / Jcc location
Purpose Compare two values, and branch accordingly
Examples
CMP EAX,2
JE 10036728
CMP [EBP+20],1000
JNE 00427824
Description
The CMP instruction is used when two values are to be compared. The CMP
instruction sets or clears a variety of flags, including the Zero,
Sign, and Overflow flags. From this, a variety of Jcc instructions can
then be used to branch accordingly. Most often, the JE and JNE
instructions follow a CMP instruction.
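What CMP does to the flags can be modeled in a few lines of C++. This is only a sketch: the Flags struct and the helper functions are hypothetical, it covers 32-bit operands only, and it adds the Carry flag (not mentioned above) because the unsigned jumps depend on it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical subset of the flags that CMP updates.
struct Flags {
    bool zero, sign, carry, overflow;
};

// Models "CMP a,b": computes a - b, throws the result away, keeps the flags.
inline Flags cmp(std::uint32_t a, std::uint32_t b) {
    std::uint32_t result = a - b;      // wraps modulo 2^32, like the CPU
    Flags f;
    f.zero  = (result == 0);
    f.sign  = (result >> 31) != 0;     // top bit of the result
    f.carry = (a < b);                 // an unsigned borrow was needed
    // Signed overflow: the operands have different signs and the
    // result's sign differs from the first operand's sign.
    f.overflow = (((a ^ b) & (a ^ result)) >> 31) != 0;
    return f;
}

inline bool je(const Flags& f)  { return f.zero; }                // jump if equal
inline bool jne(const Flags& f) { return !f.zero; }               // jump if not equal
inline bool jl(const Flags& f)  { return f.sign != f.overflow; }  // signed less-than
inline bool jb(const Flags& f)  { return f.carry; }               // unsigned less-than

inline void cmpDemo() {
    assert(je(cmp(2, 2)));                  // CMP EAX,2 / JE is taken when EAX == 2
    assert(jne(cmp(3, 2)));
    assert(jl(cmp(0xFFFFFFFFu, 1)));        // -1 < 1 when treated as signed
    assert(!jb(cmp(0xFFFFFFFFu, 1)));       // but 4294967295 is not below 1
}
```

The last two assertions show why the signed and unsigned jumps exist as separate instructions: the same CMP yields different answers depending on how you interpret the bits.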
The following C++ code sequence would be implemented with a CMP / JNE sequence: