The history of calling conventions, part 5: amd64

Date: January 14, 2004 / year-entry #18
Tags: history
Orig Link: https://blogs.msdn.microsoft.com/oldnewthing/20040114-00/?p=41053
Comments: 32
The last architecture I'm going to cover in this series is the AMD64 architecture (also known as x86-64).

The AMD64 takes the traditional x86 and expands the registers to 64 bits, naming them rax, rbx, etc. It also adds eight more general-purpose registers, named simply r8 through r15.

  • The first four parameters to a function are passed in rcx, rdx, r8 and r9. Any further parameters are passed on the stack. Furthermore, stack space is reserved for the four register parameters as well, in case the called function wants to spill them; this is important if the function is variadic.

  • Parameters that are smaller than 64 bits are not zero-extended; the upper bits are garbage, so remember to zero them explicitly if you need to. Parameters that are larger than 64 bits are passed by address.

  • The return value is placed in rax. If the return value is larger than 64 bits, then a secret first parameter is passed which contains the address where the return value should be stored.

  • All registers must be preserved across the call, except for rax, rcx, rdx, r8, r9, r10, and r11, which are scratch.

  • The callee does not clean the stack. It is the caller's job to clean the stack.

  • The stack must be kept 16-byte aligned. Since the "call" instruction pushes an 8-byte return address, this means that every non-leaf function is going to adjust the stack by a value of the form 16n+8 in order to restore 16-byte alignment.
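
The rule about parameters narrower than 64 bits can be modeled in plain C. The following is only a sketch: a uint64_t stands in for a parameter register, and the helper names read_u32_param and read_i32_param are made up for illustration. The point it tries to show is that the callee looks only at the low 32 bits and widens the value itself.

```c
#include <stdint.h>

/* Hypothetical helper: models a callee receiving an unsigned 32-bit
   parameter in a 64-bit register whose upper half is unspecified. */
static uint32_t read_u32_param(uint64_t reg)
{
    return (uint32_t)reg;               /* keep bits 0..31, drop the rest */
}

/* Hypothetical helper: same idea for a signed 32-bit parameter that the
   callee wants as a 64-bit value. The sign comes from bit 31 only.
   (Converting an out-of-range value to int32_t is implementation-defined
   in older C standards; mainstream compilers wrap in two's complement.) */
static int64_t read_i32_param(uint64_t reg)
{
    return (int64_t)(int32_t)(uint32_t)reg;
}
```

In both helpers the upper 32 bits of the "register" have no effect on the result, which is what the convention requires of a well-behaved callee.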

Here's a sample:

void SomeFunction(int a, int b, int c, int d, int e);
void CallThatFunction()
{
    SomeFunction(1, 2, 3, 4, 5);
    SomeFunction(6, 7, 8, 9, 10);
}

On entry to CallThatFunction, the stack looks like this:

xxxxxxx0 .. rest of stack ..
xxxxxxx8 return address <- RSP

Due to the presence of the return address, the stack is misaligned. CallThatFunction sets up its stack frame, which might go like this:

    sub    rsp, 0x28

Notice that the local stack frame size is 16n+8, so that the result is a realigned stack.

xxxxxxx0 .. rest of stack ..
xxxxxxx8 return address
xxxxxxx0   (arg5)
xxxxxxx8   (arg4 spill)
xxxxxxx0   (arg3 spill)
xxxxxxx8   (arg2 spill)
xxxxxxx0   (arg1 spill) <- RSP
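
The arithmetic behind that 0x28 can be sketched in C. This is an illustration under stated assumptions, not how any particular compiler computes it: the helper name frame_bytes is hypothetical, and it assumes the prologue pushes no registers, so the "sub rsp" is the only stack adjustment.

```c
#include <stddef.h>

static const size_t SLOT_BYTES  = 8;   /* one stack slot per parameter      */
static const size_t HOME_SLOTS  = 4;   /* spill space for rcx, rdx, r8, r9  */
static const size_t STACK_ALIGN = 16;

/* Hypothetical helper: bytes to subtract from rsp, given the largest
   argument count of any call the function makes and its bytes of locals.
   The result is the smallest value of the form 16n+8 that is big enough. */
static size_t frame_bytes(size_t max_outgoing_args, size_t local_bytes)
{
    size_t slots = max_outgoing_args < HOME_SLOTS ? HOME_SLOTS
                                                  : max_outgoing_args;
    size_t need = slots * SLOT_BYTES + local_bytes;
    size_t rem  = need % STACK_ALIGN;

    if (rem <= 8)
        return need + (8 - rem);            /* round up to ...8 in this block */
    return need + (STACK_ALIGN - rem) + 8;  /* next 16-byte block, plus 8     */
}
```

For CallThatFunction (five outgoing parameters, no locals) this gives 5 * 8 = 40 = 0x28, which is already of the form 16n+8. A function that passes only four parameters needs 32 bytes of spill space, and the rounding pushes that up to 0x28 as well.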

Now we can set up for the first call:

        mov     dword ptr [rsp+0x20], 5     ; output parameter 5
        mov     r9d, 4                      ; output parameter 4
        mov     r8d, 3                      ; output parameter 3
        mov     edx, 2                      ; output parameter 2
        mov     ecx, 1                      ; output parameter 1
        call    SomeFunction                ; Go Speed Racer!
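
The operand choices in that listing follow mechanically from the rules at the top. Here is a rough C sketch of the mapping for integer parameters only; the helper names arg_register and arg_stack_offset are hypothetical, and parameter numbers are 1-based to match the comments in the listing.

```c
#include <stddef.h>

/* Hypothetical helper: the 64-bit register that carries integer parameter
   `index` (1-based), or NULL if the parameter travels on the stack. */
static const char *arg_register(unsigned index)
{
    static const char *const regs[] = { "rcx", "rdx", "r8", "r9" };
    return (index >= 1 && index <= 4) ? regs[index - 1] : NULL;
}

/* Hypothetical helper: offset from the caller's rsp, just before the
   "call", of the stack slot that belongs to parameter `index` (1-based,
   must be at least 1). For parameters 1 through 4 this is the reserved
   spill slot; for parameter 5 and up it is where the value is stored. */
static unsigned arg_stack_offset(unsigned index)
{
    return (index - 1) * 8;
}
```

Parameter 5 has no register, and its slot sits at offset (5 - 1) * 8 = 0x20, which is where the listing stores it.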

When SomeFunction returns, the stack is not cleaned, so it still looks like it did above. To issue the second call, then, we just shove the new values into the space we already reserved:

        mov     dword ptr [rsp+0x20], 10    ; output parameter 5
        mov     r9d, 9                      ; output parameter 4
        mov     r8d, 8                      ; output parameter 3
        mov     edx, 7                      ; output parameter 2
        mov     ecx, 6                      ; output parameter 1
        call    SomeFunction                ; Go Speed Racer!

CallThatFunction is now finished and can clean its stack and return.

        add     rsp, 0x28
        ret

Notice that you see very few "push" instructions in amd64 code, since the paradigm is for the caller to reserve parameter space and keep re-using it.

[Updated 11:00am: Fixed some places where I said "ecx" and "edx" instead of "rcx" and "rdx"; thanks to Mike Dimmick for catching it.]


Comments (32)
  1. Mike Dimmick says:

    (Shouldn’t some of those references to ecx and edx be to rcx and rdx, i.e. doubleword registers?)

    I assume that using a single subtraction to adjust the stack for the whole duration of the function – including function call parameters – simplifies the exception unwind procedure.

    Context: SEH exceptions on AMD64 (for 64-bit programs) are table-based, NOT based on an exception handler chain at fs:[0] as on x86. Raymond, any idea why x86 is the only architecture which uses this frame-based exception handler chain?

  2. Raymond Chen says:

    Actually, that ecx and edx is correct, per rule 2: "Parameters that are smaller than 64 bits are not zero-extended." Since the parameters are 32-bit integers, the values are passed in ecx, edx, r8d and r9d. (r8d and r9d are the pseudo registers that represent the bottom 32 bits of the 64-bit r8 and r9 registers.)

    As to why x86 is the only platform that uses frame-based exception handling: I have no idea. Just further evidence that x86 is the weirdo.

  3. Raymond Chen says:

    Oh wait in the discussion paragraphs I used ecx and edx instead of rcx and rdx, right. Good catch, Mike.

  4. Phil Lucido says:

    You can only use a table-based exception handling scheme when you can reliably walk up the stack at any point. That’s not possible on x86 without breaking backward compatibility, given the profusion of private calling conventions used in code written in asm. I’ve always assumed that the original NT design team didn’t switch to a table-based scheme at the start because they needed to make it easy to port such code.

  5. Peter Lund says:

    (you still use ecx/edx in the code)

    Are the XMM registers ever used for parameter passing?

    Which function (caller/callee) should save which XMM registers?

  6. Raymond Chen says:

    ecx/edx: I discussed this in an earlier comment: http://weblogs.asp.net/oldnewthing/archive/2004/01/14/58579.aspx#58683

    I do not believe that the XMM registers are involved in parameter passing. I don’t have the XMM rules memorized; I’ll look them up when I get back from vacation.

  7. Peter Lund says:

    Oops, sorry about that, Raymond :(

  8. Raymond Chen says:

    In case anybody was still keeping score: I looked up the XMM rules. The XMM registers are used for passing floating point parameters. XMM0 through XMM3 receive the first four floating point parameters. They, as well as XMM4 and XMM5, are scratch. XMM6 through XMM15 are preserved.

  9. Raymond, I believe the second diagram is incorrect, the return address should be after the arguments as such:

    xxxxxxx8 .. rest of stack (minus arg area) ..

    xxxxxxx0 (arg5)

    xxxxxxx8 (arg4 spill)

    xxxxxxx0 (arg3 spill)

    xxxxxxx8 (arg2 spill)

    xxxxxxx0 (arg1 spill) <- “.. rest of stack ..” from first diagram

    xxxxxxx8 return address <- RSP upon entry to callee

    It is implied in your comments, but worth calling out independently, that leaf frames are not required to align the stack to 16 bytes.

  10. Raymond Chen says:

    Actually the diagram is correct – we’re just drawing different diagrams. My diagram is the stack layout of the caller *before* it calls the child function (and what’s more, a function that requires no stack space for locals). Your diagram is the stack layout of the child function immediately *after* the "call" instruction.

    Both diagrams are correct; they are just diagrams of different things.

    Good point about the stack alignment at the leaf.

  11. Ahh — I see, I misread the diagram as being that upon entry to the callee… My bad!

  12. Will says:

    Is there any technical documentation that goes over the AMD64 calling convention described in this blog? Not that the information above isn’t good enough, just that it’d be nice to see a reference. =)

  13. Flier's Sky says:

    The history of calling conventions

  14. Raymond Chen says:

    Commenting closes after two weeks. (Okay, I was late to this one.)

    http://weblogs.asp.net/oldnewthing/archive/2004/02/21/77681.aspx

  15. Ever since v1, corprof.idl has contained the following ominous comment above the typedefs for FunctionEnter/Leave/Tailcall….

  19. Official (though preliminary) documentation.

  20. I just passed a milestone: I just had my first experience where I desperately needed an answer and search…

  21. code-o-rama says:

    As you know if you are a Visual Studio Team System user, we provide two types of profilers with the product;

  23. Channel 9 says:

    Jeez, you expect an answer in 12 minutes?

Comments are closed.

