Don’t trust the return address, no really

Date: August 17, 2006 / year-entry #279
Tags: code
Orig Link: https://blogs.msdn.microsoft.com/oldnewthing/20060817-17/?p=30073
Comments: 18
Summary: In the discussion of how to prevent non-"trusted" DLLs from using private OS resources, more than one person suggested having the LoadLibrary or FindResource function behave differently depending on who the caller is. But we already saw that you can't trust the return address and that you definitely shouldn't use the return address to make...

In the discussion of how to prevent non-"trusted" DLLs from using private OS resources, more than one person suggested having the LoadLibrary or FindResource function behave differently depending on who the caller is. But we already saw that you can't trust the return address and that you definitely shouldn't use the return address to make a security decision (which is what these people are proposing).
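
To make the proposal concrete, here is roughly what such a return-address check might look like. This is an illustrative sketch, not actual OS code; IsProtectedModule and IsTrustedModule are hypothetical helpers that would consult some list of "trusted" DLLs.

    #include <windows.h>
    #include <intrin.h>
    #pragma intrinsic(_ReturnAddress)

    /* Hypothetical policy helpers. */
    BOOL IsProtectedModule(HMODULE hModule);
    BOOL IsTrustedModule(HMODULE hModule);

    HRSRC WINAPI FindResourceWithCallerCheck(
        HMODULE hModule, LPCWSTR lpName, LPCWSTR lpType)
    {
        HMODULE hCaller = NULL;

        /* Map the return address back to the module that (apparently) called us. */
        GetModuleHandleExW(GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS |
                           GET_MODULE_HANDLE_EX_FLAG_UNCHANGED_REFCOUNT,
                           (LPCWSTR)_ReturnAddress(), &hCaller);

        if (IsProtectedModule(hModule) && !IsTrustedModule(hCaller)) {
            SetLastError(ERROR_ACCESS_DENIED);
            return NULL;                     /* "untrusted" caller: refuse */
        }
        return FindResourceW(hModule, lpName, lpType);
    }

The catch, as the article goes on to explain, is that the return address tells you only where the call instruction lives, not on whose behalf it executed.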

All attackers have to do is find some other "trusted" code to do their dirty work. For example, the LoadString function internally calls FindResource to locate the appropriate string bundle. Therefore, if attackers want to get a string resource from a "trusted" DLL, they could use LoadString to do it, since LoadString will call FindResource, and FindResource will say, "Oh, my caller is USER32.DLL, which is trusted." Bingo, they just stole a string resource.
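
From the attacker's point of view, the middle-man trick is nothing more than an ordinary call. In this sketch, "protected.dll" and IDS_SECRET are hypothetical names standing in for the resource the attacker is after:

    #include <windows.h>

    #define IDS_SECRET 42   /* hypothetical string resource ID */

    void steal_a_string(void)
    {
        WCHAR buffer[256];
        HMODULE hProtected = LoadLibraryExW(L"protected.dll", NULL,
                                            LOAD_LIBRARY_AS_DATAFILE);
        if (hProtected != NULL) {
            /* The FindResource call happens inside user32.dll, so a
               return-address check sees a "trusted" caller. */
            LoadStringW(hProtected, IDS_SECRET, buffer, 256);
            FreeLibrary(hProtected);
        }
    }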

"Well, I could add that same check to LoadString."

I was just giving LoadString as an example of a "middle man" function that you can exploit. Sure, extra code could be added to LoadString to check its return address and reject attempts to load strings from "protected" libraries if the caller is "untrusted", but attackers would just look for some other middle man they could exploit. And even if you were diligent enough to protect all such potential middle-men, you still are vulnerable to the sort of stack-manipulation games that don't require anything from a "trusted" DLL aside from a return instruction. (And there are plenty of those.)
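
As an illustration of the sort of stack manipulation being alluded to, here is a rough sketch (32-bit x86, MSVC inline assembly; not production code). It assumes two globals set up elsewhere: g_pRetGadget, the address of a bare RET instruction found anywhere inside a "trusted" DLL, and g_pfnFindResourceA, the real FindResourceA obtained via GetProcAddress. FindResourceA then runs with a return address that points into the trusted module, even though no meaningful code from that module ever executes.

    #include <windows.h>

    void    *g_pRetGadget;        /* assumed: a lone RET (0xC3) inside, say, user32.dll */
    FARPROC  g_pfnFindResourceA;  /* assumed: GetProcAddress(hKernel32, "FindResourceA") */

    __declspec(naked) HRSRC __stdcall
    SpoofedFindResourceA(HMODULE hModule, LPCSTR lpName, LPCSTR lpType)
    {
        __asm {
            mov  eax, [esp]        ; our caller's real return address
            mov  ecx, [esp+4]      ; slide the three arguments down one slot...
            mov  [esp], ecx
            mov  ecx, [esp+8]
            mov  [esp+4], ecx
            mov  ecx, [esp+12]
            mov  [esp+8], ecx
            mov  [esp+12], eax     ; ...and park the real return address below them
            push g_pRetGadget      ; the "return address" FindResourceA will see
            jmp  dword ptr [g_pfnFindResourceA]
            ; FindResourceA's RET 0Ch lands on the gadget, whose RET then
            ; returns to our caller with the result still in EAX.
        }
    }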

No, you cannot impose security boundaries within a process. Once you let code run unchecked in your process, you have to treat the entire process as compromised. Even the parts that you thought were trustworthy.

Now, you might say, "Oh, we're not really making a security decision here. We just want to make circumventing the system so much hard work that somebody who goes to that much effort knows that they're doing something unsupported." But as commenter Duncan Bayne points out, that applies only to the first person to do it. They then make a library out of their technique, or publish it in a magazine article, and now anybody can use it without a struggle, and consequently without it crossing their mind that "Gosh, maybe this isn't such a great idea to use in production software."


Comments (18)
  1. new Poster() says:

    .Net uses code access security, where security decisions are made by walking up the call stack and performing access checks (obviously it’s more complicated than that).  Could you apply any of those principles to this problem?

  2. Jules says:

    The stuff in .net (and equivalents in Java, which also uses stack-frame based security contexts) works because the runtime environment makes it impossible to do funky stuff like modifying your stack to make it look like your function was called by a trusted one.

    It can’t work with native code, not so long as native code can freely modify its call stack (i.e. not on current x86 hardware).

  3. Jules says:

    I should comment that I’m only assuming .Net works like this, as I don’t have under-the-hood experience of it. I know Java does, though.

    See reference here: http://www.securingjava.com/chapter-three/

  4. Pazu says:

    The .NET "stack" is a different kind of beast from the native stack.

  5. BryanK says:

    But you can still overwrite the .Net stack’s data, right?  Not from .Net code perhaps (and very likely not from verifiable code), but certainly from native code, assuming you can figure out where the .Net stack is and what you have to modify.

    [Letting untrusted managed code execute untrusted native code would be a disaster for the reasons you describe. -Raymond]
  6. Craig Ringer says:

    The stack / return address is not the only problem, either. Code from different modules (DLLs / shared objects) loaded in the same program shares the same address space.  This means that if untrusted code executes, it can just access the interesting bits of your "trusted" data structures directly.

    Finding those data structures in the rather large (and sparse) address space can be a challenge, but not impossible. It’s often possible to scan the process’s memory looking for a signature that’s known to be constant, then use that plus an offset to find the data structure you need, or a pointer that’ll get you to it. People use this technique all the time when doing dodgy things like patching system call tables (see, e.g., binary-only modules that add new system calls to Linux kernels; and some rootkits). Sometimes it’s really embarrassingly easy – like when the untrusted code provides some data that’ll be stored at a fixed offset from the data it wants to modify.

    Given that, my understanding is that you can’t share an address space and still protect trusted code from untrusted (native) code. You have to launch the untrusted code in a separate process (to get a separate address space) and use an IPC mechanism over carefully checked APIs. Sound right?
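
    As an illustration of the signature scan described above (an editorial sketch, not anything from the OS), here is a minimal routine that walks the process’s own address space with VirtualQuery and looks for a known byte pattern; the signature, and the offset applied to a hit, would come from reverse-engineering the "trusted" module:

        #include <windows.h>
        #include <string.h>

        /* Scan this process's committed, readable memory for a byte signature. */
        void *FindSignature(const unsigned char *sig, SIZE_T sigLen)
        {
            MEMORY_BASIC_INFORMATION mbi;
            unsigned char *p = NULL;
            SIZE_T i;

            while (VirtualQuery(p, &mbi, sizeof(mbi)) == sizeof(mbi)) {
                if (mbi.State == MEM_COMMIT &&
                    !(mbi.Protect & (PAGE_NOACCESS | PAGE_GUARD))) {
                    unsigned char *base = (unsigned char *)mbi.BaseAddress;
                    for (i = 0; i + sigLen <= mbi.RegionSize; i++) {
                        if (memcmp(base + i, sig, sigLen) == 0)
                            return base + i;   /* apply a known offset from here */
                    }
                }
                p = (unsigned char *)mbi.BaseAddress + mbi.RegionSize;
            }
            return NULL;
        }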

  7. nksingh says:

    Makes sense…  The usual use of CAS is to wrap some system library functions with security implications in managed code.  You can trust the system library functions not to do anything weird to the stack.  

    You definitely aren’t securing anything if you let someone run native code that would mess with the managed stack.  I don’t think you could even use "unverifiable" managed code to do this because you don’t exactly have access to registers and you can’t find out your stack pointer.  I might be wrong, however, and the main thread’s stack might in fact be in a constant location in all managed processes.

  8. Mike Hearn says:

    Java and .NET use provable type systems, so this isn’t an issue. You can do the same thing for C/C++ code; see research by Lattner et al. into proving the memory safety of arbitrary C/C++ programs (it can often be done with only small modifications to the code).

    In the general case though you can only enforce security decisions at the process level. If you want to do anything more fine grained, you gotta split it into multiple processes using some kind of RPC. My dissertation at university was on this.

  9. BryanK says:

    [Letting untrusted managed code execute untrusted native code would be a disaster for the reasons you describe.]

    Oh, duh, that’s right; you can’t make P/Invoke calls to native code unless you’re already at the full-trust level.

    So it’d have to be one call stack that has full-trust permissions all the way down, making P/Invoke calls that trample on another call stack’s values (maybe in another thread?).  Sounds much more difficult to me, but perhaps not.

    Mike Hearn: I haven’t read that research on provable type systems; is it available somewhere online?

  10. new Poster() says:

    re: Mike: “In the general case though you can only enforce security decisions at the process level. If you want to do anything more fine grained, you gotta split it into multiple processes using some kind of RPC. ”

    Are you referring to unmanaged or managed code?

    In .net there are many different security boundaries, including processes, assemblies and appdomains. Different security evidence can be associated with each and different permission sets can be granted to each one. The assembly and appdomain boundaries all occur within the same process, so different security permission grants can occur within the same process, not just at the process boundary.

    [You’re talking about a specific case. Mike is talking about the general case. -Raymond]
  11. BryanK says:

    I wonder how .net does its "code access security" stuff.  (And you probably don’t know, but perhaps another commenter does?)  It seems to do a stack-walk (for at least some types of security demands) and inspects the "evidence" provided by each stack frame (e.g., where the code came from).  But if the stack is that easy to spoof, then I wonder how that’s supposed to be any more secure.

    Or maybe it only works if you run verifiable code, and throws SecurityExceptions otherwise?  Seems like non-verifiable stuff would be fairly common for innocuous stuff, though (e.g. manipulating bits in a 1-bit image), and you’d have to fail all security demands if the process had *ever* run any non-verifiable code, because you have no idea what that code did.

    Or maybe CAS is just a "marketing term" type of thing, not really anything that you’re supposed to use to make security decisions.  Hmm.

    (And obviously it doesn’t apply to Win32 APIs.)

  12. new Poster() says:

    Raymond,

    .Net is an example of a working system that does not require a separate process to enforce a security decision. This refutes the statement “you can only enforce security decisions at the process level”. I don’t quite see how making a distinction between a specific case versus the general case changes this; please explain. Thanks.

    [Sure, you can do it in specific cases, but you can’t do it in general. It appeared to me that in your comment you tried to use a specific case (.NET managed code) to refute a general statement. The general statement still stands. -Raymond]
  13. silkio says:

    .net does require a separate process; the .net vm. your app runs inside that.

  14. Ted says:

    All code in managed .NET is already "trusted" in that it obeys the rules of the framework runtime. The "security" checks are still a matter of the calling code telling the target that it has the necessary permissions, and the called code ‘trusts’ the calling code because it is dot net managed code.

    The framework takes away the ability for the caller to lie to the callee. Thus a ‘security’ check is simply a logic check and effectively is looking at an additional function param that is a bitfield of security permissions.

  15. Intel says:

    Obviously we need to not look at *who* is calling the function, but with what intentions. All processors should therefore extend the flags register with an Evil flag. When set, the process is up to no good and appropriate action can be taken.

  16. Mike Hearn says:

    BryanK – for checking memory safety of C programs see here:

    http://llvm.org/pubs/2005-02-TECS-SAFECode.html

    The RPC implementation I wrote for my dissertation was designed for this exact case – where you want to run, for example, the image decoder part of a chat program at a lower level of privilege to the rest of it. That way if the image decoder is compromised, the attacker has no useful privileges.

    So, unlike DCOM it was designed to be very high performance and easy to integrate with existing codebases (on the order of a few lines of code to split a function into a less privileged subprocess) at the expense of losing language neutrality (C/C++ only!), network transparency and portability (x86 only). But you don’t need complex language/network transparent type systems a la DCOM to split a program into multiple privilege-restricted processes.

    There were a bunch of other limitations to the sample implementation: Linux only, not thread safe. But those could be fixed with time.

    I haven’t put this anywhere online yet but once I complete the code that prevents malicious stack base manipulation I’ll be sending it to the SELinux and AppArmor mailing lists. Let me know if you’re interested.
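
    (Not Mike’s actual implementation, which isn’t published; just a minimal POSIX-flavored sketch of the pattern he describes: run a risky routine, here a hypothetical decode_image(), in a child process under an unprivileged user ID, shipping data in and results out over pipes.)

        #include <stdlib.h>
        #include <string.h>
        #include <unistd.h>
        #include <sys/types.h>
        #include <sys/wait.h>

        /* Hypothetical risky routine: decode untrusted bytes into a pixel buffer. */
        extern size_t decode_image(const char *in, size_t inlen, char *out, size_t outcap);

        /* Single-shot sketch: no error handling, small payloads only, and it
           assumes the process has the privilege to switch to an unprivileged UID. */
        size_t decode_image_sandboxed(const char *in, size_t inlen,
                                      char *out, size_t outcap)
        {
            int to_child[2], from_child[2];
            pid_t pid;
            ssize_t n;

            if (pipe(to_child) != 0 || pipe(from_child) != 0)
                return 0;

            pid = fork();
            if (pid == 0) {                    /* child: shed privileges, then decode */
                static char inbuf[1 << 16], outbuf[1 << 16];
                if (setgid(65534) != 0 || setuid(65534) != 0)   /* "nobody" */
                    _exit(1);
                n = read(to_child[0], inbuf, sizeof(inbuf));
                if (n > 0)
                    write(from_child[1], outbuf,
                          decode_image(inbuf, (size_t)n, outbuf, sizeof(outbuf)));
                _exit(0);
            }

            write(to_child[1], in, inlen);     /* parent: send data, read the result */
            close(to_child[1]);
            n = read(from_child[0], out, outcap);
            waitpid(pid, NULL, 0);
            return n > 0 ? (size_t)n : 0;
        }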

  17. Doug says:

    Actually, .NET does NOT require managed programs to follow the rules.  There are five different possible configurations for code running under the CLR:

    Managed, verifiable IL.

    Managed, unverifiable IL.

    Managed, native executable code.

    Unmanaged, unverifiable IL.

    Unmanaged, native executable code.

    Managed/Unmanaged refers to whether the code makes use of the CLR’s garbage collection and type systems.  You must do this to be verifiable, but if you don’t need to be verifiable, you don’t have to use these facilities.

    You might be surprised to learn that the CLR will happily run x86 assembly code.  That code may or may not use the CLR type system, and may or may not use garbage collection.

    You might be surprised to learn that IL contains instructions for arbitrary memory access: read from address X, write to address X, jump to address X.  These instructions are off-limits to verifiable IL, but IL does not have to be verifiable.  Straight C++ (including pointers, unchecked array access, and even buffer overflows) can be compiled to IL bytecode and run under the CLR.

    Before anybody panics, please realize that this does nothing to the safety of managed code.  If you do not have the full trust of the CLR, you can only run managed verifiable IL, and you can only call functions that are explicitly marked as "I am safe to call from untrusted code".

    The upshot of this is that fully trusted code in the CLR can do whatever it wants, subject only to the limits of the OS.  The CLR won’t get in its way unless you ask it to do so.  Untrusted code cannot do anything unsafe unless there is a security flaw in a trusted library that is marked as "can be called from untrusted code".

    The nice thing about the CLR is that you CAN ask it "please stop me from doing stupid things, even though I have full trust".  When you do this, you can very easily enumerate the places where you could possibly corrupt the process (i.e. the places where you do a PInvoke, use "unsafe" code, interop with COM, or use the Marshal class), and you can very carefully go over those few places to make extra sure they are bug free.

  18. Mike Dimmick says:

    To be clear on the .NET issue: the default for code executing from the local machine is Full Trust – nonverifiable code can be loaded and in effect, Code Access Security is disabled. For code running from a network share, however, CAS is on, and the program can’t use P/Invoke, unsafe code, etc directly. Indeed, it can’t open files directly from the file system either, only through the use of the OpenFileDialog’s OpenFile method – it cannot get access to the actual file name.

Comments are closed.

