Let's start by creating an example:
// foo.c
int foo() { return 42; }
// bar.c
#include <stdio.h>
extern int foo();
int bar()
{
  printf(" %s:%d: &foo = %p\n", __FILE__, __LINE__, &foo);
  return foo();
}
// main.c
#include <stdio.h>
extern int foo();
extern int bar();
int main()
{
  printf("%s:%d: &foo = %p\n", __FILE__, __LINE__, &foo);
  return bar();
}
Build it with:
gcc -g -fPIC -shared -o foo.so foo.c &&
gcc -g -fPIC -shared -o bar.so bar.c &&
gcc -g main.c ./bar.so ./foo.so -no-pie
(-no-pie is not required, but it makes debugging easier.)
$ ./a.out
main.c:7: &foo = 0x7f3d66a180f9
bar.c:6: &foo = 0x7f3d66a180f9
Voilà, it works. Now we are ready to answer "how does the magic happen?"
First, let's look at the disassembly of main:
gdb -q ./a.out
Reading symbols from ./a.out...
(gdb) disas main
Dump of assembler code for function main:
0x0000000000401136 <+0>: push %rbp
0x0000000000401137 <+1>: mov %rsp,%rbp
0x000000000040113a <+4>: mov 0x2e9f(%rip),%rax # 0x403fe0
0x0000000000401141 <+11>: mov %rax,%rcx
0x0000000000401144 <+14>: mov $0x7,%edx
0x0000000000401149 <+19>: lea 0xeb4(%rip),%rax # 0x402004
0x0000000000401150 <+26>: mov %rax,%rsi
0x0000000000401153 <+29>: lea 0xeb1(%rip),%rax # 0x40200b
0x000000000040115a <+36>: mov %rax,%rdi
0x000000000040115d <+39>: mov $0x0,%eax
0x0000000000401162 <+44>: callq 0x401040 <printf@plt>
0x0000000000401167 <+49>: mov $0x0,%eax
0x000000000040116c <+54>: callq 0x401030 <bar@plt>
0x0000000000401171 <+59>: pop %rbp
0x0000000000401172 <+60>: retq
Here we can see that the last argument to printf comes from loading the value stored at address 0x403fe0. What is that address?
readelf -WS a.out | grep '\.got'
[22] .got PROGBITS 0000000000403fd0 002fd0 000018 08 WA 0 0 8
[23] .got.plt PROGBITS 0000000000403fe8 002fe8 000028 08 WA 0 0 8
Evidently that address is &.got[2]. How does the value get there? Back to GDB:
(gdb) watch *(void**)0x403fe0
Hardware watchpoint 1: *(void**)0x403fe0
(gdb) run
Starting program: /tmp/shlib/a.out
Hardware watchpoint 1: *(void**)0x403fe0
Old value = (void *) 0x0
New value = (void *) 0x7ffff7fba0f9
elf_dynamic_do_Rela (skip_ifunc=<optimized out>, lazy=<optimized out>, nrelative=<optimized out>, relsize=<optimized out>, reladdr=<optimized out>, scope=<optimized out>, map=0x7ffff7ffe2e0) at ../sysdeps/x86_64/dl-machine.h:408
408 ../sysdeps/x86_64/dl-machine.h: No such file or directory.
(gdb) bt
#0 elf_dynamic_do_Rela (skip_ifunc=<optimized out>, lazy=<optimized out>, nrelative=<optimized out>, relsize=<optimized out>, reladdr=<optimized out>, scope=<optimized out>, map=0x7ffff7ffe2e0) at ../sysdeps/x86_64/dl-machine.h:408
#1 _dl_relocate_object (l=l@entry=0x7ffff7ffe2e0, scope=<optimized out>, reloc_mode=<optimized out>, consider_profiling=<optimized out>, consider_profiling@entry=0) at ./elf/dl-reloc.c:301
#2 0x00007ffff7fe8c09 in dl_main (phdr=<optimized out>, phnum=<optimized out>, user_entry=<optimized out>, auxv=<optimized out>) at ./elf/rtld.c:2322
#3 0x00007ffff7fe519f in _dl_sysdep_start (start_argptr=start_argptr@entry=0x7fffffffd960, dl_main=dl_main@entry=0x7ffff7fe6e10 <dl_main>) at ../sysdeps/unix/sysv/linux/dl-sysdep.c:140
#4 0x00007ffff7fe6b1c in _dl_start_final (arg=<error reading variable: Cannot access memory at address 0xffffd8c8>) at ./elf/rtld.c:497
#5 _dl_start (arg=<optimized out>) at ./elf/rtld.c:584
#6 0x00007ffff7fe59c8 in _start () from /lib64/ld-linux-x86-64.so.2
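So the runtime loader put that value there as part of relocating a.out (in frame 1 you can see that l->addr == 0 and l->name == "", which correspond to the main executable).
What causes the loader to resolve foo even though it has not been called?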
readelf -Wr a.out | egrep 'foo|bar'
0000000000403fe0 0000000500000006 R_X86_64_GLOB_DAT 0000000000000000 foo + 0
0000000000404000 0000000200000007 R_X86_64_JUMP_SLOT 0000000000000000 bar + 0
Here you can see that calling a function (bar here) and taking the address of a function (foo here) produce different relocation records.
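JUMP_SLOT relocations can be resolved lazily (when the function is first called), but GLOB_DAT relocations cannot. The loader must resolve all GLOB_DAT relocations at load time, and that is exactly what it does.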
Similarly, in bar.so we have:
gdb -q ./bar.so
(gdb) disas bar
0x0000000000001109 <+0>: push %rbp
0x000000000000110a <+1>: mov %rsp,%rbp
0x000000000000110d <+4>: mov 0x2ebc(%rip),%rax # 0x3fd0
0x0000000000001114 <+11>: mov %rax,%rcx
0x0000000000001117 <+14>: mov $0x6,%edx
0x000000000000111c <+19>: lea 0xedd(%rip),%rax # 0x2000
...
readelf -Wr bar.so | grep foo
0000000000003fd0 0000000400000006 R_X86_64_GLOB_DAT 0000000000000000 foo + 0
readelf -WS bar.so | grep '\.got'
[11] .plt.got PROGBITS 0000000000001040 001040 000010 08 AX 0 0 8
[20] .got PROGBITS 0000000000003fc0 002fc0 000028 08 WA 0 0 8
[21] .got.plt PROGBITS 0000000000003fe8 002fe8 000020 08 WA 0 0 8
So at load time, &foo is also stored into &bar.so:.got[2].
Side note: we can also look at the output of readelf -Wr a.out bar.so to see which other relocations exist and why it is the third GOT slot that gets filled with &foo:
File: a.out
Relocation section '.rela.dyn' at offset 0x4f0 contains 3 entries:
Offset Info Type Symbol's Value Symbol's Name + Addend
0000000000403fd0 0000000100000006 R_X86_64_GLOB_DAT 0000000000000000 __libc_start_main@GLIBC_2.34 + 0
0000000000403fd8 0000000400000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0
0000000000403fe0 0000000500000006 R_X86_64_GLOB_DAT 0000000000000000 foo + 0
...
File: bar.so
Relocation section '.rela.dyn' at offset 0x3f8 contains 8 entries:
Offset Info Type Symbol's Value Symbol's Name + Addend
0000000000003df0 0000000000000008 R_X86_64_RELATIVE 1100
0000000000003df8 0000000000000008 R_X86_64_RELATIVE 10c0
0000000000004008 0000000000000008 R_X86_64_RELATIVE 4008
0000000000003fc0 0000000100000006 R_X86_64_GLOB_DAT 0000000000000000 _ITM_deregisterTMCloneTable + 0
0000000000003fc8 0000000300000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0
0000000000003fd0 0000000400000006 R_X86_64_GLOB_DAT 0000000000000000 foo + 0
0000000000003fd8 0000000500000006 R_X86_64_GLOB_DAT 0000000000000000 _ITM_registerTMCloneTable + 0
0000000000003fe0 0000000600000006 R_X86_64_GLOB_DAT 0000000000000000 __cxa_finalize@GLIBC_2.2.5 + 0
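It is a coincidence that these two relocation records both land in the third slot of .got; the slots could just as easily have been different.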