
Load-time relocation of shared libraries

October 5, 2013 ⁄ General

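This is a translated article about the load-time relocation technique. The translation is imperfect, so corrections are welcome!

Free PDF download: http://ishare.iask.sina.com.cn/f/35236483.html

Or: http://wenku.baidu.com/view/d67a3108a6c30c2259019e6a.html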

This article aims to explain how a modern operating system makes it possible to use shared libraries with load-time relocation. It focuses on the Linux OS running on 32-bit x86, but the general principles apply to other OSes and CPUs as well.

Note that shared libraries go by many names: shared libraries, shared objects, dynamic shared objects (DSOs), and dynamically linked libraries (DLLs, if you’re coming from a Windows background). For the sake of consistency, I will try to just use the name "shared library" throughout this article.

Loading executables

Linux, like other OSes with virtual memory support, loads executables at a fixed virtual memory address (conventionally 0x08048000 for 32-bit x86 Linux executables and 0x00400000 on Windows). If we examine the ELF header of some random executable, we’ll see an Entry point address:

$ readelf -h /usr/bin/uptime
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  [...] some header fields
  Entry point address:               0x8048470
  [...] some header fields

 

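The linker sets the entry point address to tell the operating system where to start executing the program. If we debug the program with GDB, we find that 0x8048470 is the address of the first instruction of the program’s text segment (.text).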

What this means is that the linker, when linking the executable, can fully resolve all internal symbol references (to functions and data) to fixed and final locations. The linker does some relocations of its own [2], but eventually the output it produces contains no additional relocations.

Or does it? Note that I emphasized the word internal in the previous paragraph. As long as the executable needs no shared libraries [3], it needs no relocations. But if it does use shared libraries (as do the vast majority of Linux applications), symbols taken from these shared libraries need to be relocated, because of how shared libraries are loaded.

Loading shared libraries

Unlike executables, when shared libraries are being built, the linker can’t assume a known load address for their code. The reason for this is simple. Each program can use any number of shared libraries, and there’s simply no way to know in advance where any given shared library will be loaded in the process’s virtual memory. Many solutions were invented for this problem over the years, but in this article I will just focus on the ones currently used by Linux.

But first, let’s briefly examine the problem. Here’s some sample C code [4] which I compile into a shared library:

int myglob = 42;

int ml_func(int a, int b)
{
    myglob += a;
    return b + myglob;
}

 

Note how ml_func references myglob a few times. When translated to x86 assembly, this will involve a mov instruction to pull the value of myglob from its location in memory into a register. mov requires an absolute address, so how does the linker know which address to place in it? The answer is: it doesn’t. As I mentioned above, shared libraries have no predefined load address; it will be decided at runtime.

In Linux, the dynamic loader [5] is a piece of code responsible for preparing programs for running. One of its tasks is to load shared libraries from disk into memory when the running executable requests them. When a shared library is loaded into memory, it is then adjusted for its newly determined load location. It is the job of the dynamic loader to solve the problem presented in the previous paragraph.

There are two main approaches to solve this problem in Linux ELF shared libraries:

  1. Load-time relocation
  2. Position independent code (PIC)

Although PIC is the more common and nowadays-recommended solution, in this article I will focus on load-time relocation. Eventually I plan to cover both approaches and write a separate article on PIC, and I think starting with load-time relocation will make PIC easier to explain later. (Update 03.11.2011: the article about PIC was published)

Linking the shared library for load-time relocation

To create a shared library that has to be relocated at load-time, I’ll compile it without the -fPIC flag (which would otherwise trigger PIC generation):

gcc -g -c ml_main.c -o ml_mainreloc.o
gcc -shared -o libmlreloc.so ml_mainreloc.o

 

The first interesting thing to see is the entry point of libmlreloc.so:

$ readelf -h libmlreloc.so
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  [...] some header fields
  Entry point address:               0x3b0
  [...] some header fields

For simplicity, the linker just links the shared object for address 0x0 (the .text section starting at 0x3b0), knowing that the loader will move it anyway. Keep this fact in mind; it will be useful later in the article.

Now let’s look at the disassembly of the shared library, focusing on ml_func:

$ objdump -d -Mintel libmlreloc.so

libmlreloc.so:     file format elf32-i386

[...] skipping stuff

0000046c <ml_func>:
 46c: 55                      push   ebp
 46d: 89 e5                   mov    ebp,esp
 46f: a1 00 00 00 00          mov    eax,ds:0x0
 474: 03 45 08                add    eax,DWORD PTR [ebp+0x8]
 477: a3 00 00 00 00          mov    ds:0x0,eax
 47c: a1 00 00 00 00          mov    eax,ds:0x0
 481: 03 45 0c                add    eax,DWORD PTR [ebp+0xc]
 484: 5d                      pop    ebp
 485: c3                      ret

[...] skipping stuff

After the first two instructions, which are part of the prologue [6], we see the compiled version of myglob += a [7]. The value of myglob is taken from memory into eax, incremented by a (which is at ebp+0x8) and then placed back into memory.

But wait, the mov takes myglob? Why? It appears that the actual operand of mov is just 0x0 [8]. What gives? This is how relocations work. The linker places some provisional predefined value (0x0 in this case) into the instruction stream, and then creates a special relocation entry pointing to this place. Let’s examine the relocation entries for this shared library:

$ readelf -r libmlreloc.so

Relocation section '.rel.dyn' at offset 0x2fc contains 7 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
00002008  00000008 R_386_RELATIVE
00000470  00000401 R_386_32          0000200C   myglob
00000478  00000401 R_386_32          0000200C   myglob
0000047d  00000401 R_386_32          0000200C   myglob
[...] skipping stuff

The .rel.dyn section of ELF is reserved for dynamic (load-time) relocations, to be consumed by the dynamic loader. There are 3 relocation entries for myglob in the section shown above, since there are 3 references to myglob in the disassembly. Let’s decipher the first one.

It says: go to offset 0x470 in this object (shared library), and apply a relocation of type R_386_32 to it for the symbol myglob. If we consult the ELF spec, we see that relocation type R_386_32 means: take the value at the offset specified in the entry, add the address of the symbol to it, and place it back into the offset.

What do we have at offset 0x470 in the object? Recall this instruction from the disassembly of ml_func:

46f:  a1 00 00 00 00          mov    eax,ds:0x0

a1 encodes the mov instruction, so its operand starts at the next address, which is 0x470. This is the 0x0 we see in the disassembly. So back to the relocation entry, we now see it says: add the address of myglob to the operand of that mov instruction. In other words, it tells the dynamic loader: once you perform actual address assignment, put the real address of myglob into 0x470, thus replacing the operand of mov by the correct symbol value. Neat, huh?
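The R_386_32 rule described above can be sketched in a few lines of C. This is only an illustration, not the dynamic loader’s actual code: the function names apply_r_386_32, load_le32 and store_le32 are made up for the example, and the symbol address 0x13000c used in the note below is just a sample run-time address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian word (x86 byte order) from p. */
uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Write v to p as a 32-bit little-endian word. */
void store_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Hypothetical helper: apply an R_386_32 relocation to the 4 bytes at
 * `place`, given the run-time address of the symbol it refers to. */
void apply_r_386_32(uint8_t *place, uint32_t sym_addr)
{
    assert(place != NULL);
    uint32_t value = load_le32(place);    /* what the linker left there (0x0 here) */
    store_le32(place, value + sym_addr);  /* add the symbol address, store it back */
}
```

Given the instruction bytes a1 00 00 00 00, calling apply_r_386_32 on the byte right after the a1 opcode with a symbol address of 0x13000c leaves a1 0c 00 13 00, that is, a mov whose operand is the symbol’s address.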

 

Note also the "Sym. Value" column in the relocation section, which contains 0x200C for myglob. This is the offset of myglob in the virtual memory image of the shared library (which, recall, the linker assumes is just loaded at 0x0). This value can also be examined by looking at the symbol table of the library, for example with nm:

$ nm libmlreloc.so
[...] skipping stuff
0000200c D myglob

This output also provides the offset of myglob inside the library. D means the symbol is in the initialized data section (.data).
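One column of the readelf -r listing not discussed above is Info. In 32-bit ELF it packs two fields: the upper 24 bits are the index of the symbol in the dynamic symbol table, and the low 8 bits are the relocation type. The sketch below decodes it; the function and enum names are made up for this example, while the type numbers (1, 2 and 8) are the ones the i386 ABI assigns to R_386_32, R_386_PC32 and R_386_RELATIVE.

```c
#include <assert.h>
#include <stdint.h>

/* Relocation type numbers defined for i386 by the ELF ABI. */
enum {
    RELOC_386_32       = 1,
    RELOC_386_PC32     = 2,
    RELOC_386_RELATIVE = 8
};

/* Upper 24 bits of Info: index of the symbol in the dynamic symbol table. */
uint32_t reloc_sym_index(uint32_t info)
{
    return info >> 8;
}

/* Low 8 bits of Info: the relocation type. */
uint32_t reloc_type(uint32_t info)
{
    return info & 0xffu;
}

/* Inverse operation: pack a symbol index and a type into an Info value. */
uint32_t reloc_make_info(uint32_t sym_index, uint32_t type)
{
    assert(sym_index <= 0xffffffu && type <= 0xffu);
    return (sym_index << 8) | type;
}
```

Applied to the listing above, Info 00000401 decodes to symbol index 4 with type 1 (R_386_32), and Info 00000008 decodes to symbol index 0 with type 8 (R_386_RELATIVE), which matches the Type column.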

 

Load-time relocation in action

To see load-time relocation in action, I will use our shared library from a simple driver executable. When running this executable, the OS will load the shared library and relocate it appropriately.

Curiously, due to the address space layout randomization feature, which is enabled in Linux, relocation is relatively difficult to follow, because every time I run the executable, the libmlreloc.so shared library gets placed at a different virtual memory address [9].

This is a rather weak deterrent, however. There is a way to make sense of it all. But first, let’s talk about the segments our shared library consists of:

$ readelf --segments libmlreloc.so

Elf file type is DYN (Shared object file)
Entry point 0x3b0
There are 6 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000000 0x00000000 0x00000000 0x004e8 0x004e8 R E 0x1000
  LOAD           0x000f04 0x00001f04 0x00001f04 0x0010c 0x00114 RW  0x1000
  DYNAMIC        0x000f18 0x00001f18 0x00001f18 0x000d0 0x000d0 RW  0x4
  NOTE           0x0000f4 0x000000f4 0x000000f4 0x00024 0x00024 R   0x4
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4
  GNU_RELRO      0x000f04 0x00001f04 0x00001f04 0x000fc 0x000fc R   0x1

 Section to Segment mapping:
  Segment Sections...
   00     .note.gnu.build-id .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .eh_frame
   01     .ctors .dtors .jcr .dynamic .got .got.plt .data .bss
   02     .dynamic
   03     .note.gnu.build-id
   04
   05     .ctors .dtors .jcr .dynamic .got

To follow the myglob symbol, we’re interested in the second segment listed here. Note a couple of things:

  • In the section to segment mapping at the bottom, segment 01 is said to contain the .data section, which is the home of myglob.
  • The VirtAddr column specifies that the second segment starts at 0x1f04 and has size 0x10c, meaning that it extends until 0x2010 and thus contains myglob, which is at 0x200C.
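The range check in the second bullet is simple enough to write down. The following is a minimal sketch with made-up helper names (segment_end, segment_contains); it uses the start address and file size read from the program header listing above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* First address past the end of a segment that starts at vaddr. */
uint32_t segment_end(uint32_t vaddr, uint32_t size)
{
    assert(size <= UINT32_MAX - vaddr);  /* the range must not wrap around */
    return vaddr + size;
}

/* True if addr lies in the half-open range [vaddr, vaddr + size). */
bool segment_contains(uint32_t vaddr, uint32_t size, uint32_t addr)
{
    return addr >= vaddr && addr < segment_end(vaddr, size);
}
```

With the values from the listing, segment_end(0x1f04, 0x10c) is 0x2010 and segment_contains(0x1f04, 0x10c, 0x200c) is true, while the first LOAD segment (start 0x0, size 0x4e8) does not contain 0x200c.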

 

Now let’s use a nice tool Linux gives us to examine the load-time linking process: the dl_iterate_phdr function, which allows an application to inquire at runtime which shared libraries it has loaded and, more importantly, take a peek at their program headers.

So I’m going to write the following code into driver.c:

#define _GNU_SOURCE
#include <link.h>
#include <stdlib.h>
#include <stdio.h>


static int header_handler(struct dl_phdr_info* info, size_t size, void* data)
{
    printf("name=%s (%d segments) address=%p\n",
            info->dlpi_name, info->dlpi_phnum, (void*)info->dlpi_addr);
    for (int j = 0; j < info->dlpi_phnum; j++) {
         printf("\t\t header %2d: address=%10p\n", j,
             (void*) (info->dlpi_addr + info->dlpi_phdr[j].p_vaddr));
         printf("\t\t\t type=%u, flags=0x%X\n",
                 info->dlpi_phdr[j].p_type, info->dlpi_phdr[j].p_flags);
    }
    printf("\n");
    return 0;
}


extern int ml_func(int, int);


int main(int argc, const char* argv[])
{
    dl_iterate_phdr(header_handler, NULL);

    int t = ml_func(argc, argc);
    return t;
}

header_handler implements the callback for dl_iterate_phdr. It will get called for all libraries and report their names and load addresses, along with all their segments. The driver’s main function also invokes ml_func, which is taken from the libmlreloc.so shared library.

To compile and link this driver with our shared library, run:

gcc -g -c driver.c -o driver.o
gcc -o driver driver.o -L. -lmlreloc

Running the driver stand-alone, we get the information, but for each run the addresses are different. So what I’m going to do is run it under gdb [10], see what it says, and then use gdb to further query the process’s memory space:

$ gdb -q driver
 Reading symbols from driver...done.
 (gdb) b driver.c:31
 Breakpoint 1 at 0x804869e: file driver.c, line 31.
 (gdb) r
 Starting program: driver
 [...] skipping output
 name=./libmlreloc.so (6 segments) address=0x12e000
                header  0: address=  0x12e000
                        type=1, flags=0x5
                header  1: address=  0x12ff04
                        type=1, flags=0x6
                header  2: address=  0x12ff18
                        type=2, flags=0x6
                header  3: address=  0x12e0f4
                        type=4, flags=0x4
                header  4: address=  0x12e000
                        type=1685382481, flags=0x6
                header  5: address=  0x12ff04
                        type=1685382482, flags=0x4

[...] skipping output
 Breakpoint 1, main (argc=1, argv=0xbffff3d4) at driver.c:31
 31    }
 (gdb)

Since driver reports all the libraries it loads (even implicitly, like libc or the dynamic loader itself), the output is lengthy and I will just focus on the report about libmlreloc.so. Note that the 6 segments are the same segments reported by readelf, but this time relocated into their final memory locations.
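The numeric type and flags values printed by the driver can be mapped back to the names readelf shows. The sketch below does that for the values that appear in this output; the helper names phdr_type_name and phdr_flags_string are made up for the example, while the numeric constants are the standard ELF ones (1, 2 and 4 for LOAD, DYNAMIC and NOTE; 0x6474e551 and 0x6474e552 for GNU_STACK and GNU_RELRO; flag bits 4, 2 and 1 for read, write and execute).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Name of a program header type, for the types seen in the output above. */
const char *phdr_type_name(uint32_t p_type)
{
    switch (p_type) {
    case 1u:          return "LOAD";
    case 2u:          return "DYNAMIC";
    case 4u:          return "NOTE";
    case 0x6474e551u: return "GNU_STACK";   /* 1685382481 in decimal */
    case 0x6474e552u: return "GNU_RELRO";   /* 1685382482 in decimal */
    default:          return "OTHER";
    }
}

/* Render segment flags the way readelf does, e.g. 0x5 becomes "R E".
 * out must have room for 4 bytes. */
void phdr_flags_string(uint32_t p_flags, char out[4])
{
    assert(out != NULL);
    out[0] = (p_flags & 4u) ? 'R' : ' ';
    out[1] = (p_flags & 2u) ? 'W' : ' ';
    out[2] = (p_flags & 1u) ? 'E' : ' ';
    out[3] = '\0';
}
```

Under this mapping, header 0 (type=1, flags=0x5) is the LOAD segment marked R E, and header 1 (type=1, flags=0x6) is the LOAD segment marked RW, in the same order as the readelf listing.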

 

Let’s do some math. The output says libmlreloc.so was placed at virtual address 0x12e000. We’re interested in the second segment, which, as we’ve seen in readelf, is at offset 0x1f04. Indeed, we see in the output it was loaded to address 0x12ff04. And since myglob is at offset 0x200c in the file, we’d expect it to now be at address 0x13000c.
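The arithmetic above is a single addition: run-time address = load base + link-time address. A tiny sketch, with a made-up helper name runtime_address:

```c
#include <assert.h>
#include <stdint.h>

/* Run-time address of something the linker placed at link_time_addr,
 * once the library has been loaded at load_base. */
uint32_t runtime_address(uint32_t load_base, uint32_t link_time_addr)
{
    assert(link_time_addr <= UINT32_MAX - load_base);  /* no 32-bit wrap-around */
    return load_base + link_time_addr;
}
```

With the load base 0x12e000 from this run, runtime_address(0x12e000, 0x1f04) gives 0x12ff04 for the second segment and runtime_address(0x12e000, 0x200c) gives 0x13000c for myglob.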

 

So, let’s ask GDB:

(gdb) p &myglob
$1 = (int *) 0x13000c

Excellent! But what about the code of ml_func which refers to myglob? Let’s ask GDB again:

(gdb) set disassembly-flavor intel
(gdb) disas ml_func
Dump of assembler code for function ml_func:
   0x0012e46c <+0>:   push   ebp
   0x0012e46d <+1>:   mov    ebp,esp
   0x0012e46f <+3>:   mov    eax,ds:0x13000c
   0x0012e474 <+8>:   add    eax,DWORD PTR [ebp+0x8]
   0x0012e477 <+11>:  mov    ds:0x13000c,eax
   0x0012e47c <+16>:  mov    eax,ds:0x13000c
   0x0012e481 <+21>:  add    eax,DWORD PTR [ebp+0xc]
   0x0012e484 <+24>:  pop    ebp
   0x0012e485 <+25>:  ret
End of assembler dump.

As expected, the real address of myglob was placed in all the mov instructions referring to it, just as the relocation entries specified.

Relocating function calls

So far this article demonstrated relocation of data references, using the global variable myglob as an example. Another thing that needs to be relocated is code references; in other words, function calls. This section is a brief guide on how this gets done. The pace is much faster than in the rest of this article, since I can now assume the reader understands what relocation is all about.

Without further ado, let’s get to it. I’ve modified the code of the shared library to be the following:

int myglob = 42;

int ml_util_func(int a)
{
    return a + 1;
}

int ml_func(int a, int b)
{
    int c = b + ml_util_func(a);
    myglob += c;
    return b + myglob;
}

ml_util_func was added and it’s being used by ml_func. Here’s the disassembly of ml_func in the linked shared library:

000004a7 <ml_func>:
 4a7:   55                      push   ebp
 4a8:   89 e5                   mov    ebp,esp
 4aa:   83 ec 14                sub    esp,0x14
 4ad:   8b 45 08                mov    eax,DWORD PTR [ebp+0x8]
 4b0:   89 04 24                mov    DWORD PTR [esp],eax
 4b3:   e8 fc ff ff ff          call   4b4 <ml_func+0xd>
 4b8:   03 45 0c                add    eax,DWORD PTR [ebp+0xc]
 4bb:   89 45 fc                mov    DWORD PTR [ebp-0x4],eax
 4be:   a1 00 00 00 00          mov    eax,ds:0x0
 4c3:   03 45 fc                add    eax,DWORD PTR [ebp-0x4]
 4c6:   a3 00 00 00 00          mov    ds:0x0,eax
 4cb:   a1 00 00 00 00          mov    eax,ds:0x0
 4d0:   03 45 0c                add    eax,DWORD PTR [ebp+0xc]
 4d3:   c9                      leave
 4d4:   c3                      ret

What’s interesting here is the instruction at address 0x4b3: it’s the call to ml_util_func. Let’s dissect it:

e8 is the opcode for call. The argument of this call is the offset relative to the next instruction. In the disassembly above, this argument is 0xfffffffc, or simply -4. So the call currently points to itself. This clearly isn’t right, but let’s not forget about relocation. Here’s what the relocation section of the shared library looks like now:

$ readelf -r libmlreloc.so

Relocation section '.rel.dyn' at offset 0x324 contains 8 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
00002008  00000008 R_386_RELATIVE
000004b4  00000502 R_386_PC32        0000049c   ml_util_func
000004bf  00000401 R_386_32          0000200c   myglob
000004c7  00000401 R_386_32          0000200c   myglob
000004cc  00000401 R_386_32          0000200c   myglob
[...] skipping stuff

If we compare it to the previous invocation of readelf -r, we’ll notice a new entry added for ml_util_func. This entry points at address 0x4b4, which is the argument of the call instruction, and its type is R_386_PC32. This relocation type is more complicated than R_386_32, but not by much.

It means the following: take the value at the offset specified in the entry, add the address of the symbol to it, subtract the address of the offset itself, and place it back into the word at the offset. Recall that this relocation is done at load-time, when the final load addresses of the symbol and the relocated offset itself are already known. These final addresses participate in the computation.
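The R_386_PC32 rule can be sketched the same way, together with a small decoder for the call instruction so the effect can be checked. This is an illustration under stated assumptions rather than the loader’s real code: the function names are made up, and the load base 0x12e000 mentioned in the note below is borrowed from the earlier run purely as an example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian word (x86 byte order) from p. */
uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Write v to p as a 32-bit little-endian word. */
void store_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Hypothetical helper: apply R_386_PC32 to the 4 bytes at `place`.
 * place_addr is the run-time address of those bytes and sym_addr the
 * run-time address of the symbol. Unsigned arithmetic wraps modulo 2^32,
 * which yields the correct two's-complement displacement. */
void apply_r_386_pc32(uint8_t *place, uint32_t place_addr, uint32_t sym_addr)
{
    assert(place != NULL);
    uint32_t value = load_le32(place);                 /* existing value (-4 here) */
    store_le32(place, value + sym_addr - place_addr);  /* value + symbol - place */
}

/* Where a `call rel32` (opcode e8) located at insn_addr transfers control:
 * the displacement is relative to the next instruction, 5 bytes further on. */
uint32_t call_rel32_target(const uint8_t *insn, uint32_t insn_addr)
{
    assert(insn != NULL && insn[0] == 0xe8);
    return insn_addr + 5u + load_le32(insn + 1);
}
```

Before relocation, the bytes e8 fc ff ff ff at 0x4b3 decode to a target of 0x4b4, the call’s own operand. Assuming a load base of 0x12e000, applying the relocation with the place at 0x12e4b4 and ml_util_func at 0x12e49c changes the operand to 0xffffffe4 (that is, -0x1c), and the call at 0x12e4b3 then targets 0x12e49c.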

 

What does this do? Basically, it’s a relative relocation, taking its location into account and thus suitable for arguments of instructions with relative addressing (which the e8 call
