Bootstrap

认识链接器

这时候我们还有两个疑问,一个是 inculde<stido.h> 和 call printf 并没有引入文件,为什么可以运行。

	.file	"demo.c"  // 文件名称
	.section	.rodata  
.LC0:
	.string	"Hello, World! "  
	.text               // 代码段  只读  数据为何数据代码段?
	.globl	main        // globl 全局
	.type	main, @function  // 类型 函数
main:                 // 函数首地址
	pushq	%rbp          // 开辟栈帧
	movq	%rsp, %rbp    // 开辟栈帧
	movl	$.LC0, %edi   // 将字符串移动到 edi 寄存器中
	call	puts          // 调用 puts 函数
	movl	$0, %eax      // 将 0 移动到 eax 寄存器中,清空 eax 寄存器用来存放函数返回值
	popq	%rbp          // 销毁栈
	ret                 // 销毁栈
	.size	main, .-main
	.ident	"GCC: (GNU) 4.8.5 20150623 (Red Hat 4.8.5-36)"
	.section	.note.GNU-stack,"",@progbits

puts 函数我们并没有定义,他是从何而来,这个时候我们就需要引入操作系统。

CPU引入了保护模式,和操作系统配合,保护了整个计算机,那么我们一个普通的应用程序需要去操作硬件和CPU,都需要经过操作系统,那么操作系统就需要提供一个入口,让上层去访问,然后操作系统在内部实现调用硬件,这里我们可以类比架构。

一个Web页面或者是APP,因为数据是存放到后台的,他需要后台与后台交互实现它的功能,所以后台就说,如果你需要访问我的数据,那么你就按照我定义的规则来访问我,然后我处理之后将数据交由你。

而这个规则就是写一个 接口比如说 /user/add 表示功能新增一个用户。

然后我对这个接口发起了请求,后端服务就需要从如何接受请求,如何再返回,中间出现的问题都需要进行处理,每个接口都这么写,就很烦,所以一门语言创造出来就需要对基础通用的功能进行封装,这里每个语言都需要 HTTP协议 的实现,所以每个语言还需要提供他的库函数。

而操作系统也有这样的接口,把这些接口都成为 系统调用 以 sys_ 开头,定义的规范。

而C语言是有标准的,ANSI C 和 GUN C 而 libc 则是 ANSI C 的库函数,glibc 则是 GUN C 的库函数。

而现在我就去调用这个函数,这个函数封装的了系统调用 。由此我们使用函数库间接的调用了操作系统提供的接口。这些都是公共的接口,不需要重复编写。

那么这些接口会被汇总到一个头文件中,我们引入这个头文件即可使用。最后 GCC将这些源文件进行解析,然后再编译成汇编文件。但是计算机识别的二进制文件,那么我们最后还需要将汇编文件转化为二进制可执行的文件,那么在各大操作系统中是不一样的,比如说 windows 是 exe ,UNIX 是 .out

将这些宏处理,和头文件首先和源文件进行解析汇总生成一个 .i 文件

然后将处理之后的文件编译成汇编代码生成一个 .s 文件

最后将这个 .s 文件转化为可执行文件也就是 .out .os .o

最后由汇编器编译成最终的目标文件,既然要编译,就需要指定一个规范,而这个规范就是

ELF (Executable and Linking Format)

但是这个接口是需要实现,那么他是怎么找到这个实现的呢?

两个代码是独立的文件,我们就需要找到对方的函数在内存中的地址是多少,就需要一个映射关系,使用地址和方法入口来找到 调用的指令的真实地址在哪里。

这就是链接器的任务,将符号解析,并在生成文件找到真实的地址。

链接器:符号解析+代码重定位

我们使用 readelf 命令来查看 ELF 格式,使用objdump 命令查询二进制反汇编

Usage: readelf <option(s)> elf-file(s)
 Display information about the contents of ELF format files
 Options are:
  -a --all               Equivalent to: -h -l -S -s -r -d -V -A -I
  -h --file-header       Display the ELF file header
  -l --program-headers   Display the program headers
     --segments          An alias for --program-headers
  -S --section-headers   Display the sections' header
     --sections          An alias for --section-headers
  -g --section-groups    Display the section groups
  -t --section-details   Display the section details
  -e --headers           Equivalent to: -h -l -S
  -s --syms              Display the symbol table
     --symbols           An alias for --syms
  --dyn-syms             Display the dynamic symbol table
  -n --notes             Display the core notes (if present)
  -r --relocs            Display the relocations (if present)
  -u --unwind            Display the unwind info (if present)
  -d --dynamic           Display the dynamic section (if present)
  -V --version-info      Display the version sections (if present)
  -A --arch-specific     Display architecture specific information (if any)
  -c --archive-index     Display the symbol/file index in an archive
  -D --use-dynamic       Use the dynamic section info when displaying symbols
  -x --hex-dump=<number|name>
                         Dump the contents of section <number|name> as bytes
  -p --string-dump=<number|name>
                         Dump the contents of section <number|name> as strings
  -R --relocated-dump=<number|name>
                         Dump the contents of section <number|name> as relocated bytes
  -z --decompress        Decompress section before dumping it
  -w[lLiaprmfFsoRt] or
  --debug-dump[=rawline,=decodedline,=info,=abbrev,=pubnames,=aranges,=macro,=frames,
               =frames-interp,=str,=loc,=Ranges,=pubtypes,
               =gdb_index,=trace_info,=trace_abbrev,=trace_aranges,
               =addr,=cu_index]
                         Display the contents of DWARF2 debug sections
  --dwarf-depth=N        Do not display DIEs at depth N or greater
  --dwarf-start=N        Display DIEs starting with N, at the same depth
                         or deeper
  -I --histogram         Display histogram of bucket list lengths
  -W --wide              Allow output width to exceed 80 characters
  @<file>                Read options from <file>
  -H --help              Display this information
  -v --version           Display the version number of readelf

编译并连接 gcc test.c

#include<stdio.h>

int main() {
    printf("%s", "hello world!");
    return 1;
}

readlef -a a.out

ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 // 魔数
  Class:                             ELF64 // 文件类型
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file) //可执行文件
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x400430
  Start of program headers:          64 (bytes into file)
  Start of section headers:          6456 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         9
  Size of section headers:           64 (bytes)
  Number of section headers:         31
  Section header string table index: 30

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .interp           PROGBITS         0000000000400238  00000238
       000000000000001c  0000000000000000   A       0     0     1
  [ 2] .note.ABI-tag     NOTE             0000000000400254  00000254
       0000000000000020  0000000000000000   A       0     0     4
  [ 3] .note.gnu.build-i NOTE             0000000000400274  00000274
       0000000000000024  0000000000000000   A       0     0     4
  [ 4] .gnu.hash         GNU_HASH         0000000000400298  00000298
       000000000000001c  0000000000000000   A       5     0     8
  [ 5] .dynsym           DYNSYM           00000000004002b8  000002b8
       0000000000000060  0000000000000018   A       6     1     8
  [ 6] .dynstr           STRTAB           0000000000400318  00000318
       000000000000003f  0000000000000000   A       0     0     1
  [ 7] .gnu.version      VERSYM           0000000000400358  00000358
       0000000000000008  0000000000000002   A       5     0     2
  [ 8] .gnu.version_r    VERNEED          0000000000400360  00000360
       0000000000000020  0000000000000000   A       6     1     8
  [ 9] .rela.dyn         RELA             0000000000400380  00000380
       0000000000000018  0000000000000018   A       5     0     8
  [10] .rela.plt         RELA             0000000000400398  00000398
       0000000000000030  0000000000000018  AI       5    24     8
  [11] .init             PROGBITS         00000000004003c8  000003c8
       000000000000001a  0000000000000000  AX       0     0     4
  [12] .plt              PROGBITS         00000000004003f0  000003f0
       0000000000000030  0000000000000010  AX       0     0     16
  [13] .plt.got          PROGBITS         0000000000400420  00000420
       0000000000000008  0000000000000000  AX       0     0     8
  [14] .text             PROGBITS         0000000000400430  00000430
       0000000000000182  0000000000000000  AX       0     0     16
  [15] .fini             PROGBITS         00000000004005b4  000005b4
       0000000000000009  0000000000000000  AX       0     0     4
  [16] .rodata           PROGBITS         00000000004005c0  000005c0
       0000000000000020  0000000000000000   A       0     0     8
  [17] .eh_frame_hdr     PROGBITS         00000000004005e0  000005e0
       0000000000000034  0000000000000000   A       0     0     4
  [18] .eh_frame         PROGBITS         0000000000400618  00000618
       00000000000000f4  0000000000000000   A       0     0     8
  [19] .init_array       INIT_ARRAY       0000000000600e10  00000e10
       0000000000000008  0000000000000008  WA       0     0     8
  [20] .fini_array       FINI_ARRAY       0000000000600e18  00000e18
       0000000000000008  0000000000000008  WA       0     0     8
  [21] .jcr              PROGBITS         0000000000600e20  00000e20
       0000000000000008  0000000000000000  WA       0     0     8
  [22] .dynamic          DYNAMIC          0000000000600e28  00000e28
       00000000000001d0  0000000000000010  WA       6     0     8
  [23] .got              PROGBITS         0000000000600ff8  00000ff8
       0000000000000008  0000000000000008  WA       0     0     8
  [24] .got.plt          PROGBITS         0000000000601000  00001000
       0000000000000028  0000000000000008  WA       0     0     8
  [25] .data             PROGBITS         0000000000601028  00001028
       0000000000000004  0000000000000000  WA       0     0     1
  [26] .bss              NOBITS           000000000060102c  0000102c
       0000000000000004  0000000000000000  WA       0     0     1
  [27] .comment          PROGBITS         0000000000000000  0000102c
       000000000000002d  0000000000000001  MS       0     0     1
  [28] .symtab           SYMTAB           0000000000000000  00001060
       0000000000000600  0000000000000018          29    47     8
  [29] .strtab           STRTAB           0000000000000000  00001660
       00000000000001cb  0000000000000000           0     0     1
  [30] .shstrtab         STRTAB           0000000000000000  0000182b
       000000000000010c  0000000000000000           0     0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  l (large), p (processor specific)

There are no section groups in this file.

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000400040 0x0000000000400040
                 0x00000000000001f8 0x00000000000001f8  R E    8
  INTERP         0x0000000000000238 0x0000000000400238 0x0000000000400238
                 0x000000000000001c 0x000000000000001c  R      1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x000000000000070c 0x000000000000070c  R E    200000
  LOAD           0x0000000000000e10 0x0000000000600e10 0x0000000000600e10
                 0x000000000000021c 0x0000000000000220  RW     200000
  DYNAMIC        0x0000000000000e28 0x0000000000600e28 0x0000000000600e28
                 0x00000000000001d0 0x00000000000001d0  RW     8
  NOTE           0x0000000000000254 0x0000000000400254 0x0000000000400254
                 0x0000000000000044 0x0000000000000044  R      4
  GNU_EH_FRAME   0x00000000000005e0 0x00000000004005e0 0x00000000004005e0
                 0x0000000000000034 0x0000000000000034  R      4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     10
  GNU_RELRO      0x0000000000000e10 0x0000000000600e10 0x0000000000600e10
                 0x00000000000001f0 0x00000000000001f0  R      1

 Section to Segment mapping:
  Segment Sections...
   00     
   01     .interp 
   02     .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .plt.got .text .fini .rodata .eh_frame_hdr .eh_frame 
   03     .init_array .fini_array .jcr .dynamic .got .got.plt .data .bss 
   04     .dynamic 
   05     .note.ABI-tag .note.gnu.build-id 
   06     .eh_frame_hdr 
   07     
   08     .init_array .fini_array .jcr .dynamic .got 

Dynamic section at offset 0xe28 contains 24 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000c (INIT)               0x4003c8
 0x000000000000000d (FINI)               0x4005b4
 0x0000000000000019 (INIT_ARRAY)         0x600e10
 0x000000000000001b (INIT_ARRAYSZ)       8 (bytes)
 0x000000000000001a (FINI_ARRAY)         0x600e18
 0x000000000000001c (FINI_ARRAYSZ)       8 (bytes)
 0x000000006ffffef5 (GNU_HASH)           0x400298
 0x0000000000000005 (STRTAB)             0x400318
 0x0000000000000006 (SYMTAB)             0x4002b8
 0x000000000000000a (STRSZ)              63 (bytes)
 0x000000000000000b (SYMENT)             24 (bytes)
 0x0000000000000015 (DEBUG)              0x0
 0x0000000000000003 (PLTGOT)             0x601000
 0x0000000000000002 (PLTRELSZ)           48 (bytes)
 0x0000000000000014 (PLTREL)             RELA
 0x0000000000000017 (JMPREL)             0x400398
 0x0000000000000007 (RELA)               0x400380
 0x0000000000000008 (RELASZ)             24 (bytes)
 0x0000000000000009 (RELAENT)            24 (bytes)
 0x000000006ffffffe (VERNEED)            0x400360
 0x000000006fffffff (VERNEEDNUM)         1
 0x000000006ffffff0 (VERSYM)             0x400358
 0x0000000000000000 (NULL)               0x0

Relocation section '.rela.dyn' at offset 0x380 contains 1 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000600ff8  000300000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0

Relocation section '.rela.plt' at offset 0x398 contains 2 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000601018  000100000007 R_X86_64_JUMP_SLO 0000000000000000 printf@GLIBC_2.2.5 + 0
000000601020  000200000007 R_X86_64_JUMP_SLO 0000000000000000 __libc_start_main@GLIBC_2.2.5 + 0

The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported.

Symbol table '.dynsym' contains 4 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND printf@GLIBC_2.2.5 (2)
     2: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __libc_start_main@GLIBC_2.2.5 (2)
     3: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __gmon_start__

Symbol table '.symtab' contains 64 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000400238     0 SECTION LOCAL  DEFAULT    1 
     2: 0000000000400254     0 SECTION LOCAL  DEFAULT    2 
     3: 0000000000400274     0 SECTION LOCAL  DEFAULT    3 
     4: 0000000000400298     0 SECTION LOCAL  DEFAULT    4 
     5: 00000000004002b8     0 SECTION LOCAL  DEFAULT    5 
     6: 0000000000400318     0 SECTION LOCAL  DEFAULT    6 
     7: 0000000000400358     0 SECTION LOCAL  DEFAULT    7 
     8: 0000000000400360     0 SECTION LOCAL  DEFAULT    8 
     9: 0000000000400380     0 SECTION LOCAL  DEFAULT    9 
    10: 0000000000400398     0 SECTION LOCAL  DEFAULT   10 
    11: 00000000004003c8     0 SECTION LOCAL  DEFAULT   11 
    12: 00000000004003f0     0 SECTION LOCAL  DEFAULT   12 
    13: 0000000000400420     0 SECTION LOCAL  DEFAULT   13 
    14: 0000000000400430     0 SECTION LOCAL  DEFAULT   14 
    15: 00000000004005b4     0 SECTION LOCAL  DEFAULT   15 
    16: 00000000004005c0     0 SECTION LOCAL  DEFAULT   16 
    17: 00000000004005e0     0 SECTION LOCAL  DEFAULT   17 
    18: 0000000000400618     0 SECTION LOCAL  DEFAULT   18 
    19: 0000000000600e10     0 SECTION LOCAL  DEFAULT   19 
    20: 0000000000600e18     0 SECTION LOCAL  DEFAULT   20 
    21: 0000000000600e20     0 SECTION LOCAL  DEFAULT   21 
    22: 0000000000600e28     0 SECTION LOCAL  DEFAULT   22 
    23: 0000000000600ff8     0 SECTION LOCAL  DEFAULT   23 
    24: 0000000000601000     0 SECTION LOCAL  DEFAULT   24 
    25: 0000000000601028     0 SECTION LOCAL  DEFAULT   25 
    26: 000000000060102c     0 SECTION LOCAL  DEFAULT   26 
    27: 0000000000000000     0 SECTION LOCAL  DEFAULT   27 
    28: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS crtstuff.c
    29: 0000000000600e20     0 OBJECT  LOCAL  DEFAULT   21 __JCR_LIST__
    30: 0000000000400460     0 FUNC    LOCAL  DEFAULT   14 deregister_tm_clones
    31: 0000000000400490     0 FUNC    LOCAL  DEFAULT   14 register_tm_clones
    32: 00000000004004d0     0 FUNC    LOCAL  DEFAULT   14 __do_global_dtors_aux
    33: 000000000060102c     1 OBJECT  LOCAL  DEFAULT   26 completed.6355
    34: 0000000000600e18     0 OBJECT  LOCAL  DEFAULT   20 __do_global_dtors_aux_fin
    35: 00000000004004f0     0 FUNC    LOCAL  DEFAULT   14 frame_dummy
    36: 0000000000600e10     0 OBJECT  LOCAL  DEFAULT   19 __frame_dummy_init_array_
    37: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS test.c
    38: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS crtstuff.c
    39: 0000000000400708     0 OBJECT  LOCAL  DEFAULT   18 __FRAME_END__
    40: 0000000000600e20     0 OBJECT  LOCAL  DEFAULT   21 __JCR_END__
    41: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS 
    42: 0000000000600e18     0 NOTYPE  LOCAL  DEFAULT   19 __init_array_end
    43: 0000000000600e28     0 OBJECT  LOCAL  DEFAULT   22 _DYNAMIC
    44: 0000000000600e10     0 NOTYPE  LOCAL  DEFAULT   19 __init_array_start
    45: 00000000004005e0     0 NOTYPE  LOCAL  DEFAULT   17 __GNU_EH_FRAME_HDR
    46: 0000000000601000     0 OBJECT  LOCAL  DEFAULT   24 _GLOBAL_OFFSET_TABLE_
    47: 00000000004005b0     2 FUNC    GLOBAL DEFAULT   14 __libc_csu_fini
    48: 0000000000601028     0 NOTYPE  WEAK   DEFAULT   25 data_start
    49: 000000000060102c     0 NOTYPE  GLOBAL DEFAULT   25 _edata
    50: 00000000004005b4     0 FUNC    GLOBAL DEFAULT   15 _fini
    51: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND printf@@GLIBC_2.2.5
    52: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __libc_start_main@@GLIBC_
    53: 0000000000601028     0 NOTYPE  GLOBAL DEFAULT   25 __data_start
    54: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __gmon_start__
    55: 00000000004005c8     0 OBJECT  GLOBAL HIDDEN    16 __dso_handle
    56: 00000000004005c0     4 OBJECT  GLOBAL DEFAULT   16 _IO_stdin_used
    57: 0000000000400540   101 FUNC    GLOBAL DEFAULT   14 __libc_csu_init
    58: 0000000000601030     0 NOTYPE  GLOBAL DEFAULT   26 _end
    59: 0000000000400430     0 FUNC    GLOBAL DEFAULT   14 _start
    60: 000000000060102c     0 NOTYPE  GLOBAL DEFAULT   26 __bss_start
    61: 000000000040051d    31 FUNC    GLOBAL DEFAULT   14 main
    62: 0000000000601030     0 OBJECT  GLOBAL HIDDEN    25 __TMC_END__
    63: 00000000004003c8     0 FUNC    GLOBAL DEFAULT   11 _init

Version symbols section '.gnu.version' contains 4 entries:
 Addr: 0000000000400358  Offset: 0x000358  Link: 5 (.dynsym)
  000:   0 (*local*)       2 (GLIBC_2.2.5)   2 (GLIBC_2.2.5)   0 (*local*)    

Version needs section '.gnu.version_r' contains 1 entries:
 Addr: 0x0000000000400360  Offset: 0x000360  Link: 6 (.dynstr)
  000000: Version: 1  File: libc.so.6  Cnt: 1
  0x0010:   Name: GLIBC_2.2.5  Flags: none  Version: 2

Displaying notes found at file offset 0x00000254 with length 0x00000020:
  Owner                 Data size	Description
  GNU                  0x00000010	NT_GNU_ABI_TAG (ABI version tag)
    OS: Linux, ABI: 2.6.32

Displaying notes found at file offset 0x00000274 with length 0x00000024:
  Owner                 Data size	Description
  GNU                  0x00000014	NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: 4d903e1d996808c09e73c09ebeb0335e806b13db

objdump -d a.out

a.out:     file format elf64-x86-64


Disassembly of section .init:

00000000004003c8 <_init>:
  4003c8:	48 83 ec 08          	sub    $0x8,%rsp
  4003cc:	48 8b 05 25 0c 20 00 	mov    0x200c25(%rip),%rax        # 600ff8 <__gmon_start__>
  4003d3:	48 85 c0             	test   %rax,%rax
  4003d6:	74 05                	je     4003dd <_init+0x15>
  4003d8:	e8 43 00 00 00       	callq  400420 <.plt.got>
  4003dd:	48 83 c4 08          	add    $0x8,%rsp
  4003e1:	c3                   	retq   

Disassembly of section .plt:

00000000004003f0 <.plt>:
  4003f0:	ff 35 12 0c 20 00    	pushq  0x200c12(%rip)        # 601008 <_GLOBAL_OFFSET_TABLE_+0x8>
  4003f6:	ff 25 14 0c 20 00    	jmpq   *0x200c14(%rip)        # 601010 <_GLOBAL_OFFSET_TABLE_+0x10>
  4003fc:	0f 1f 40 00          	nopl   0x0(%rax)

0000000000400400 <printf@plt>:
  400400:	ff 25 12 0c 20 00    	jmpq   *0x200c12(%rip)        # 601018 <printf@GLIBC_2.2.5>
  400406:	68 00 00 00 00       	pushq  $0x0
  40040b:	e9 e0 ff ff ff       	jmpq   4003f0 <.plt>

0000000000400410 <__libc_start_main@plt>:
  400410:	ff 25 0a 0c 20 00    	jmpq   *0x200c0a(%rip)        # 601020 <__libc_start_main@GLIBC_2.2.5>
  400416:	68 01 00 00 00       	pushq  $0x1
  40041b:	e9 d0 ff ff ff       	jmpq   4003f0 <.plt>

Disassembly of section .plt.got:

0000000000400420 <.plt.got>:
  400420:	ff 25 d2 0b 20 00    	jmpq   *0x200bd2(%rip)        # 600ff8 <__gmon_start__>
  400426:	66 90                	xchg   %ax,%ax

Disassembly of section .text:

0000000000400430 <_start>:
  400430:	31 ed                	xor    %ebp,%ebp
  400432:	49 89 d1             	mov    %rdx,%r9
  400435:	5e                   	pop    %rsi
  400436:	48 89 e2             	mov    %rsp,%rdx
  400439:	48 83 e4 f0          	and    $0xfffffffffffffff0,%rsp
  40043d:	50                   	push   %rax
  40043e:	54                   	push   %rsp
  40043f:	49 c7 c0 b0 05 40 00 	mov    $0x4005b0,%r8
  400446:	48 c7 c1 40 05 40 00 	mov    $0x400540,%rcx
  40044d:	48 c7 c7 1d 05 40 00 	mov    $0x40051d,%rdi
  400454:	e8 b7 ff ff ff       	callq  400410 <__libc_start_main@plt>
  400459:	f4                   	hlt    
  40045a:	66 0f 1f 44 00 00    	nopw   0x0(%rax,%rax,1)

0000000000400460 <deregister_tm_clones>:
  400460:	b8 37 10 60 00       	mov    $0x601037,%eax
  400465:	55                   	push   %rbp
  400466:	48 2d 30 10 60 00    	sub    $0x601030,%rax
  40046c:	48 83 f8 0e          	cmp    $0xe,%rax
  400470:	48 89 e5             	mov    %rsp,%rbp
  400473:	77 02                	ja     400477 <deregister_tm_clones+0x17>
  400475:	5d                   	pop    %rbp
  400476:	c3                   	retq   
  400477:	b8 00 00 00 00       	mov    $0x0,%eax
  40047c:	48 85 c0             	test   %rax,%rax
  40047f:	74 f4                	je     400475 <deregister_tm_clones+0x15>
  400481:	5d                   	pop    %rbp
  400482:	bf 30 10 60 00       	mov    $0x601030,%edi
  400487:	ff e0                	jmpq   *%rax
  400489:	0f 1f 80 00 00 00 00 	nopl   0x0(%rax)

0000000000400490 <register_tm_clones>:
  400490:	b8 30 10 60 00       	mov    $0x601030,%eax
  400495:	55                   	push   %rbp
  400496:	48 2d 30 10 60 00    	sub    $0x601030,%rax
  40049c:	48 c1 f8 03          	sar    $0x3,%rax
  4004a0:	48 89 e5             	mov    %rsp,%rbp
  4004a3:	48 89 c2             	mov    %rax,%rdx
  4004a6:	48 c1 ea 3f          	shr    $0x3f,%rdx
  4004aa:	48 01 d0             	add    %rdx,%rax
  4004ad:	48 d1 f8             	sar    %rax
  4004b0:	75 02                	jne    4004b4 <register_tm_clones+0x24>
  4004b2:	5d                   	pop    %rbp
  4004b3:	c3                   	retq   
  4004b4:	ba 00 00 00 00       	mov    $0x0,%edx
  4004b9:	48 85 d2             	test   %rdx,%rdx
  4004bc:	74 f4                	je     4004b2 <register_tm_clones+0x22>
  4004be:	5d                   	pop    %rbp
  4004bf:	48 89 c6             	mov    %rax,%rsi
  4004c2:	bf 30 10 60 00       	mov    $0x601030,%edi
  4004c7:	ff e2                	jmpq   *%rdx
  4004c9:	0f 1f 80 00 00 00 00 	nopl   0x0(%rax)

00000000004004d0 <__do_global_dtors_aux>:
  4004d0:	80 3d 55 0b 20 00 00 	cmpb   $0x0,0x200b55(%rip)        # 60102c <_edata>
  4004d7:	75 11                	jne    4004ea <__do_global_dtors_aux+0x1a>
  4004d9:	55                   	push   %rbp
  4004da:	48 89 e5             	mov    %rsp,%rbp
  4004dd:	e8 7e ff ff ff       	callq  400460 <deregister_tm_clones>
  4004e2:	5d                   	pop    %rbp
  4004e3:	c6 05 42 0b 20 00 01 	movb   $0x1,0x200b42(%rip)        # 60102c <_edata>
  4004ea:	f3 c3                	repz retq 
  4004ec:	0f 1f 40 00          	nopl   0x0(%rax)

00000000004004f0 <frame_dummy>:
  4004f0:	48 83 3d 28 09 20 00 	cmpq   $0x0,0x200928(%rip)        # 600e20 <__JCR_END__>
  4004f7:	00 
  4004f8:	74 1e                	je     400518 <frame_dummy+0x28>
  4004fa:	b8 00 00 00 00       	mov    $0x0,%eax
  4004ff:	48 85 c0             	test   %rax,%rax
  400502:	74 14                	je     400518 <frame_dummy+0x28>
  400504:	55                   	push   %rbp
  400505:	bf 20 0e 60 00       	mov    $0x600e20,%edi
  40050a:	48 89 e5             	mov    %rsp,%rbp
  40050d:	ff d0                	callq  *%rax
  40050f:	5d                   	pop    %rbp
  400510:	e9 7b ff ff ff       	jmpq   400490 <register_tm_clones>
  400515:	0f 1f 00             	nopl   (%rax)
  400518:	e9 73 ff ff ff       	jmpq   400490 <register_tm_clones>

000000000040051d <main>:
  40051d:	55                   	push   %rbp
  40051e:	48 89 e5             	mov    %rsp,%rbp
  400521:	be d0 05 40 00       	mov    $0x4005d0,%esi
  400526:	bf dd 05 40 00       	mov    $0x4005dd,%edi
  40052b:	b8 00 00 00 00       	mov    $0x0,%eax
  400530:	e8 cb fe ff ff       	callq  400400 <printf@plt>
  400535:	b8 01 00 00 00       	mov    $0x1,%eax
  40053a:	5d                   	pop    %rbp
  40053b:	c3                   	retq   
  40053c:	0f 1f 40 00          	nopl   0x0(%rax)

0000000000400540 <__libc_csu_init>:
  400540:	41 57                	push   %r15
  400542:	41 89 ff             	mov    %edi,%r15d
  400545:	41 56                	push   %r14
  400547:	49 89 f6             	mov    %rsi,%r14
  40054a:	41 55                	push   %r13
  40054c:	49 89 d5             	mov    %rdx,%r13
  40054f:	41 54                	push   %r12
  400551:	4c 8d 25 b8 08 20 00 	lea    0x2008b8(%rip),%r12        # 600e10 <__frame_dummy_init_array_entry>
  400558:	55                   	push   %rbp
  400559:	48 8d 2d b8 08 20 00 	lea    0x2008b8(%rip),%rbp        # 600e18 <__init_array_end>
  400560:	53                   	push   %rbx
  400561:	4c 29 e5             	sub    %r12,%rbp
  400564:	31 db                	xor    %ebx,%ebx
  400566:	48 c1 fd 03          	sar    $0x3,%rbp
  40056a:	48 83 ec 08          	sub    $0x8,%rsp
  40056e:	e8 55 fe ff ff       	callq  4003c8 <_init>
  400573:	48 85 ed             	test   %rbp,%rbp
  400576:	74 1e                	je     400596 <__libc_csu_init+0x56>
  400578:	0f 1f 84 00 00 00 00 	nopl   0x0(%rax,%rax,1)
  40057f:	00 
  400580:	4c 89 ea             	mov    %r13,%rdx
  400583:	4c 89 f6             	mov    %r14,%rsi
  400586:	44 89 ff             	mov    %r15d,%edi
  400589:	41 ff 14 dc          	callq  *(%r12,%rbx,8)
  40058d:	48 83 c3 01          	add    $0x1,%rbx
  400591:	48 39 eb             	cmp    %rbp,%rbx
  400594:	75 ea                	jne    400580 <__libc_csu_init+0x40>
  400596:	48 83 c4 08          	add    $0x8,%rsp
  40059a:	5b                   	pop    %rbx
  40059b:	5d                   	pop    %rbp
  40059c:	41 5c                	pop    %r12
  40059e:	41 5d                	pop    %r13
  4005a0:	41 5e                	pop    %r14
  4005a2:	41 5f                	pop    %r15
  4005a4:	c3                   	retq   
  4005a5:	90                   	nop
  4005a6:	66 2e 0f 1f 84 00 00 	nopw   %cs:0x0(%rax,%rax,1)
  4005ad:	00 00 00 

00000000004005b0 <__libc_csu_fini>:
  4005b0:	f3 c3                	repz retq 

Disassembly of section .fini:

00000000004005b4 <_fini>:
  4005b4:	48 83 ec 08          	sub    $0x8,%rsp
  4005b8:	48 83 c4 08          	add    $0x8,%rsp
  4005bc:	c3                   	retq

引用外部函数,但是不链接 gcc -c test.c ,生成可执行文件,就得连接因为这里的sum只是声明并没有实现所以无法生成可执行文件。

#include<stdio.h>

extern int sum(int a, int b);

int main() {
  printf("%d", sum(1, 2));
  printf("%s", "hello world!");
  return 1;
}

这里我们使用命令来查看ELF文件格式 readelf -a test.o

ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              REL (Relocatable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          0 (bytes into file)
  Start of section headers:          832 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           0 (bytes)
  Number of program headers:         0
  Size of section headers:           64 (bytes)
  Number of section headers:         13
  Section header string table index: 12

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .text             PROGBITS         0000000000000000  00000040
       000000000000003f  0000000000000000  AX       0     0     1
  [ 2] .rela.text        RELA             0000000000000000  00000230
       0000000000000090  0000000000000018   I      10     1     8
  [ 3] .data             PROGBITS         0000000000000000  0000007f
       0000000000000000  0000000000000000  WA       0     0     1
  [ 4] .bss              NOBITS           0000000000000000  0000007f
       0000000000000000  0000000000000000  WA       0     0     1
  [ 5] .rodata           PROGBITS         0000000000000000  0000007f
       0000000000000013  0000000000000000   A       0     0     1
  [ 6] .comment          PROGBITS         0000000000000000  00000092
       000000000000002e  0000000000000001  MS       0     0     1
  [ 7] .note.GNU-stack   PROGBITS         0000000000000000  000000c0
       0000000000000000  0000000000000000           0     0     1
  [ 8] .eh_frame         PROGBITS         0000000000000000  000000c0
       0000000000000038  0000000000000000   A       0     0     8
  [ 9] .rela.eh_frame    RELA             0000000000000000  000002c0
       0000000000000018  0000000000000018   I      10     8     8
  [10] .symtab           SYMTAB           0000000000000000  000000f8
       0000000000000120  0000000000000018          11     9     8
  [11] .strtab           STRTAB           0000000000000000  00000218
       0000000000000018  0000000000000000           0     0     1
  [12] .shstrtab         STRTAB           0000000000000000  000002d8
       0000000000000061  0000000000000000           0     0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  l (large), p (processor specific)

There are no section groups in this file.

There are no program headers in this file.

Relocation section '.rela.text' at offset 0x230 contains 6 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
00000000000f  000a00000002 R_X86_64_PC32     0000000000000000 sum - 4
000000000016  00050000000a R_X86_64_32       0000000000000000 .rodata + 0
000000000020  000b00000002 R_X86_64_PC32     0000000000000000 printf - 4
000000000025  00050000000a R_X86_64_32       0000000000000000 .rodata + 3
00000000002a  00050000000a R_X86_64_32       0000000000000000 .rodata + 10
000000000034  000b00000002 R_X86_64_PC32     0000000000000000 printf - 4

Relocation section '.rela.eh_frame' at offset 0x2c0 contains 1 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000020  000200000002 R_X86_64_PC32     0000000000000000 .text + 0

The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported.

Symbol table '.symtab' contains 12 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS data.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    3 
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    4 
     5: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 
     6: 0000000000000000     0 SECTION LOCAL  DEFAULT    7 
     7: 0000000000000000     0 SECTION LOCAL  DEFAULT    8 
     8: 0000000000000000     0 SECTION LOCAL  DEFAULT    6 
     9: 0000000000000000    63 FUNC    GLOBAL DEFAULT    1 main
    10: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND sum
    11: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND printf

使用 objdump -d test.o

Disassembly of section .text:
// 左边是真实的指令,右边是符号
0000000000000000 <main>:
   0:	55                   	push   %rbp
   1:	48 89 e5             	mov    %rsp,%rbp
   4:	48 83 ec 10          	sub    $0x10,%rsp
   8:	be 02 00 00 00       	mov    $0x2,%esi
   d:	bf 01 00 00 00       	mov    $0x1,%edi
  12:	b8 00 00 00 00       	mov    $0x0,%eax
  17:	e8 00 00 00 00       	callq  1c <main+0x1c>
  1c:	89 45 fc             	mov    %eax,-0x4(%rbp)
  1f:	b8 01 00 00 00       	mov    $0x1,%eax
  24:	c9                   	leaveq 
  25:	c3                   	retq

生成符号表,如果我们将其中一个符号表删除,他是不是就找不到映射关系了。就无法进行链接了。

使用 strip sum.o 命令即可删除符号表。

这就是静态链接,我从另一个符号表中,找到我需要的函数标记,将对应的地址找到,替换我的符号。

#include<stdio.h>

int main() {
  int a = sum(1, 2);
  return 1;
}
int sum(int a, int b) {
  return a + b;
}

gcc -c test.c

Relocation section '.rela.text' at offset 0x1d8 contains 1 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000018  000900000002 R_X86_64_PC32     0000000000000000 sum - 4

Relocation section '.rela.eh_frame' at offset 0x1f0 contains 1 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000020  000200000002 R_X86_64_PC32     0000000000000000 .text + 0

gcc -c test.c sum.o

Relocation section '.rela.dyn' at offset 0x360 contains 1 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000600ff8  000200000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0

Relocation section '.rela.plt' at offset 0x378 contains 1 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000601018  000100000007 R_X86_64_JUMP_SLO 0000000000000000 __libc_start_main@GLIBC_2.2.5 + 0

编译成可执行文件之后,我们发现 sum 的符号没有了

a.out : 可执行文件
00000000004004cd <main>:
  4004cd:	55                   	push   %rbp
  4004ce:	48 89 e5             	mov    %rsp,%rbp
  4004d1:	48 83 ec 10          	sub    $0x10,%rsp
  4004d5:	be 02 00 00 00       	mov    $0x2,%esi
  4004da:	bf 01 00 00 00       	mov    $0x1,%edi
  4004df:	b8 00 00 00 00       	mov    $0x0,%eax
  4004e4:	e8 0a 00 00 00       	callq  4004f3 <sum> // 经过链接后变为地址
  4004e9:	89 45 fc             	mov    %eax,-0x4(%rbp)
  4004ec:	b8 01 00 00 00       	mov    $0x1,%eax
  4004f1:	c9                   	leaveq 
  4004f2:	c3                   	retq   

test.o 文件 可链接文件
0000000000000000 <main>:
   0:	55                   	push   %rbp
   1:	48 89 e5             	mov    %rsp,%rbp
   4:	48 83 ec 10          	sub    $0x10,%rsp
   8:	be 02 00 00 00       	mov    $0x2,%esi
   d:	bf 01 00 00 00       	mov    $0x1,%edi
  12:	b8 00 00 00 00       	mov    $0x0,%eax
  17:	e8 00 00 00 00       	callq  1c <main+0x1c>  // 这是sum函数
  1c:	89 45 fc             	mov    %eax,-0x4(%rbp)
  1f:	b8 01 00 00 00       	mov    $0x1,%eax
  24:	c9                   	leaveq 
  25:	c3                   	retq   

静态链接的时候会在编译的时候就将地址重定位。就需要将所有的文件都聚合起来。在更换机器的之后直接可以运行起来。

如果我们有两个相同的程序,都使用相同的代码,这就会导致内存中有亢余代码,那我们想着是不是可以两个程序公用以一段代码。所以就出现了动态链接器。

因为两个程序使用的是虚拟地址,而并不是物理地址,所以公有代码对于两个程序来说,需要的是虚拟地址,而不能是物理地址。如何获取到公有代码的地址端,根据当前执行的指令地址再去加上偏移量得到公有代码的物理地址。每个程序对应的偏移量都不一致,是根据指令计算而来的。

gcc -shared -o libmysum.so sum.c

gcc data.o libmysum.so 编译

./a.out 运行,发现报错,当前运行时库,找不到。

./a.out: error while loading shared libraries: libmysum.so: cannot open shared object file: No such file or directory

编译时,-fPIC 代码与地址无关。

/usr/bin/ld: /tmp/cceZUIwy.o: relocation R_X86_64_32 against `.rodata' can not be used when making a shared object; recompile with -fPIC

/usr/bin/ld: final link failed: Nonrepresentable section on output

collect2: error: ld returned 1 exit status

这里为了让他找到这个库,需要将他加入到环境变量中。

LD_LIBRARY_PATH

程序加载运行期间查找动态链接库的路径

LIBRARY_PATH

程序编译期间查找动态链接库时的路径

export LD_LIBRARY_PATH=.:$LD_LIBRARY_PATH。.表示当前文件目录下。

这样就可以运行了

ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x530
  Start of program headers:          64 (bytes into file)
  Start of section headers:          6128 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         7
  Size of section headers:           64 (bytes)
  Number of section headers:         27
  Section header string table index: 26

Relocation section '.rela.dyn' at offset 0x430 contains 8 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000200e28  000000000008 R_X86_64_RELATIVE                    5e0
000000200e30  000000000008 R_X86_64_RELATIVE                    5a0
000000200e40  000000000008 R_X86_64_RELATIVE                    200e40
000000200fd8  000100000006 R_X86_64_GLOB_DAT 0000000000000000 _ITM_deregisterTMClone + 0
000000200fe0  000200000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0
000000200fe8  000300000006 R_X86_64_GLOB_DAT 0000000000000000 _Jv_RegisterClasses + 0
000000200ff0  000400000006 R_X86_64_GLOB_DAT 0000000000000000 _ITM_registerTMCloneTa + 0
000000200ff8  000500000006 R_X86_64_GLOB_DAT 0000000000000000 __cxa_finalize@GLIBC_2.2.5 + 0

The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported.

Symbol table '.dynsym' contains 12 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _ITM_deregisterTMCloneTab
     2: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __gmon_start__
     3: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _Jv_RegisterClasses
     4: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _ITM_registerTMCloneTable
     5: 0000000000000000     0 FUNC    WEAK   DEFAULT  UND __cxa_finalize@GLIBC_2.2.5 (2)
     6: 0000000000201018     0 NOTYPE  GLOBAL DEFAULT   21 _edata
     7: 0000000000201020     0 NOTYPE  GLOBAL DEFAULT   22 _end
     8: 0000000000000615    20 FUNC    GLOBAL DEFAULT   11 sum
     9: 0000000000201018     0 NOTYPE  GLOBAL DEFAULT   22 __bss_start
    10: 00000000000004f0     0 FUNC    GLOBAL DEFAULT    8 _init
    11: 000000000000062c     0 FUNC    GLOBAL DEFAULT   12 _fini

现在我们看到

ELF有三种文件类型

  1. 可执行文件
  2. 可链接文件
  3. 动态可链接文件

readelf -a a.out

Relocation section '.rela.plt' at offset 0x480 contains 3 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000601018  000100000007 R_X86_64_JUMP_SLO 0000000000000000 printf@GLIBC_2.2.5 + 0
000000601020  000200000007 R_X86_64_JUMP_SLO 0000000000000000 __libc_start_main@GLIBC_2.2.5 + 0
000000601028  000400000007 R_X86_64_JUMP_SLO 0000000000000000 sum + 0
extern int global_data;
static int static_data = 33;
int data = 11;

extern int global_sum(int a, int b);
static int static_sum(int a, int b) {
  return a + b + global_data + static_data + data;
};

int sum(int a, int b) {
  return a + b + global_data + static_data + data;
};

int main() {
  printf("%d", global_data);
  printf("%d", static_data);
  printf("%d", data);
  global_sum(1, 2);
  static_sum(1, 2);
  sum(1, 2);
  return 1;
}

我们发现动态链接器和静态链接器的项不一样。

静态链接器生成的表项是 .rela.text 动态链接器生成的表项是 .rela.dyn

静态链接器找到这些表项的偏移量即可计算出而地址,所以他可以写死,而动态链接器却不行,因为在两个程序操作系统给分配的地址不一样,虽然都叫同一个变量名,但是这个数据是共享的,如果我将这个地址写死,那么怎么去知道这个数据现在归谁去操作,有安全性问题。

我们找到 200fd0 的地址

我们发现他使用了 IP 寄存器(指令指针寄存器)+ 一个偏移地址 从而获取到了这个数据。而相对这个数据是谁放进去的?动态链接器


每个程序都会给分配内存,那么我给整个内存分配的时候预先准备一张表,如果程序需要访问这个数据,你先掉用动态链接器,我给你将这个地址填写上去。

现在我们发现了他是使用偏移量做的相对地址,() 解引用,将 rip 寄存器中存放的地址取出来。

然后我我们找到这个地址,发现他是 GOT 这是一个表项。

动态链接器会将数据的真实的地址存放在GOT中,函数执行到当前的指令解引用就会将其真实地址获取出来,查表,然后即可访问的数据。

现在访问数据的问题解决了。

现在来解决函数的问题。

函数因为是懒加载的方式,在程序运行的时候,使用当前函数的时候才会加载进内存中,那么也就无法使用偏移的方式做函数的动态加载,因为函数没有加载,没法预先知道他的偏移量在哪里。

我们观察到这里调用了一个 PLT 的东西。跳转到相应的地方。640 650 660我们去找到这个地址。

我们发现有这样的表项,640

然后跳转到

又是相同的做法,这次叫做 got.plt

然后将 0x0 压入栈中。这个啥含义呢?

我们发现这里有个表项,三个,从0 - 2 对应的是数组下标。

然后继续跳转到 630 而这三个函数都一起跳转到 同一个地址,这个代码其实就是动态链接库的函数的地址,使用这个函数,这个函数的功能就是 将操作系统懒加载的分配的地址写入到指定的位置上去。

然后这个函数的地址就找到了。完成调用

我们来总结一下:

GOT:当访问的是数据的时候使用 GOT
PLT:当访问的是函数的时候使用 PIT + GOT

我们来梳理流程。

访问变量时:

  1. 因为变量定义在了代码段中,代码段又是只读不可写的缘故,操作系统在分配内存的时候,会开辟一个表,这地方需要是可读可写的地方,这个表叫做GOT。
  2. 这个表存放的位置使用相对偏移,起始地址就是当前函数的地址。然后找到表项。
  3. 而动态链接库会将操作系统对需要访问的数据的真实地址存放到相应的表项中。

访问函数时:

  1. 因为函数是在运行时才会被加载进来,所以操作系统无法对函数使用相对偏移去获取地址。这时操作系统会创建一个表项这个表项是 PLT 表。和一个GOT表。
  2. 代码执行跳转到指定的地址上去,然后去执行 PLT 中代码,然后去获取函数的地址 (GOT表),如果当前这个地址中有函数的地址,就不往下执行了,如果当前地址中没有函数的地址,那么继续向下
  3. 将表项的 index 放入栈中,然后去调用 动态链接库的函数。
  4. 动态链接库会去加载操作系统给函数分配的地址,而动态链接库的函数并将这个地址放到指定的位置上去(GOT表),这样下次访问的时候就不需要再次调用 动态链接库的函数。

这个时候我们来看一个 ELF 的文档。我们来对照一个程序来看。

#include<stdio.h>
int main() {
  printf("%s", "hello,world");
  return 1;
}
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x400430 // 程序的入口
  Start of program headers:          64 (bytes into file)
  Start of section headers:          6456 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         9
  Size of section headers:           64 (bytes)
  Number of section headers:         31
  Section header string table index: 30


Elf file type is EXEC (Executable file)
Entry point 0x400430
There are 9 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000400040 0x0000000000400040
                 0x00000000000001f8 0x00000000000001f8  R E    8
  INTERP         0x0000000000000238 0x0000000000400238 0x0000000000400238
                 0x000000000000001c 0x000000000000001c  R      1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x000000000000070c 0x000000000000070c  R E    200000
  LOAD           0x0000000000000e10 0x0000000000600e10 0x0000000000600e10
                 0x000000000000021c 0x0000000000000220  RW     200000
  DYNAMIC        0x0000000000000e28 0x0000000000600e28 0x0000000000600e28
                 0x00000000000001d0 0x00000000000001d0  RW     8
  NOTE           0x0000000000000254 0x0000000000400254 0x0000000000400254
                 0x0000000000000044 0x0000000000000044  R      4
  GNU_EH_FRAME   0x00000000000005e0 0x00000000004005e0 0x00000000004005e0
                 0x0000000000000034 0x0000000000000034  R      4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     10
  GNU_RELRO      0x0000000000000e10 0x0000000000600e10 0x0000000000600e10
                 0x00000000000001f0 0x00000000000001f0  R      1

 Section to Segment mapping:
  Segment Sections...
   00     
   01     .interp 
   02     .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .plt.got .text .fini .rodata .eh_frame_hdr .eh_frame 
   03     .init_array .fini_array .jcr .dynamic .got .got.plt .data .bss 
   04     .dynamic 
   05     .note.ABI-tag .note.gnu.build-id 
   06     .eh_frame_hdr 
   07     
   08     .init_array .fini_array .jcr .dynamic .got 

对比文档我们看到

代码的入口是

Entry point address: 0x400430 // 程序的入口

LOAD 0x0000000000000000 0x0000000000400000

为什么程序的起始地址在 0x0000000000400000

然后根据地址顺序排序:

那么当前代码的入口函数在哪里?

main 函数地址在 40051d 很明显这里不是起始地址。

Entry point address: 0x400430 // 代码的起始地址在这里

得出结论,这段代码是链接器生成的,而 mian 是一个回调函数。

/* This is the canonical entry point, usually the first thing in the text
   segment.  The SVR4/i386 ABI (pages 3-31, 3-32) says that when the entry
   point runs, most registers' values are unspecified, except for:

   %edx		Contains a function pointer to be registered with `atexit'.
		This is how the dynamic linker arranges to have DT_FINI
		functions called for shared libraries that have been loaded
		before this code runs.

   %esp		The stack contains the arguments and environment:  
   // main 函数入参
		0(%esp)			argc
		4(%esp)			argv[0]
		...
		(4*argc)(%esp)		NULL
		(4*(argc+1))(%esp)	envp[0]
		...
					NULL
*/

SVR4/i386 ABI (pages 3-31, 3-32)

ABI 的文档

当调用时,首先执行 .init , 当函数执行完成之后 调用 .finit

// 
execve("./a.out", ["./a.out"], 0x7fff77b31400 /* 25 vars */) = 0
brk(NULL)                               = 0x76f000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4d5535a000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=22025, ...}) = 0
mmap(NULL, 22025, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f4d55354000
close(3)                                = 0
open("/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0`&\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=2156592, ...}) = 0
mmap(NULL, 3985920, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f4d54d6c000
mprotect(0x7f4d54f30000, 2093056, PROT_NONE) = 0
mmap(0x7f4d5512f000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1c3000) = 0x7f4d5512f000
mmap(0x7f4d55135000, 16896, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f4d55135000
close(3)                                = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4d55353000
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4d55351000
arch_prctl(ARCH_SET_FS, 0x7f4d55351740) = 0
access("/etc/sysconfig/strcasecmp-nonascii", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/sysconfig/strcasecmp-nonascii", F_OK) = -1 ENOENT (No such file or directory)
mprotect(0x7f4d5512f000, 16384, PROT_READ) = 0
mprotect(0x600000, 4096, PROT_READ)     = 0
mprotect(0x7f4d5535b000, 4096, PROT_READ) = 0
munmap(0x7f4d55354000, 22025)           = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4d55359000
write(1, "./a.out", 7./a.out)                  = 7
exit_group(1)                           = ?
+++ exited with 1 +++
#include<stdio.h>

int main(int agrc, int **argv) {
  printf("%s", argv[0]);
  return 1;
}

我们就来看一个 gcc 的源码

sysdeps\sh\start.S

csu\libc-start.c

STATIC int
LIBC_START_MAIN (int (*main) (int, char **, char ** MAIN_AUXVEC_DECL),
		 int argc, char **argv,
#ifdef LIBC_START_MAIN_AUXVEC_ARG
		 ElfW(auxv_t) *auxvec,
#endif   // init 函数定义        
		 __typeof (main) init,
		 void (*fini) (void),
		 void (*rtld_fini) (void), void *stack_end)
{
#ifndef SHARED
  char **ev = &argv[argc + 1];

  __environ = ev;

  /* Store the lowest stack address.  This is done in ld.so if this is
     the code for the DSO.  */
  __libc_stack_end = stack_end;

# ifdef HAVE_AUX_VECTOR
  /* First process the auxiliary vector since we need to find the
     program header to locate an eventually present PT_TLS entry.  */
#  ifndef LIBC_START_MAIN_AUXVEC_ARG
  ElfW(auxv_t) *auxvec;
  {
    char **evp = ev;
    while (*evp++ != NULL)
      ;
    auxvec = (ElfW(auxv_t) *) evp;
  }
#  endif
  _dl_aux_init (auxvec);
  if (GL(dl_phdr) == NULL)
# endif
    {
      /* Starting from binutils-2.23, the linker will define the
         magic symbol __ehdr_start to point to our own ELF header
         if it is visible in a segment that also includes the phdrs.
         So we can set up _dl_phdr and _dl_phnum even without any
         information from auxv.  */

      extern const ElfW(Ehdr) __ehdr_start
# if BUILD_PIE_DEFAULT
	__attribute__ ((visibility ("hidden")));
# else
	__attribute__ ((weak, visibility ("hidden")));
      if (&__ehdr_start != NULL)
# endif
        {
          assert (__ehdr_start.e_phentsize == sizeof *GL(dl_phdr));
          GL(dl_phdr) = (const void *) &__ehdr_start + __ehdr_start.e_phoff;
          GL(dl_phnum) = __ehdr_start.e_phnum;
        }
    }

  __tunables_init (__environ);

  ARCH_INIT_CPU_FEATURES ();

  /* Do static pie self relocation after tunables and cpu features
     are setup for ifunc resolvers. Before this point relocations
     must be avoided.  */
  _dl_relocate_static_pie ();

  /* Perform IREL{,A} relocations.  */
  ARCH_SETUP_IREL ();

  /* The stack guard goes into the TCB, so initialize it early.  */
  ARCH_SETUP_TLS ();

  /* In some architectures, IREL{,A} relocations happen after TLS setup in
     order to let IFUNC resolvers benefit from TCB information, e.g. powerpc's
     hwcap and platform fields available in the TCB.  */
  ARCH_APPLY_IREL ();

  /* Set up the stack checker's canary.  */
  uintptr_t stack_chk_guard = _dl_setup_stack_chk_guard (_dl_random);
# ifdef THREAD_SET_STACK_GUARD
  THREAD_SET_STACK_GUARD (stack_chk_guard);
# else
  __stack_chk_guard = stack_chk_guard;
# endif

  /* Initialize libpthread if linked in.  */
  if (__pthread_initialize_minimal != NULL)
    __pthread_initialize_minimal ();

  /* Set up the pointer guard value.  */
  uintptr_t pointer_chk_guard = _dl_setup_pointer_guard (_dl_random,
							 stack_chk_guard);
# ifdef THREAD_SET_POINTER_GUARD
  THREAD_SET_POINTER_GUARD (pointer_chk_guard);
# else
  __pointer_chk_guard_local = pointer_chk_guard;
# endif

#endif /* !SHARED  */

  /* Register the destructor of the dynamic linker if there is any.  */
  if (__glibc_likely (rtld_fini != NULL))
    __cxa_atexit ((void (*) (void *)) rtld_fini, NULL, NULL);

#ifndef SHARED
  /* Perform early initialization.  In the shared case, this function
     is called from the dynamic loader as early as possible.  */
  __libc_early_init (true);

  /* Call the initializer of the libc.  This is only needed here if we
     are compiling for the static library in which case we haven't
     run the constructors in `_dl_start_user'.  */
  __libc_init_first (argc, argv, __environ);

  /* Register the destructor of the statically-linked program.  */
  __cxa_atexit (call_fini, NULL, NULL);

  /* Some security at this point.  Prevent starting a SUID binary where
     the standard file descriptors are not opened.  We have to do this
     only for statically linked applications since otherwise the dynamic
     loader did the work already.  */
  if (__builtin_expect (__libc_enable_secure, 0))
    __libc_check_standard_fds ();
#endif /* !SHARED */

  /* Call the initializer of the program, if any.  */
#ifdef SHARED
  if (__builtin_expect (GLRO(dl_debug_mask) & DL_DEBUG_IMPCALLS, 0))
    GLRO(dl_debug_printf) ("\ninitialize program: %s\n\n", argv[0]);

  if (init != NULL)
    /* This is a legacy program which supplied its own init
       routine.  */
     // 执行 init 函数 
    (*init) (argc, argv, __environ MAIN_AUXVEC_PARAM);
  else
    /* This is a current program.  Use the dynamic segment to find
       constructors.  */
    call_init (argc, argv, __environ);

  /* Auditing checkpoint: we have a new object.  */
  _dl_audit_preinit (GL(dl_ns)[LM_ID_BASE]._ns_loaded);

  if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_IMPCALLS))
    GLRO(dl_debug_printf) ("\ntransferring control: %s\n\n", argv[0]);
#else /* !SHARED */
  call_init (argc, argv, __environ);

  _dl_debug_initialize (0, LM_ID_BASE);
#endif

  __libc_start_call_main (main, argc, argv MAIN_AUXVEC_PARAM);
}



__libc_start_call_main (int (*main) (int, char **, char ** MAIN_AUXVEC_DECL),
                        int argc, char **argv
#ifdef LIBC_START_MAIN_AUXVEC_ARG
                            , ElfW(auxv_t) *auxvec
#endif
                        )
{
  int result;

  /* Memory for the cancellation buffer.  */
  struct pthread_unwind_buf unwind_buf;

  int not_first_call;
  DIAG_PUSH_NEEDS_COMMENT;
#if __GNUC_PREREQ (7, 0)
  /* This call results in a -Wstringop-overflow warning because struct
     pthread_unwind_buf is smaller than jmp_buf.  setjmp and longjmp
     do not use anything beyond the common prefix (they never access
     the saved signal mask), so that is a false positive.  */
  DIAG_IGNORE_NEEDS_COMMENT (11, "-Wstringop-overflow=");
#endif
  not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf);
  DIAG_POP_NEEDS_COMMENT;
  if (__glibc_likely (! not_first_call))
    {
      struct pthread *self = THREAD_SELF;

      /* Store old info.  */
      unwind_buf.priv.data.prev = THREAD_GETMEM (self, cleanup_jmp_buf);
      unwind_buf.priv.data.cleanup = THREAD_GETMEM (self, cleanup);

      /* Store the new cleanup handler info.  */
      THREAD_SETMEM (self, cleanup_jmp_buf, &unwind_buf);

      /* Run the program.  */
      // 运行 main 函数
      result = main (argc, argv, __environ MAIN_AUXVEC_PARAM);
    }
  else
    {
      /* Remove the thread-local data.  */
      __nptl_deallocate_tsd ();

      /* One less thread.  Decrement the counter.  If it is zero we
         terminate the entire process.  */
      result = 0;
      if (! atomic_decrement_and_test (&__nptl_nthreads))
        /* Not much left to do but to exit the thread, not the process.  */
	while (1)
	  INTERNAL_SYSCALL_CALL (exit, 0);
    }

  exit (result);
}

ELF 规范了一个程序的内存布局,操作系统只需要根据ELF规划,加载即可。

然后加载到入口地址,开始执行第一个函数 _start 然后去加载动态链接库中的 __libc_start_main 函数,然后在__libc_start_main 函数中调用了_init 函数,在_init 函数中又继续调用 __gmon_start__,而在 __libc_start_main中也调用了 main 函数

而我们看 JVM 虚拟机是怎么样执行过程。

首先通过 javac 生成 IR 中间语言,然后将中间语言 IR 编译成 机器语言,先 IR 想到机器语言去运行,需要一个链接器 而 JVM 就是干这个事的。而 JVM 是一个动态链接库,他需要运行,则需要一个程序去加载他,那么 java 就是这个程序

那我们就来验证一下。使用 JDK源码

/*
 * Entry point.
 */
int
JLI_Launch(int argc, char ** argv,              /* main argc, argc */
        int jargc, const char** jargv,          /* java args */
        int appclassc, const char** appclassv,  /* app classpath */
        const char* fullversion,                /* full version defined */
        const char* dotversion,                 /* dot version defined */
        const char* pname,                      /* program name */
        const char* lname,                      /* launcher name */
        jboolean javaargs,                      /* JAVA_ARGS */
        jboolean cpwildcard,                    /* classpath wildcard*/
        jboolean javaw,                         /* windows-only javaw */
        jint ergo                               /* ergonomics class policy */
)
{
    int mode = LM_UNKNOWN;
    char *what = NULL;
    char *cpath = 0;
    char *main_class = NULL;
    int ret;
    InvocationFunctions ifn;
    jlong start, end;
    char jvmpath[MAXPATHLEN];
    char jrepath[MAXPATHLEN];
    char jvmcfg[MAXPATHLEN];

    _fVersion = fullversion;
    _dVersion = dotversion;
    _launcher_name = lname;
    _program_name = pname;
    _is_java_args = javaargs;
    _wc_enabled = cpwildcard;
    _ergo_policy = ergo;

    InitLauncher(javaw);
    DumpState();
    if (JLI_IsTraceLauncher()) {
        int i;
        printf("Command line args:\n");
        for (i = 0; i < argc ; i++) {
            printf("argv[%d] = %s\n", i, argv[i]);
        }
        AddOption("-Dsun.java.launcher.diag=true", NULL);
    }

    /*
     * Make sure the specified version of the JRE is running.
     *
     * There are three things to note about the SelectVersion() routine:
     *  1) If the version running isn't correct, this routine doesn't
     *     return (either the correct version has been exec'd or an error
     *     was issued).
     *  2) Argc and Argv in this scope are *not* altered by this routine.
     *     It is the responsibility of subsequent code to ignore the
     *     arguments handled by this routine.
     *  3) As a side-effect, the variable "main_class" is guaranteed to
     *     be set (if it should ever be set).  This isn't exactly the
     *     poster child for structured programming, but it is a small
     *     price to pay for not processing a jar file operand twice.
     *     (Note: This side effect has been disabled.  See comment on
     *     bugid 5030265 below.)
     */
    SelectVersion(argc, argv, &main_class);

    CreateExecutionEnvironment(&argc, &argv,
                               jrepath, sizeof(jrepath),
                               jvmpath, sizeof(jvmpath),
                               jvmcfg,  sizeof(jvmcfg));

    ifn.CreateJavaVM = 0;
    ifn.GetDefaultJavaVMInitArgs = 0;

    if (JLI_IsTraceLauncher()) {
        start = CounterGet();
    }
    // 加载JVM 
    if (!LoadJavaVM(jvmpath, &ifn)) {
        return(6);
    }

    if (JLI_IsTraceLauncher()) {
        end   = CounterGet();
    }

    JLI_TraceLauncher("%ld micro seconds to LoadJavaVM\n",
             (long)(jint)Counter2Micros(end-start));

    ++argv;
    --argc;

    if (IsJavaArgs()) {
        /* Preprocess wrapper arguments */
        TranslateApplicationArgs(jargc, jargv, &argc, &argv);
        if (!AddApplicationOptions(appclassc, appclassv)) {
            return(1);
        }
    } else {
        /* Set default CLASSPATH */
        cpath = getenv("CLASSPATH");
        if (cpath == NULL) {
            cpath = ".";
        }
        SetClassPath(cpath);
    }

    /* Parse command line options; if the return value of
     * ParseArguments is false, the program should exit.
     */
    if (!ParseArguments(&argc, &argv, &mode, &what, &ret, jrepath))
    {
        return(ret);
    }

    /* Override class path if -jar flag was specified */
    if (mode == LM_JAR) {
        SetClassPath(what);     /* Override class path */
    }

    /* set the -Dsun.java.command pseudo property */
    SetJavaCommandLineProp(what, argc, argv);

    /* Set the -Dsun.java.launcher pseudo property */
    SetJavaLauncherProp();

    /* set the -Dsun.java.launcher.* platform properties */
    SetJavaLauncherPlatformProps();

    return JVMInit(&ifn, threadStackSize, argc, argv, mode, what, ret);
}

int
JVMInit(InvocationFunctions* ifn, jlong threadStackSize,
        int argc, char **argv,
        int mode, char *what, int ret)
{
    ShowSplashScreen();
    return ContinueInNewThread(ifn, threadStackSize, argc, argv, mode, what, ret);
}

int
ContinueInNewThread(InvocationFunctions* ifn, jlong threadStackSize,
                    int argc, char **argv,
                    int mode, char *what, int ret)
{

    /*
     * If user doesn't specify stack size, check if VM has a preference.
     * Note that HotSpot no longer supports JNI_VERSION_1_1 but it will
     * return its default stack size through the init args structure.
     */
    if (threadStackSize == 0) {
      struct JDK1_1InitArgs args1_1;
      memset((void*)&args1_1, 0, sizeof(args1_1));
      args1_1.version = JNI_VERSION_1_1;
      ifn->GetDefaultJavaVMInitArgs(&args1_1);  /* ignore return value */
      if (args1_1.javaStackSize > 0) {
         threadStackSize = args1_1.javaStackSize;
      }
    }

    { /* Create a new thread to create JVM and invoke main method */
      JavaMainArgs args;
      int rslt;

      args.argc = argc;
      args.argv = argv;
      args.mode = mode;
      args.what = what;
      args.ifn = *ifn;
      // JavaMain 入口函数
      rslt = ContinueInNewThread0(JavaMain, threadStackSize, (void*)&args);
      /* If the caller has deemed there is an error we
       * simply return that, otherwise we return the value of
       * the callee
       */
      return (ret != 0) ? ret : rslt;
    }
}

jboolean
LoadJavaVM(const char *jvmpath, InvocationFunctions *ifn)
{
    void *libjvm;

    JLI_TraceLauncher("JVM path is %s\n", jvmpath);

    libjvm = dlopen(jvmpath, RTLD_NOW + RTLD_GLOBAL);
    if (libjvm == NULL) {
#if defined(__solaris__) && defined(__sparc) && !defined(_LP64) /* i.e. 32-bit sparc */
      FILE * fp;
      Elf32_Ehdr elf_head;
      int count;
      int location;
      // 打开了动态链接库
      fp = fopen(jvmpath, "r");
      if (fp == NULL) {
        JLI_ReportErrorMessage(DLL_ERROR2, jvmpath, dlerror());
        return JNI_FALSE;
      }

      /* read in elf header */
      count = fread((void*)(&elf_head), sizeof(Elf32_Ehdr), 1, fp);
      fclose(fp);
      if (count < 1) {
        JLI_ReportErrorMessage(DLL_ERROR2, jvmpath, dlerror());
        return JNI_FALSE;
      }

      /*
       * Check for running a server vm (compiled with -xarch=v8plus)
       * on a stock v8 processor.  In this case, the machine type in
       * the elf header would not be included the architecture list
       * provided by the isalist command, which is turn is gotten from
       * sysinfo.  This case cannot occur on 64-bit hardware and thus
       * does not have to be checked for in binaries with an LP64 data
       * model.
       */
      if (elf_head.e_machine == EM_SPARC32PLUS) {
        char buf[257];  /* recommended buffer size from sysinfo man
                           page */
        long length;
        char* location;

        length = sysinfo(SI_ISALIST, buf, 257);
        if (length > 0) {
            location = JLI_StrStr(buf, "sparcv8plus ");
          if (location == NULL) {
            JLI_ReportErrorMessage(JVM_ERROR3);
            return JNI_FALSE;
          }
        }
      }
#endif
        JLI_ReportErrorMessage(DLL_ERROR1, __LINE__);
        JLI_ReportErrorMessage(DLL_ERROR2, jvmpath, dlerror());
        return JNI_FALSE;
    }

    // 动态接连这个 JNI_CreateJavaVM 这个函数 然后执行
    ifn->CreateJavaVM = (CreateJavaVM_t)
        dlsym(libjvm, "JNI_CreateJavaVM");
    if (ifn->CreateJavaVM == NULL) {
        JLI_ReportErrorMessage(DLL_ERROR2, jvmpath, dlerror());
        return JNI_FALSE;
    }

    ifn->GetDefaultJavaVMInitArgs = (GetDefaultJavaVMInitArgs_t)
        dlsym(libjvm, "JNI_GetDefaultJavaVMInitArgs");
    if (ifn->GetDefaultJavaVMInitArgs == NULL) {
        JLI_ReportErrorMessage(DLL_ERROR2, jvmpath, dlerror());
        return JNI_FALSE;
    }

    ifn->GetCreatedJavaVMs = (GetCreatedJavaVMs_t)
        dlsym(libjvm, "JNI_GetCreatedJavaVMs");
    if (ifn->GetCreatedJavaVMs == NULL) {
        JLI_ReportErrorMessage(DLL_ERROR2, jvmpath, dlerror());
        return JNI_FALSE;
    }

    return JNI_TRUE;
}

int JNICALL
JavaMain(void * _args)
{
    JavaMainArgs *args = (JavaMainArgs *)_args;
    int argc = args->argc;
    char **argv = args->argv;
    int mode = args->mode;
    char *what = args->what;
    InvocationFunctions ifn = args->ifn;

    JavaVM *vm = 0;
    JNIEnv *env = 0;
    jclass mainClass = NULL;
    jclass appClass = NULL; // actual application class being launched
    jmethodID mainID;
    jobjectArray mainArgs;
    int ret = 0;
    jlong start, end;

    RegisterThread();

    /* Initialize the virtual machine */
    start = CounterGet();
    if (!InitializeJVM(&vm, &env, &ifn)) {
        JLI_ReportErrorMessage(JVM_ERROR1);
        exit(1);
    }

    if (showSettings != NULL) {
        ShowSettings(env, showSettings);
        CHECK_EXCEPTION_LEAVE(1);
    }

    if (printVersion || showVersion) {
        PrintJavaVersion(env, showVersion);
        CHECK_EXCEPTION_LEAVE(0);
        if (printVersion) {
            LEAVE();
        }
    }

    /* If the user specified neither a class name nor a JAR file */
    if (printXUsage || printUsage || what == 0 || mode == LM_UNKNOWN) {
        PrintUsage(env, printXUsage);
        CHECK_EXCEPTION_LEAVE(1);
        LEAVE();
    }

    FreeKnownVMs();  /* after last possible PrintUsage() */

    if (JLI_IsTraceLauncher()) {
        end = CounterGet();
        JLI_TraceLauncher("%ld micro seconds to InitializeJVM\n",
               (long)(jint)Counter2Micros(end-start));
    }

    /* At this stage, argc/argv have the application's arguments */
    if (JLI_IsTraceLauncher()){
        int i;
        printf("%s is '%s'\n", launchModeNames[mode], what);
        printf("App's argc is %d\n", argc);
        for (i=0; i < argc; i++) {
            printf("    argv[%2d] = '%s'\n", i, argv[i]);
        }
    }

    ret = 1;

    /*
     * Get the application's main class.
     *
     * See bugid 5030265.  The Main-Class name has already been parsed
     * from the manifest, but not parsed properly for UTF-8 support.
     * Hence the code here ignores the value previously extracted and
     * uses the pre-existing code to reextract the value.  This is
     * possibly an end of release cycle expedient.  However, it has
     * also been discovered that passing some character sets through
     * the environment has "strange" behavior on some variants of
     * Windows.  Hence, maybe the manifest parsing code local to the
     * launcher should never be enhanced.
     *
     * Hence, future work should either:
     *     1)   Correct the local parsing code and verify that the
     *          Main-Class attribute gets properly passed through
     *          all environments,
     *     2)   Remove the vestages of maintaining main_class through
     *          the environment (and remove these comments).
     *
     * This method also correctly handles launching existing JavaFX
     * applications that may or may not have a Main-Class manifest entry.
     */
    mainClass = LoadMainClass(env, mode, what);
    CHECK_EXCEPTION_NULL_LEAVE(mainClass);
    /*
     * In some cases when launching an application that needs a helper, e.g., a
     * JavaFX application with no main method, the mainClass will not be the
     * applications own main class but rather a helper class. To keep things
     * consistent in the UI we need to track and report the application main class.
     */
    appClass = GetApplicationClass(env);
    NULL_CHECK_RETURN_VALUE(appClass, -1);
    /*
     * PostJVMInit uses the class name as the application name for GUI purposes,
     * for example, on OSX this sets the application name in the menu bar for
     * both SWT and JavaFX. So we'll pass the actual application class here
     * instead of mainClass as that may be a launcher or helper class instead
     * of the application class.
     */
    PostJVMInit(env, appClass, vm);
    /*
     * The LoadMainClass not only loads the main class, it will also ensure
     * that the main method's signature is correct, therefore further checking
     * is not required. The main method is invoked here so that extraneous java
     * stacks are not in the application stack trace.
     */
    mainID = (*env)->GetStaticMethodID(env, mainClass, "main",
                                       "([Ljava/lang/String;)V");
    CHECK_EXCEPTION_NULL_LEAVE(mainID);

    /* Build platform specific argument array */
    mainArgs = CreateApplicationArgs(env, argv, argc);
    CHECK_EXCEPTION_NULL_LEAVE(mainArgs);

    /* Invoke main method. */
    (*env)->CallStaticVoidMethod(env, mainClass, mainID, mainArgs);

    /*
     * The launcher's exit code (in the absence of calls to
     * System.exit) will be non-zero if main threw an exception.
     */
    ret = (*env)->ExceptionOccurred(env) == NULL ? 0 : 1;
    LEAVE();
}

/*

/*
 * Initializes the Java Virtual Machine. Also frees options array when
 * finished.
 */
static jboolean
InitializeJVM(JavaVM **pvm, JNIEnv **penv, InvocationFunctions *ifn)
{
    JavaVMInitArgs args;
    jint r;

    memset(&args, 0, sizeof(args));
    args.version  = JNI_VERSION_1_2;
    args.nOptions = numOptions;
    args.options  = options;
    args.ignoreUnrecognized = JNI_FALSE;

    if (JLI_IsTraceLauncher()) {
        int i = 0;
        printf("JavaVM args:\n    ");
        printf("version 0x%08lx, ", (long)args.version);
        printf("ignoreUnrecognized is %s, ",
               args.ignoreUnrecognized ? "JNI_TRUE" : "JNI_FALSE");
        printf("nOptions is %ld\n", (long)args.nOptions);
        for (i = 0; i < numOptions; i++)
            printf("    option[%2d] = '%s'\n",
                   i, args.options[i].optionString);
    }

    r = ifn->CreateJavaVM(pvm, (void **)penv, &args);
    JLI_MemFree(options);
    return r == JNI_OK;
}

static jclass helperClass = NULL;

地址:

000000000085d4e0 动态连接库的偏移量

0x7ffff66ee4e0 CreateJavaVM 运行时的地址

0x7ffff66ee4e0L - 0x000000000085d4e0L = 7ffff5e91000

pmap -x 5018 使用 pmap 获取到真实的JVM地址

运行时的地址 - 偏移量 = 起始地址

00007ffff5e91000:

readelf --dyn-syms /home/muliao/software/openjdk-jdk8-b118/build/linux-x86_64-normal-server-slowdebug/jdk/lib/amd64/server/libjvm.so | grep JNI_CreateJavaVM

459: 000000000085d4e0   530 FUNC    GLOBAL DEFAULT   14 JNI_CreateJavaVM@@SUNWprivate_1.1

堆内存分配:

如果一个函数需要调用另一个函数的返回值,但是这个返回值的是个函数内部的一个地址,如果函数被释放,这地址中的值,就是脏数据,如果当前函数去接受这个地址取数据,就有可能取到一些莫名其妙的数据,这个也称为野指针。这个时候就需要堆内存分配。

如果有两个函数,需要共享一个数据,这个时候就需要去堆内存中分配空间。

堆内存分配

第一种:使用栈的方式分配内存,如果需要内存则推高地址即可。

如果想要释放空间,则地址之下的内存如何释放。回退还是?

第二种:使用剩余空间进行分配,就会出现内存碎片的问题,还需要维护内存分配映射。

这个时候我们就可以使用两种方式的结合。

如果使用的大内存使用直接空余空间分配,如果使用的是小内存则使用推高地址分配内存。

;