ELF format notes
Summary
- Workflow: the compiler emits .asm assembly, the nasm assembler turns each .asm file into a .o object file, and the ld linker then links the .o files into an ELF executable
- https://blog.csdn.net/ZCShouCSDN/article/details/100048461
- ELF files can also be produced by a Rust toolchain target such as riscv32imac-unknown-none-elf
- https://juejin.im/entry/5c0d23b35188253b7e7480db
- https://tech.meituan.com/2015/01/22/linker.html
- https://www.bilibili.com/video/BV1e54y1d74j?from=search&seid=10838174923397315020
- https://github.com/ruslashev/elfcat
- https://github.com/m4b/goblin
- Good: https://github.com/gz/rust-elfloader
- Linker: https://github.com/cisen/blog/issues/1127
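- BIN files
  - A BIN file is a memory image built by extracting the code segment, the data segment, and any custom segments from an ELF file.
  - It is a raw binary file: it contains only the machine code and data, with no metadata.
  - It carries no address information. A device programmer usually writes it starting at address 0x00.
  - Besides machine code, an ELF file holds extra information such as each segment's load address and run address, relocation tables, and symbol tables, so an ELF file is usually larger than the corresponding BIN file.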
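Structure
- ELF header
  - Declares the ELF format and version
  - e_entry: virtual address of the program entry point
  - e_phoff: file offset of the program header table
  - e_shoff: file offset of the section header table
  - e_flags: processor-specific flags
- Program headers (they describe segments; used at execution time to say where each group of sections lives in memory)
  - The program header table usually follows the ELF header directly and is used when creating a process. Each entry describes the file position and size of a run of consecutive sections, and the position and size they take once loaded into memory. Object files used to build a process image must have a program header table; relocatable object files do not need one.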
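  - PT_LOAD: describes a loadable segment, one that is mapped into memory.
  - PT_DYNAMIC: the dynamic segment, present only in dynamically linked files; it holds the information the dynamic linker needs.
  - PT_INTERP: gives the location and size of a null-terminated string that names the program interpreter.
  - PT_NOTE: a segment of this type may hold vendor-specific or system-specific auxiliary information.
  - PT_PHDR: gives the location and size of the program header table itself.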
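- 15 sections
  - .text (code), .rodata (read-only globals), .data (read-write globals)
  - .text: holds the program's code instructions.
  - .rodata: holds read-only data, for example the string printed by printf("hello world\n"):
    Contents of section .rodata:
    0750 01000200 68656c6c 6f20776f 726c6400 ....hello world.
  - .plt: holds the stub code the dynamic linker needs to call functions imported from shared libraries.
  - .data: holds initialized global variables and similar data.
  - .bss: holds uninitialized global data. It takes up no space in the file beyond its section header, which only records the size of the region. The data is zero-initialized when the program is loaded and can be assigned during execution. Because .bss stores no actual data, its section type is SHT_NOBITS.
  - .got.plt: the .got sections hold the global offset table. Together, .got and .plt provide the entry points for imported shared-library functions and are patched by the dynamic linker at run time.
  - .dynsym: holds the dynamic symbols imported from shared libraries.
  - .dynstr: holds the dynamic symbol string table, a series of null-terminated strings that are the symbol names.
  - .rela.plt, .rela.dyn: hold relocation information describing how parts of the ELF object must be filled in or modified at link time or run time.
  - .gnu.hash: holds a hash table used for symbol lookup.
  - .symtab: holds symbol information of type ElfN_Sym. It contains all symbols, including the dynamic/global symbols found in .dynsym.
  - .strtab: holds the symbol string table, referenced by the st_name field of ElfN_Sym.
  - .shstrtab: holds the section header string table; the strings are the section names.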
- String and symbol tables
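- Section header table (at the end of the file; used at link time to locate each section)
  - The section header table records each section's name, size, and position in the file. Files used for linking must have one; other object files may omit it.
  - Every section in an ELF file can be found through the section header table, which is an array of Elf32_Shdr or Elf64_Shdr structures.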
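Definition
ELF stands for Executable and Linkable Format. It is a general-purpose standard, not something specific to Linux. The format first appeared in UNIX System V Release 4 and was quickly adopted by other Unix systems; today it is the de facto standard on Unix-like systems. ELF is flexible and extensible and is not tied to any particular CPU or instruction set, so it can be ported to any operating system or hardware architecture, including Windows[1]. The Tool Interface Standards (TIS) committee defined ELF as a portable object file format. The goal of the standard is to give software developers a set of binary interface definitions that span multiple operating systems, reducing the need to recode and recompile. The committee's Portable Formats Specification covers three kinds of object files and specifies the details of program loading and dynamic linking.
The ELF standard defines three types of object file:
- Relocatable file: contains relocation information and can be linked with other object files to create an executable or a shared object file.
- Executable file: a file the operating system can load and run.
- Shared object file: serves two purposes. The link editor can process it together with other relocatable and shared object files to produce another object file; and the dynamic linker can combine it with an executable and other shared objects so that the operating system can create a process.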
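Q&A
How does an ELF file get executed by the machine?
- https://blog.csdn.net/weiqi7777/article/details/87477427
- Load it into the system
- Allocate memory and create a process
http://www.360doc.com/content/18/0129/11/7377734_726086779.shtml
ELF is the executable file format on Linux and one of the most important file types on the system. An ELF file is the carrier of a program: natively compiled code ends up as a file in ELF format. The operating system reads the ELF file (that is, the program) into memory and builds a runnable process from the instructions, data, and symbols it contains. The operating system's process management then schedules the process onto an idle CPU. In that sense the ELF format embodies the program model. Understanding its definition gives a clearer picture of how programs run and helps considerably with performance tuning and system security.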
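Analyzing binary files is rather dry. Many documents only present the format definitions, and bare definitions quickly become tedious. This article instead takes a real ELF file as the subject and explains the ELF definitions step by step against it, which should make the binary layout more tangible.
Test program main.c: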
#include <stdio.h>

int main(void)
{
    printf("hello world\n");
    return 0;
}
Compile the program with gcc: gcc main.c
$ file a.out
a.out: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=31a79aafe17372bd63c14e63fa3763ae5fb5bddb, not stripped
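gcc and clang produce different binaries here: clang produced an executable (ET_EXEC), while gcc produced a shared object (ET_DYN). The latter is what file reports for a position-independent executable (PIE), which this gcc builds by default. The virtual addresses in the two binaries differ as well.
Use the hexdump tool to read the raw bytes of the binary: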
hexdump a.out
0000000 457f 464c 0102 0001 0000 0000 0000 0000
0000010 0002 003e 0001 0000 03f0 0040 0000 0000
0000020 0040 0000 0000 0000 1918 0000 0000 0000
0000030 0000 0000 0040 0038 0009 0040 001d 001c
0000040 0006 0000 0005 0000 0040 0000 0000 0000
0000050 0040 0040 0000 0000 0040 0040 0000 0000
0000060 01f8 0000 0000 0000 01f8 0000 0000 0000
0000070 0008 0000 0000 0000 0003 0000 0004 0000
0000080 0238 0000 0000 0000 0238 0040 0000 0000
0000090 0238 0040 0000 0000 001c 0000 0000 0000
00000a0 001c 0000 0000 0000 0001 0000 0000 0000
.
.
.
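hexdump prints in hexadecimal, and the leftmost column is the byte offset. The first line holds sixteen bytes (each hex digit is 4 bits, so 457f is 16 bits, or 2 bytes), covering offsets 0000000 to 000000f, which is why the second line starts at offset 0000010. By default hexdump groups the bytes into 16-bit words and prints each word in host byte order, so on a little-endian machine the file bytes 7f 45 show up as 457f.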
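The byte swapping in hexdump's default display can be reproduced in a couple of lines of C. This is only a sketch: `le16_word` is a hypothetical helper name chosen for illustration, not anything hexdump exposes.

```c
#include <stdint.h>

/* Combine two consecutive file bytes the way a 16-bit little-endian
 * word display does: the second byte becomes the high half of the
 * printed word. (Hypothetical helper, for illustration only.) */
uint16_t le16_word(uint8_t first, uint8_t second)
{
    return (uint16_t)(first | ((uint16_t)second << 8));
}
```

Applied to the first four file bytes 7f 45 4c 46, it yields the words 457f and 464c seen at the start of the dump.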
Before going into details, here are the data types the ELF standard defines for 64-bit machines:
Name           Size  Alignment  Purpose
Elf64_Addr     8     8          Unsigned program address
Elf64_Off      8     8          Unsigned file offset
Elf64_Half     2     2          Unsigned medium integer
Elf64_Word     4     4          Unsigned integer
Elf64_Sword    4     4          Signed integer
Elf64_Xword    8     8          Unsigned long integer
Elf64_Sxword   8     8          Signed long integer
unsigned char  1     1          Unsigned small integer
There are essentially four sizes: addresses and file offsets are 8 bytes, a single character is 1 byte, a half integer is 2 bytes, an integer is 4 bytes, and a long integer is 8 bytes.
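ELF header
The file starts with the ELF header, which occupies the first 64 bytes. The hexdump invocation below prints 16 bytes per line, one byte at a time, and -n 64 limits the output to 64 bytes. Note that the word-wise dump shown earlier appears to come from a different build of the program (its header shows e_type 2 and entry point 0x4003f0); the byte-wise dump below is the one analyzed in the rest of this article.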
$ hexdump -e '16/1 "%02x " "\n"' -n 64 -v a.out
7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
03 00 3e 00 01 00 00 00 80 05 00 00 00 00 00 00
40 00 00 00 00 00 00 00 f8 19 00 00 00 00 00 00
00 00 00 00 40 00 38 00 09 00 40 00 1f 00 1e 00
Below is the structure definition of the ELF header, which we can use to interpret these 64 bytes.
typedef struct
{
    unsigned char e_ident[16];   /* ELF identification: magic number, class, data encoding, version, padding */
    Elf64_Half    e_type;        /* Object file type */
    Elf64_Half    e_machine;     /* Machine type */
    Elf64_Word    e_version;     /* Object file version */
    Elf64_Addr    e_entry;       /* Entry point address */
    Elf64_Off     e_phoff;       /* Program header offset */
    Elf64_Off     e_shoff;       /* Section header offset */
    Elf64_Word    e_flags;       /* Processor-specific flags */
    Elf64_Half    e_ehsize;      /* ELF header size */
    Elf64_Half    e_phentsize;   /* Size of program header entry */
    Elf64_Half    e_phnum;       /* Number of program header entries */
    Elf64_Half    e_shentsize;   /* Size of section header entry */
    Elf64_Half    e_shnum;       /* Number of section header entries */
    Elf64_Half    e_shstrndx;    /* Section name string table index */
} Elf64_Ehdr;
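The first 16 bytes are the ELF identification, e_ident, which consists of:
- Magic number: 7f 45 4c 46. These four bytes tell any program reading the file that it is an ELF file; 45 4c 46 is "ELF" in ASCII.
- File class: 02. This byte marks the file as a 64-bit object (ELFCLASS64). For a 32-bit object (ELFCLASS32) it would be 01.
- Data encoding: 01. This byte gives the byte order used in the file: 01 means little-endian, 02 means big-endian. Byte order determines how multi-byte values are laid out: little-endian stores the least significant byte at the lowest address, big-endian stores it at the highest. x86-64, like most general-purpose machines today, is little-endian, which matters when reading the fields below.
- ELF version: 01. The version of the ELF specification in use; the current version is 1.
- OS ABI and ABI version: 00 00. Bytes 8 and 9 (EI_OSABI and EI_ABIVERSION); 0 selects the System V ABI.
- Padding: 00 00 00 00 00 00 00. The remaining seven bytes are reserved for future extensions of the format and pad the identification to 16 bytes.
Bytes 17 and 18 are the object file type, e_type: 03 00. ELF defines three main types: relocatable file is 1, executable file is 2, and shared object file is 3. There are also special values, such as 0 for an unknown type and 4 for a core file. Two bytes encode one integer here, so byte order comes into play: in little-endian storage the low-order byte comes first, so these two bytes represent 0x0003, that is 3, a shared object. This matches the output of file above.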
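Bytes 19 and 20 give the target machine architecture, e_machine: 3e 00. In decimal this is 62, which stands for the x86-64 architecture.
Bytes 21 to 24 are the ELF version, e_version: 01 00 00 00. This again identifies the version of the ELF standard: 1 means the current version, 0 means an invalid version.
The next 8 bytes are the virtual address of the program entry point, e_entry: 80 05 00 00 00 00 00 00, i.e. 0x580. This tells the system where execution starts. Addresses are 8 bytes in a 64-bit file; in a 32-bit file the entry point takes 4 bytes.
The following 16 bytes are the file offset of the program header table, e_phoff: 40 00 00 00 00 00 00 00, and the file offset of the section header table, e_shoff: f8 19 00 00 00 00 00 00.
A program header describes a segment of the binary and is required for loading the program into memory. Segments define the memory layout of the executable, and the program headers describe how those segments are mapped into memory. Segments are essential for execution; within each segment, code or data is further divided into sections. A section header describes the location and size of a section and is used mainly for linking and debugging. Section headers are not needed for execution, and a program runs fine without them, but debuggers and decompilers rely on them to find debugging information.
The two tables can be located from these offsets. The program header table sits at file offset 0x40 = 4 × 16 = 64, and by the same calculation the section header table sits at 0x19f8 = 6648.
The next 4 bytes are the processor flags, e_flags: 00 00 00 00. They hold processor-specific flags associated with the file.
The next 2 bytes are the size of the ELF header, e_ehsize: 40 00, which is 64. The 2 bytes after that are the size of one program header entry, e_phentsize: 38 00, or 56 bytes. Then come 2 bytes for the number of program header entries, e_phnum: 09 00, so there are 9 program headers.
Right after that are the size of one section header entry and the number of entries, e_shentsize and e_shnum: 40 00 1f 00. Each section header is 64 bytes and there are 31 of them.
The last two bytes are e_shstrndx, the section header table index of the section name string table: 1e 00, that is section 30.
That is the whole ELF header. It mainly records the file type, the file layout, and the address at which the program starts.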
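The field-by-field reading above can be checked mechanically. The following is a minimal sketch, assuming a 64-bit little-endian file; `sample_ehdr`, `struct ehdr_fields`, `read_le`, and `parse_ehdr64` are hypothetical names introduced for illustration, and the byte array is copied from the hexdump in this section. It reads each field at its fixed offset instead of casting to a struct, so it does not depend on the host's byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The 64 header bytes shown in the hexdump above. */
const uint8_t sample_ehdr[64] = {
    0x7f, 0x45, 0x4c, 0x46, 0x02, 0x01, 0x01, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x03, 0x00, 0x3e, 0x00, 0x01, 0x00, 0x00, 0x00,
    0x80, 0x05, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x40, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0xf8, 0x19, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x40, 0x00, 0x38, 0x00,
    0x09, 0x00, 0x40, 0x00, 0x1f, 0x00, 0x1e, 0x00
};

/* Hypothetical holder for the decoded fields (not the real Elf64_Ehdr). */
struct ehdr_fields {
    uint16_t e_type, e_machine;
    uint64_t e_entry, e_phoff, e_shoff;
    uint16_t e_ehsize, e_phentsize, e_phnum;
    uint16_t e_shentsize, e_shnum, e_shstrndx;
};

/* Read an n-byte little-endian unsigned integer, n <= 8. */
static uint64_t read_le(const uint8_t *p, size_t n)
{
    uint64_t v = 0;
    while (n > 0)
        v = (v << 8) | p[--n];
    return v;
}

/* Decode a 64-bit little-endian ELF header.
 * Returns 0 on success, -1 if the buffer is not such a header. */
int parse_ehdr64(const uint8_t *b, size_t len, struct ehdr_fields *out)
{
    if (len < 64 || memcmp(b, "\177ELF", 4) != 0)
        return -1;
    if (b[4] != 2 || b[5] != 1)   /* ELFCLASS64 and ELFDATA2LSB only */
        return -1;
    out->e_type      = (uint16_t)read_le(b + 16, 2);
    out->e_machine   = (uint16_t)read_le(b + 18, 2);
    out->e_entry     = read_le(b + 24, 8);
    out->e_phoff     = read_le(b + 32, 8);
    out->e_shoff     = read_le(b + 40, 8);
    out->e_ehsize    = (uint16_t)read_le(b + 52, 2);
    out->e_phentsize = (uint16_t)read_le(b + 54, 2);
    out->e_phnum     = (uint16_t)read_le(b + 56, 2);
    out->e_shentsize = (uint16_t)read_le(b + 58, 2);
    out->e_shnum     = (uint16_t)read_le(b + 60, 2);
    out->e_shstrndx  = (uint16_t)read_le(b + 62, 2);
    return 0;
}
```

Calling `parse_ehdr64(sample_ehdr, sizeof sample_ehdr, &h)` fills `h` with the values derived by hand above, for example type 3, machine 62, 9 program headers, and 31 section headers.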
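Program Header Table
From e_phoff in the ELF header we know the program header table starts at file offset 64, which means it directly follows the ELF header. Let's look at its contents. ELF program headers describe the segments of the binary and are required for loading the program. There are five common types:
PT_LOAD: describes a loadable segment, one that is mapped into memory.
PT_DYNAMIC: the dynamic segment, present only in dynamically linked files; it holds the information the dynamic linker needs.
PT_INTERP: gives the location and size of a null-terminated string that names the program interpreter.
PT_NOTE: a segment of this type may hold vendor-specific or system-specific auxiliary information.
PT_PHDR: gives the location and size of the program header table itself.
The program headers describe the basic properties of each type of segment, in other words how the program is laid out in memory.
Definition of a program header: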
typedef struct
{
Elf64_Word p_type; /* Type of segment */
Elf64_Word p_flags; /* Segment attributes */
Elf64_Off p_offset; /* Offset in file */
Elf64_Addr p_vaddr; /* Virtual address in memory */
Elf64_Addr p_paddr; /* Reserved */
Elf64_Xword p_filesz; /* Size of segment in file */
Elf64_Xword p_memsz; /* Size of segment in memory */
Elf64_Xword p_align; /* Alignment of segment */
} Elf64_Phdr;
PT_PHDR
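The PT_PHDR entry is the first entry in the program header table and describes the table's own start address and size. Each program header entry is 56 bytes, so we dump 56 bytes starting at offset 64: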
$ hexdump -e '16/1 "%02x " "\n"' -s 64 -n 56 -v a.out
06 00 00 00 05 00 00 00 40 00 00 00 00 00 00 00
40 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00
f8 01 00 00 00 00 00 00 f8 01 00 00 00 00 00 00
08 00 00 00 00 00 00 00
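Reading the entry against the structure definition:
p_type: 06 00 00 00. The segment type. The value 6 means PT_PHDR, a segment that records the location and size of the program header table itself.
p_flags: 05 00 00 00. The segment permission flags. The bits selected by PF_MASKOS (0x0ff00000) and PF_MASKPROC (0xf0000000) are reserved for OS-specific and processor-specific use. 0x05 means readable (0x04) and executable (0x01).
p_offset: 40 00 00 00 00 00 00 00. The file offset of the segment, here 64.
p_vaddr: 40 00 00 00 00 00 00 00. The virtual address of the segment in memory, here 0x40.
p_paddr: 40 00 00 00 00 00 00 00. Reserved for the physical address of the segment. The value is not actually used: on Linux, applications run in virtual memory handed out by the memory manager. In this dump it simply repeats the virtual address, 0x40.
p_filesz: f8 01 00 00 00 00 00 00. The number of bytes the segment occupies in the file. The value is 504, which covers the 9 program headers: 56*9=504.
p_memsz: f8 01 00 00 00 00 00 00. The number of bytes the segment occupies in memory.
p_align: 08 00 00 00 00 00 00 00. The segment alignment. The value must be a power of two; 0 and 1 mean no alignment is required.
p_vaddr and p_offset must be equal modulo p_align.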
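Two pieces of arithmetic in this walkthrough are easy to get wrong by hand: where the i-th program header starts in the file, and the congruence between p_vaddr and p_offset. The sketch below captures both; `phdr_file_offset` and `phdr_alignment_ok` are hypothetical helper names, not functions from any library.

```c
#include <stdint.h>

/* File offset of program header number `index` (0-based), given
 * e_phoff and e_phentsize from the ELF header. */
uint64_t phdr_file_offset(uint64_t e_phoff, uint16_t e_phentsize, uint16_t index)
{
    return e_phoff + (uint64_t)e_phentsize * index;
}

/* Check the rule that p_vaddr and p_offset are congruent modulo
 * p_align. Alignments of 0 and 1 mean "no alignment required". */
int phdr_alignment_ok(uint64_t p_offset, uint64_t p_vaddr, uint64_t p_align)
{
    if (p_align <= 1)
        return 1;
    return (p_vaddr % p_align) == (p_offset % p_align);
}
```

With e_phoff = 64 and e_phentsize = 56, the entries start at offsets 64, 120, 176, 232, 288, and 344, which are the -s arguments used in the hexdump commands of this section.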
PT_INTERP
$hexdump -e '16/1 "%02x " "\n"' -s 120 -n 56 -v a.out
03 00 00 00 04 00 00 00 38 02 00 00 00 00 00 00
38 02 00 00 00 00 00 00 38 02 00 00 00 00 00 00
1c 00 00 00 00 00 00 00 1c 00 00 00 00 00 00 00
01 00 00 00 00 00 00 00
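From the definition we can tell this entry is PT_INTERP:
p_type: 3 means PT_INTERP
p_flags: 4 means the segment is read-only
p_offset: the file offset of the segment is 0x0238
p_vaddr: the virtual address of the segment in memory is 0x00000000 00000238
p_paddr: the physical address, same as the virtual address
p_filesz: the segment occupies 28 bytes in the file
p_memsz: the segment occupies 28 bytes in memory
p_align: 1, meaning the segment needs no alignment
What does this INTERP segment actually contain? Seek to file offset 0x238 and look at 28 bytes; that is the full content of the segment.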
hexdump -e '16/1 "%02x " "\n"' -s 568 -n 28 -v a.out
2f 6c 69 62 36 34 2f 6c 64 2d 6c 69 6e 75 78 2d
78 38 36 2d 36 34 2e 73 6f 2e 32 00
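What do these numbers mean? As mentioned earlier, the INTERP segment holds a string. The objdump tool can dump section contents along with their ASCII rendering: objdump -s a.out. The relevant part of the output is: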
Contents of section .interp:
0238 2f6c6962 36342f6c 642d6c69 6e75782d /lib64/ld-linux-
0248 7838362d 36342e73 6f2e3200 x86-64.so.2.
These bytes are the ASCII encoding of the path of the dynamic linker (the program interpreter).
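Since the segment content is just a NUL-terminated string, reading it takes very little code. A minimal sketch follows; `interp_bytes` is copied from the hexdump above, and `interp_path` is a hypothetical helper that also guards against a segment with no terminator.

```c
#include <stddef.h>
#include <string.h>

/* The 28 bytes at file offset 0x238, copied from the dump above. */
const unsigned char interp_bytes[28] = {
    0x2f, 0x6c, 0x69, 0x62, 0x36, 0x34, 0x2f, 0x6c,
    0x64, 0x2d, 0x6c, 0x69, 0x6e, 0x75, 0x78, 0x2d,
    0x78, 0x38, 0x36, 0x2d, 0x36, 0x34, 0x2e, 0x73,
    0x6f, 0x2e, 0x32, 0x00
};

/* Return the PT_INTERP segment content as a C string, or NULL when
 * no NUL terminator is found within the first `size` bytes. */
const char *interp_path(const unsigned char *seg, size_t size)
{
    if (size == 0 || memchr(seg, 0, size) == NULL)
        return NULL;
    return (const char *)seg;
}
```

The path is 27 characters long, and the terminating NUL accounts for the 28th byte of p_filesz.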
PT_LOAD
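A LOAD segment is one that can be loaded or mapped into memory, and an executable may have several of them. A dynamically linked ELF file usually has two: a text segment holding the program code, normally readable and executable, and a data segment holding global variables and dynamic linking information, readable and writable.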
hexdump -e '16/1 "%02x " "\n"' -s 176 -n 56 -v a.out
01 00 00 00 05 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a8 08 00 00 00 00 00 00 a8 08 00 00 00 00 00 00
00 00 20 00 00 00 00 00
This segment is readable and executable. Its file offset is 0 and its virtual address is 0x00000000 00000000. Its size in the file is 0x08a8, or 2216 bytes.
$ hexdump -e '16/1 "%02x " "\n"' -s 232 -n 56 -v a.out
01 00 00 00 06 00 00 00 d8 0d 00 00 00 00 00 00
d8 0d 20 00 00 00 00 00 d8 0d 20 00 00 00 00 00
58 02 00 00 00 00 00 00 60 02 00 00 00 00 00 00
00 00 20 00 00 00 00 00
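This segment is readable and writable. Its file offset is 0x0dd8 and its virtual address is 0x200dd8. It occupies 0x0258 bytes in the file and 0x0260 bytes once loaded into memory.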
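The 8-byte gap between p_filesz and p_memsz in the second LOAD segment is the part of the segment that has no bytes in the file and is zero-filled at load time; in this binary it corresponds to .bss (8 bytes at 0x201030 in the section listing). A small sketch of the arithmetic, with hypothetical helper names `load_zero_fill` and `segment_end`:

```c
#include <stdint.h>

/* Number of bytes at the end of a PT_LOAD segment that are not backed
 * by file content and are zero-filled when the segment is loaded. */
uint64_t load_zero_fill(uint64_t p_filesz, uint64_t p_memsz)
{
    return p_memsz > p_filesz ? p_memsz - p_filesz : 0;
}

/* First virtual address past the end of the segment. */
uint64_t segment_end(uint64_t p_vaddr, uint64_t p_memsz)
{
    return p_vaddr + p_memsz;
}
```

For the second LOAD segment this gives 8 zero-filled bytes and an end address of 0x201038, exactly where .bss ends.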
Based on the program headers above, the ELF file looks like this once loaded into memory:
-------Memory--------
| ELF Header | --|
| PHDR | |
| INTERP | |
| NOTE | | PT_LOAD 0
| . | |
| ----------------- | --|
| alignment |
| ----------------- |
| DYNAMIC | --|
| ----------------- | |
| . | |
| . | | PT_LOAD 1
| ----------------- | |
| alignment | --|
| ----------------- |
PT_DYNAMIC
PT_DYNAMIC is the program header that describes the dynamic segment:
$ hexdump -e '16/1 "%02x " "\n"' -s 288 -n 56 -v a.out
02 00 00 00 06 00 00 00 f0 0d 00 00 00 00 00 00
f0 0d 20 00 00 00 00 00 f0 0d 20 00 00 00 00 00
e0 01 00 00 00 00 00 00 e0 01 00 00 00 00 00 00
08 00 00 00 00 00 00 00
The dynamic segment is readable and writable: 0x04+0x02. Its file offset is 0x0df0, its virtual address is 0x200df0, and its size is 0x01e0.
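Every p_flags value met so far (0x05, 0x04, 0x06) is a combination of the same three permission bits. The sketch below decodes them into a short string; the enum names and `format_segment_flags` are hypothetical, and the "RWX" layout is chosen for illustration rather than copied from any tool's output format.

```c
#include <stdint.h>

/* Permission bits of p_flags, with the values used in the text above. */
enum { SEG_EXEC = 0x1, SEG_WRITE = 0x2, SEG_READ = 0x4 };

/* Write a 3-character permission string such as "R-X" into out,
 * followed by a terminating NUL. */
void format_segment_flags(uint32_t p_flags, char out[4])
{
    out[0] = (p_flags & SEG_READ)  ? 'R' : '-';
    out[1] = (p_flags & SEG_WRITE) ? 'W' : '-';
    out[2] = (p_flags & SEG_EXEC)  ? 'X' : '-';
    out[3] = '\0';
}
```

The first LOAD segment (0x05) decodes to R-X, and the second LOAD segment and the dynamic segment (0x06) decode to RW-.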
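The dynamic segment contains tagged values and pointers, including but not limited to:
- the list of shared libraries that must be linked at run time
- the address of the global offset table (GOT)
- information about relocation entries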
PT_NOTE
The program header after the dynamic segment's is PT_NOTE:
$ hexdump -e '16/1 "%02x " "\n"' -s 344 -n 56 -v a.out
04 00 00 00 04 00 00 00 54 02 00 00 00 00 00 00
54 02 00 00 00 00 00 00 54 02 00 00 00 00 00 00
44 00 00 00 00 00 00 00 44 00 00 00 00 00 00 00
04 00 00 00 00 00 00 00
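The PT_NOTE segment has type value 4, is read-only, has file offset 0x0254 and virtual address 0x000254, and is 0x44 bytes long.
The contents of the NOTE segment are: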
Contents of section .note.ABI-tag:
0254 04000000 10000000 01000000 474e5500 ............GNU.
0264 00000000 02000000 06000000 20000000 ............ ...
Contents of section .note.gnu.build-id:
0274 04000000 14000000 03000000 474e5500 ............GNU.
0284 31a79aaf e17372bd 63c14e63 fa3763ae 1....sr.c.Nc.7c.
0294 5fb5bddb _...
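This segment only records specification information for the operating system (here an ABI tag and a build ID) and is not needed to run the executable. Some systems require object files to be marked with specific information so that other programs can check them for conformance, compatibility, and so on.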
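The .note.ABI-tag bytes can be decoded by hand: a note starts with three 32-bit words (name size, descriptor size, type), followed by the name and then the descriptor. For the GNU ABI tag the descriptor is four 32-bit words: an OS identifier (0 for Linux) and a three-part kernel version. The sketch below assumes little-endian data and handles only this one note layout; `sample_abi_note` is copied from the objdump output above, while `struct abi_tag`, `rd32`, and `parse_abi_tag` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes of .note.ABI-tag as printed by objdump above. */
const uint8_t sample_abi_note[32] = {
    0x04, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00,
    0x01, 0x00, 0x00, 0x00, 0x47, 0x4e, 0x55, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00,
    0x06, 0x00, 0x00, 0x00, 0x20, 0x00, 0x00, 0x00
};

struct abi_tag { uint32_t os, major, minor, patch; };

/* Read a 32-bit little-endian value. */
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode a GNU ABI-tag note. Returns 0 on success, -1 if the buffer
 * is not a note with name "GNU", type 1, and a 16-byte descriptor. */
int parse_abi_tag(const uint8_t *note, size_t size, struct abi_tag *out)
{
    if (size < 32)   /* 12-byte header + 4-byte name + 16-byte descriptor */
        return -1;
    uint32_t namesz = rd32(note);
    uint32_t descsz = rd32(note + 4);
    uint32_t type   = rd32(note + 8);
    if (namesz != 4 || descsz != 16 || type != 1)
        return -1;
    if (memcmp(note + 12, "GNU", 4) != 0)   /* includes the trailing NUL */
        return -1;
    const uint8_t *desc = note + 16;
    out->os    = rd32(desc);
    out->major = rd32(desc + 4);
    out->minor = rd32(desc + 8);
    out->patch = rd32(desc + 12);
    return 0;
}
```

For the bytes above this yields OS 0 and version 2.6.32, which agrees with the "for GNU/Linux 2.6.32" part of the file output at the start of the article.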
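The other three program headers
The ELF header reports 9 program headers. Besides the five types defined by the ELF standard and covered above (six entries, since there are two LOAD segments), there are three types supported by Linux. To keep things simple we use readelf to view them: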
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
GNU_EH_FRAME 0x0000000000000760 0x0000000000000760 0x0000000000000760
0x000000000000003c 0x000000000000003c R 0x4
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 0x10
GNU_RELRO 0x0000000000000dd8 0x0000000000200dd8 0x0000000000200dd8
0x0000000000000228 0x0000000000000228 R 0x1
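- A PT_GNU_EH_FRAME segment gives the location and size of the .eh_frame_hdr section, the lookup table for the exception handling (unwind) information.
- A PT_GNU_STACK segment carries the permissions of the stack: its p_flags state whether the stack may be executable. When the segment is missing, the stack is treated as executable. In this example the segment is present with flags RW, so the stack is not executable.
- A PT_GNU_RELRO segment marks a region that becomes read-only once relocation is done. It typically covers data that the dynamic linker has to relocate, in sections such as .ctors, .dtors, .dynamic, and .got. The region largely coincides with the second LOAD segment; PT_LOAD 1 additionally contains .got.plt, .data, and .bss.
Section header table
Sections are the main building blocks of an ELF file: most data and code live in them. According to the ELF header, the section header table is at offset 0x19f8. The table is optional as far as running the program is concerned; it describes the sections in the file, using this data structure: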
typedef struct
{
Elf64_Word sh_name; /* Section name */
Elf64_Word sh_type; /* Section type */
Elf64_Xword sh_flags; /* Section attributes */
Elf64_Addr sh_addr; /* Virtual address in memory */
Elf64_Off sh_offset; /* Offset in file */
Elf64_Xword sh_size; /* Size of section */
Elf64_Word sh_link; /* Link to other section */
Elf64_Word sh_info; /* Miscellaneous information */
Elf64_Xword sh_addralign; /* Address alignment boundary */
Elf64_Xword sh_entsize; /* Size of entries, if section has table */
} Elf64_Shdr;
One section header is 64 bytes. Here are the first two entries of the section header table:
hexdump -e '16/1 "%02x " "\n"' -s 0x19f8 -n 128 -v a.out
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
1b 00 00 00 01 00 00 00 02 00 00 00 00 00 00 00
38 02 00 00 00 00 00 00 38 02 00 00 00 00 00 00
1c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
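The first section header has type NULL and consists entirely of zeros. Taking the second one as the example:
sh_name: 1b 00 00 00. A byte offset into the section name string table .shstrtab, discussed further below.
sh_type: 01 00 00 00. The section type. The value 1 is SHT_PROGBITS, meaning the content is defined by the program.
sh_flags: 02 00 00 00 00 00 00 00. The section attribute flags. 02 is SHF_ALLOC, meaning the section occupies memory at run time.
sh_addr: 38 02 00 00 00 00 00 00. The virtual address at which the section starts in memory.
sh_offset: 38 02 00 00 00 00 00 00. The offset of the section in the ELF file.
sh_size: 1c 00 00 00 00 00 00 00. The size of the section.
sh_link: 00 00 00 00. The index of an associated section; its meaning depends on the section type.
sh_info: 00 00 00 00. Extra information about the section; its meaning depends on the section type.
sh_addralign: 01 00 00 00 00 00 00 00. The alignment the section requires in bytes, which must be a power of two. 0 or 1 means no alignment is required.
sh_entsize: 00 00 00 00 00 00 00 00. The size of each entry for sections that hold a table (array) of fixed-size entries; otherwise it is normally 0.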
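The same decoding can be done in code. This sketch assumes a 64-bit little-endian file; `sample_shdr1` holds the 64 bytes of the second section header from the dump above, and `struct shdr_fields`, `le_field`, `parse_shdr64`, and `shdr_file_offset` are hypothetical names used for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Section header 1 (.interp), copied from the dump above. */
const uint8_t sample_shdr1[64] = {
    0x1b, 0, 0, 0, 0x01, 0, 0, 0, 0x02, 0, 0, 0, 0, 0, 0, 0,
    0x38, 0x02, 0, 0, 0, 0, 0, 0, 0x38, 0x02, 0, 0, 0, 0, 0, 0,
    0x1c, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0x01, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
};

/* Hypothetical holder for the decoded fields (not the real Elf64_Shdr). */
struct shdr_fields {
    uint32_t sh_name, sh_type, sh_link, sh_info;
    uint64_t sh_flags, sh_addr, sh_offset, sh_size, sh_addralign, sh_entsize;
};

/* Read an n-byte little-endian unsigned integer, n <= 8. */
static uint64_t le_field(const uint8_t *p, size_t n)
{
    uint64_t v = 0;
    while (n > 0)
        v = (v << 8) | p[--n];
    return v;
}

/* Decode one 64-byte section header at fixed field offsets. */
void parse_shdr64(const uint8_t *b, struct shdr_fields *out)
{
    out->sh_name      = (uint32_t)le_field(b + 0, 4);
    out->sh_type      = (uint32_t)le_field(b + 4, 4);
    out->sh_flags     = le_field(b + 8, 8);
    out->sh_addr      = le_field(b + 16, 8);
    out->sh_offset    = le_field(b + 24, 8);
    out->sh_size      = le_field(b + 32, 8);
    out->sh_link      = (uint32_t)le_field(b + 40, 4);
    out->sh_info      = (uint32_t)le_field(b + 44, 4);
    out->sh_addralign = le_field(b + 48, 8);
    out->sh_entsize   = le_field(b + 56, 8);
}

/* File offset of section header number `index` (0-based). */
uint64_t shdr_file_offset(uint64_t e_shoff, uint16_t e_shentsize, uint16_t index)
{
    return e_shoff + (uint64_t)e_shentsize * index;
}
```

With e_shoff = 0x19f8 and 64-byte entries, header 1 starts at 0x1a38, which is the second half of the 128 bytes dumped above.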
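There are 31 section headers in total. readelf lists the section header table of the ELF file:
$ readelf -S a.out
There are 31 section headers, starting at offset 0x19f8:
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align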
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .interp PROGBITS 0000000000000238 00000238
000000000000001c 0000000000000000 A 0 0 1
[ 2] .note.ABI-tag NOTE 0000000000000254 00000254
0000000000000020 0000000000000000 A 0 0 4
[ 3] .note.gnu.build-i NOTE 0000000000000274 00000274
0000000000000024 0000000000000000 A 0 0 4
[ 4] .gnu.hash GNU_HASH 0000000000000298 00000298
000000000000001c 0000000000000000 A 5 0 8
[ 5] .dynsym DYNSYM 00000000000002b8 000002b8
00000000000000c0 0000000000000018 A 6 1 8
[ 6] .dynstr STRTAB 0000000000000378 00000378
0000000000000096 0000000000000000 A 0 0 1
[ 7] .gnu.version VERSYM 000000000000040e 0000040e
0000000000000010 0000000000000002 A 5 0 2
[ 8] .gnu.version_r VERNEED 0000000000000420 00000420
0000000000000020 0000000000000000 A 6 1 8
[ 9] .rela.dyn RELA 0000000000000440 00000440
00000000000000d8 0000000000000018 A 5 0 8
[10] .rela.plt RELA 0000000000000518 00000518
0000000000000018 0000000000000018 AI 5 24 8
[11] .init PROGBITS 0000000000000530 00000530
0000000000000017 0000000000000000 AX 0 0 4
[12] .plt PROGBITS 0000000000000550 00000550
0000000000000020 0000000000000010 AX 0 0 16
[13] .plt.got PROGBITS 0000000000000570 00000570
0000000000000008 0000000000000008 AX 0 0 8
[14] .text PROGBITS 0000000000000580 00000580
00000000000001c2 0000000000000000 AX 0 0 16
[15] .fini PROGBITS 0000000000000744 00000744
0000000000000009 0000000000000000 AX 0 0 4
[16] .rodata PROGBITS 0000000000000750 00000750
0000000000000010 0000000000000000 A 0 0 4
[17] .eh_frame_hdr PROGBITS 0000000000000760 00000760
000000000000003c 0000000000000000 A 0 0 4
[18] .eh_frame PROGBITS 00000000000007a0 000007a0
0000000000000108 0000000000000000 A 0 0 8
[19] .init_array INIT_ARRAY 0000000000200dd8 00000dd8
0000000000000008 0000000000000008 WA 0 0 8
[20] .fini_array FINI_ARRAY 0000000000200de0 00000de0
0000000000000008 0000000000000008 WA 0 0 8
[21] .jcr PROGBITS 0000000000200de8 00000de8
0000000000000008 0000000000000000 WA 0 0 8
[22] .dynamic DYNAMIC 0000000000200df0 00000df0
00000000000001e0 0000000000000010 WA 6 0 8
[23] .got PROGBITS 0000000000200fd0 00000fd0
0000000000000030 0000000000000008 WA 0 0 8
[24] .got.plt PROGBITS 0000000000201000 00001000
0000000000000020 0000000000000008 WA 0 0 8
[25] .data PROGBITS 0000000000201020 00001020
0000000000000010 0000000000000000 WA 0 0 8
[26] .bss NOBITS 0000000000201030 00001030
0000000000000008 0000000000000000 WA 0 0 1
[27] .comment PROGBITS 0000000000000000 00001030
0000000000000025 0000000000000001 MS 0 0 1
[28] .symtab SYMTAB 0000000000000000 00001058
0000000000000660 0000000000000018 29 47 8
[29] .strtab STRTAB 0000000000000000 000016b8
000000000000022f 0000000000000000 0 0 1
[30] .shstrtab STRTAB 0000000000000000 000018e7
000000000000010c 0000000000000000 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
l (large), p (processor specific)
The last section, .shstrtab, holds the names of all sections:
$ hexdump -e '16/1 "%02x " "\n"' -s 0x18e7 -n 268 -v a.out
00 2e 73 79 6d 74 61 62 00 2e 73 74 72 74 61 62
00 2e 73 68 73 74 72 74 61 62 00 2e 69 6e 74 65
72 70 00 2e 6e 6f 74 65 2e 41 42 49 2d 74 61 67
00 2e 6e 6f 74 65 2e 67 6e 75 2e 62 75 69 6c 64
2d 69 64 00 2e 67 6e 75 2e 68 61 73 68 00 2e 64
79 6e 73 79 6d 00 2e 64 79 6e 73 74 72 00 2e 67
6e 75 2e 76 65 72 73 69 6f 6e 00 2e 67 6e 75 2e
76 65 72 73 69 6f 6e 5f 72 00 2e 72 65 6c 61 2e
64 79 6e 00 2e 72 65 6c 61 2e 70 6c 74 00 2e 69
6e 69 74 00 2e 70 6c 74 2e 67 6f 74 00 2e 74 65
78 74 00 2e 66 69 6e 69 00 2e 72 6f 64 61 74 61
00 2e 65 68 5f 66 72 61 6d 65 5f 68 64 72 00 2e
65 68 5f 66 72 61 6d 65 00 2e 69 6e 69 74 5f 61
72 72 61 79 00 2e 66 69 6e 69 5f 61 72 72 61 79
00 2e 6a 63 72 00 2e 64 79 6e 61 6d 69 63 00 2e
67 6f 74 2e 70 6c 74 00 2e 64 61 74 61 00 2e 62
73 73 00 2e 63 6f 6d 6d 65 6e 74 00
The section used as the example earlier had sh_name 0x1b, so its name can be looked up at that offset in this table:
2e 69 6e 74 65 72 70 00 # . i n t e r p
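The lookup is a bounds-checked pointer offset into the string table. A minimal sketch: `sample_shstrtab` reproduces only the first few names from the dump above (not the whole 268-byte table), and `strtab_name` is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The beginning of .shstrtab from the dump above (truncated copy). */
const char sample_shstrtab[] =
    "\0.symtab\0.strtab\0.shstrtab\0.interp\0.note.ABI-tag";
const size_t sample_shstrtab_size = sizeof sample_shstrtab;

/* Return the NUL-terminated name stored at byte offset `off` of a
 * string table, or NULL if the offset is out of range or the name
 * is not terminated inside the table. */
const char *strtab_name(const char *tab, size_t size, uint32_t off)
{
    if (off >= size)
        return NULL;
    if (memchr(tab + off, 0, size - off) == NULL)
        return NULL;
    return tab + off;
}
```

Offset 0x1b (27) lands on ".interp": the table starts with one NUL byte, followed by ".symtab", ".strtab", and ".shstrtab", each with its terminator.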
So how exactly do sections relate to segments? readelf shows which sections each segment contains:
Section to Segment mapping:
Segment Sections...
PHDR 00
INTERP 01 .interp
LOAD 02 .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .plt.got .text .fini .rodata .eh_frame_hdr .eh_frame
LOAD 03 .init_array .fini_array .jcr .dynamic .got .got.plt .data .bss
DYNAMIC 04 .dynamic
NOTE 05 .note.ABI-tag .note.gnu.build-id
GNU_EH_FRAME 06 .eh_frame_hdr
GNU_STACK 07
GNU_RELRO 08 .init_array .fini_array .jcr .dynamic .got
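For sections that occupy memory, this mapping comes down to address-range containment: a section belongs to a segment when its address range lies inside the segment's range. The sketch below is a simplification under that assumption (readelf's own rules handle more cases, such as sections without addresses); `section_in_segment` is a hypothetical name.

```c
#include <stdint.h>

/* Return 1 when the section's address range [addr, addr + size) lies
 * entirely inside the segment's range [vaddr, vaddr + memsz). */
int section_in_segment(uint64_t sec_addr, uint64_t sec_size,
                       uint64_t seg_vaddr, uint64_t seg_memsz)
{
    return sec_addr >= seg_vaddr &&
           sec_addr + sec_size <= seg_vaddr + seg_memsz;
}
```

Using the numbers from the listings above, .got (0x200fd0, size 0x30) falls inside GNU_RELRO (0x200dd8, size 0x228), while .got.plt (0x201000, size 0x20) starts exactly where GNU_RELRO ends and so belongs only to the second LOAD segment.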
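Segments describe the program's layout in memory, while sections hold the actual data, code, and so on. Here are the main sections:
.text: holds the program's code instructions.
.rodata: holds read-only data, for example the string printed by printf("hello world\n"):
Contents of section .rodata:
0750 01000200 68656c6c 6f20776f 726c6400 ....hello world.
.plt: holds the stub code the dynamic linker needs to call functions imported from shared libraries.
.data: holds initialized global variables and similar data.
.bss: holds uninitialized global data. It takes up no space in the file beyond its section header, which only records the size of the region. The data is zero-initialized when the program is loaded and can be assigned during execution. Because .bss stores no actual data, its section type is SHT_NOBITS.
.got.plt: the .got sections hold the global offset table. Together, .got and .plt provide the entry points for imported shared-library functions and are patched by the dynamic linker at run time.
.dynsym: holds the dynamic symbols imported from shared libraries.
.dynstr: holds the dynamic symbol string table, a series of null-terminated strings that are the symbol names.
.rela.plt, .rela.dyn: hold relocation information describing how parts of the ELF object must be filled in or modified at link time or run time.
.gnu.hash: holds a hash table used for symbol lookup.
.symtab: holds symbol information of type ElfN_Sym. It contains all symbols, including the dynamic/global symbols found in .dynsym.
.strtab: holds the symbol string table, referenced by the st_name field of ElfN_Sym.
.shstrtab: holds the section header string table; the strings are the section names.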
References
https://medium.com/@MrJamesFisher/understanding-the-elf-4bd60daac571
https://www.ibm.com/developerworks/cn/linux/l-excutff/
https://en.wikipedia.org/wiki/Executable_and_Linkable_Format
http://www.intezer.com/executable-linkable-format-101-part1-sections-segments/
https://gist.github.com/YunchengLiao/53d7b22c2fe39db96c9d
http://man7.org/linux/man-pages/man5/elf.5.html
https://www.cs.stevens.edu/~jschauma/810/elf.html
ELF execution process
1. ELF (Executable and Linking Format) files. There are three kinds of ELF file:
- Relocatable object files, i.e. .o files.
- Executable object files.
- Shared object files, i.e. dynamic libraries (.so files).
The structure of an ELF file looks like this:
+-------------------+
| ELF header        |
|                   |
+-------------------+
| Program headers   |
| (c0h bytes)       |
+-------------------+
| Section #1        |
+-------------------+
| Section #2        |
+-------------------+
. . . . . .
. . . . . .
+-------------------+
| Section #n        |
+-------------------+
| Section headers   |
| (n*20h bytes)     |
+-------------------+
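Three structures are involved, all defined in include/linux/elf.h in the kernel source: the ELF header Elf32_Ehdr, the program header Elf32_Phdr, and the section header Elf32_Shdr. Between them, these three structures describe everything in an ELF file, which is a considerable amount of information.
For a detailed description of these structures, see: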
http://hi.baidu.com/geekos/blog/item/b10c1751b53e991c367abe27.html
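Executing an ELF executable is essentially the process of loading it. Take a hello world program as the example:
#include <stdio.h>
int main(void) { printf("Hello world!\n"); return 0; }
gcc -g helloworld.c -o hello
(hello is the resulting executable.)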
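1. Registering the executable format. For every executable format it supports, the kernel has a struct linux_binfmt data structure (see program.c). The structure contains a load_binary function pointer, and for the ELF format .load_binary = load_elf_binary. To run ELF files, this structure has to be registered with the kernel so that it joins the list of supported executable formats. The kernel provides two functions for this, one to register and one to unregister:
int register_binfmt(struct linux_binfmt * fmt)
int unregister_binfmt(struct linux_binfmt * fmt)
When a program is to be run, the kernel walks this list and lets the handler supplied by each entry (load_elf_binary for ELF) try to claim the file in turn. When the handler for some format recognizes the file, it loads and starts the image.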
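2. Loading in kernel space. The kernel function that actually implements the execv() and execve() system calls is do_execve(). It first opens the target image file and reads a number of bytes from the very start of the file (128 in the Linux kernel described here), which in practice fills in the ELF header. It then calls search_binary_handler(), which searches the list of supported executable formats mentioned above and lets each format handler try to claim and process the file. When a format matches, the handler that load_binary points to is called to process the target image.
Before load_elf_binary runs, the kernel already has the ELF header information from those first 128 bytes: the file type, target machine, version, the number and size of the program headers and section headers, the offsets of both tables, and so on. Next, kernel_read reads the whole program header table, and the name of the program "interpreter" is read from it into a buffer (at this point the interpreter is just a string, its path name, for example "/lib/ld-linux.so.2"). The loader then walks the program headers and loads the segments of type PT_LOAD, since only PT_LOAD segments are loadable. For each one, the load address is determined from p_vaddr and the segment size, and elf_map() establishes the mapping between a range of user-space virtual addresses and a contiguous region of the image file. The return value of elf_map() is the actual start address of the mapping.
As for the user-space entry address: if an interpreter is involved, the entry address is set to the return value of load_elf_interp(), i.e. the entry point of the interpreter image. If no interpreter is loaded, the entry address is the entry point of the target image itself. From then on, control belongs to the interpreter. The interpreter loads the shared libraries: every library the program depends on is loaded into memory and linked into a list, and the symbol resolution that follows mainly searches this list for symbol definitions.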
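3. Dynamic symbol resolution in ELF files. After helloworld.c is compiled to assembly, the call to printf() becomes: call 80482bc <puts@plt> (the compiler has replaced this printf call with puts). The label puts@plt is then looked up in the symbol list built while loading the shared libraries, which yields the absolute address of the function; the call goes to that address and the program runs.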
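The "each handler tries to claim the file" idea from step 1 can be mimicked in user space. This is only an analogy, not kernel code: `struct fmt_handler`, `claims_script`, `claims_elf`, and `find_handler` are hypothetical names, and the real kernel handlers do far more than inspect a few leading bytes.

```c
#include <stddef.h>
#include <string.h>

/* User-space stand-in for a registered format handler. */
struct fmt_handler {
    const char *name;
    int (*claims)(const unsigned char *buf, size_t n);
};

/* A script starts with "#!". */
static int claims_script(const unsigned char *buf, size_t n)
{
    return n >= 2 && buf[0] == '#' && buf[1] == '!';
}

/* An ELF file starts with the magic bytes 7f 45 4c 46. */
static int claims_elf(const unsigned char *buf, size_t n)
{
    return n >= 4 && memcmp(buf, "\177ELF", 4) == 0;
}

static const struct fmt_handler handlers[] = {
    { "script", claims_script },
    { "elf",    claims_elf    },
};

/* Walk the handler list and return the name of the first handler
 * that claims the buffer, or NULL when none does. */
const char *find_handler(const unsigned char *buf, size_t n)
{
    for (size_t i = 0; i < sizeof handlers / sizeof handlers[0]; i++)
        if (handlers[i].claims(buf, n))
            return handlers[i].name;
    return NULL;
}
```

The design mirrors the description above: the caller does not need to know any format details, it only offers the first bytes of the file to each registered handler in turn.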
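Basic program build process
https://www.dazhuanlan.com/2019/10/17/5da829a6cab2e/
Going from source code to an executable takes several steps: preprocessing, compilation, assembly, and linking (this applies mainly to C programs). The compiler translates the source code into an intermediate language, here assembly; the assembler turns that into object files; finally the linker processes the object files to produce the executable.
ELF file format
The files produced by the compiler toolchain are called object files. The common formats today are PE on Windows and ELF on Linux.
ELF file types
ELF object files fall into three classes:
- Relocatable file: can be used as linker input to produce an executable or a shared object.
- Executable file
- Shared object file: can be used by an executable or by other shared objects.
Role of ELF files
As the classification shows, ELF files take part in both linking and execution. During linking, an ELF file is viewed as a collection of sections, all described by the section header table. During execution, it is viewed as a collection of segments, described by the program header table.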
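ELF file structure
An ELF file consists of four main parts:
- ELF header: records the properties of the file, such as its type and the positions of the section header table and program header table within the file.
- Program header table: used when creating a process. It describes the file position and size of runs of consecutive sections, and the position and size they take once loaded into memory. Object files used to build a process image must have a program header table; relocatable object files do not need one.
- Sections: a compiled object file has at least three sections:
  - .text: the code section, holding the program's executable code;
  - .data: the data section, holding the program's initialized global variables;
  - .bss: the bss section, holding the program's uninitialized global variables;
- Section header table: records each section's name, size, and position in the file. Files used for linking must have one; other object files may omit it.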
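Linker scripts
The linking process
Linking does two main things: symbol resolution and relocation.
- Symbol resolution: finding where the symbols left undefined in the current file are defined in other object files or shared objects.
- Relocation: determining the run-time addresses of functions and data once loaded into memory, and patching references to them.
What a linker script does
The linking process is controlled by a linker script, which essentially does two things:
- specifies how sections of the input files map to the output file;
- controls the memory layout of the output file.
By default the linker uses a built-in script; the -T option selects your own. The linker's input is normally the .o object files produced by the compiler or assembler, and on Linux its output is an ELF object file or executable.
Linker script basics
The ENTRY keyword
ENTRY makes the address of a symbol the entry address of the executable. It is used like this: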
ENTRY(symbol_name)
The SECTIONS keyword
The SECTIONS command tells the linker how to map sections of the input files to the output file. Here is a simple example:
SECTIONS
{
. = 0x10000;
.text : { *(.text) }
}
The first line inside the braces sets the location counter to a given address, so that the content that follows is placed starting there. The second line merges the .text code sections of all input files into the .text section of the output file.
Official documentation
- https://www.man7.org/linux/man-pages/man5/elf.5.html
ELF(5) Linux Programmer's Manual ELF(5)
NAME
elf - format of Executable and Linking Format (ELF) files
SYNOPSIS
#include <elf.h>
DESCRIPTION
The header file <elf.h> defines the format of ELF executable binary files. Amongst these files are normal executable files, relocatable object files, core files, and shared objects.
An executable file using the ELF file format consists of an ELF
header, followed by a program header table or a section header table,
or both. The ELF header is always at offset zero of the file. The
program header table and the section header table's offset in the
file are defined in the ELF header. The two tables describe the rest
of the particularities of the file.
This header file describes the above mentioned headers as C
structures and also includes structures for dynamic sections,
relocation sections and symbol tables.
Basic types
The following types are used for N-bit architectures (N=32,64, ElfN stands for Elf32 or Elf64, uintN_t stands for uint32_t or uint64_t):
ElfN_Addr Unsigned program address, uintN_t
ElfN_Off Unsigned file offset, uintN_t
ElfN_Section Unsigned section index, uint16_t
ElfN_Versym Unsigned version symbol information, uint16_t
Elf_Byte unsigned char
ElfN_Half uint16_t
ElfN_Sword int32_t
ElfN_Word uint32_t
ElfN_Sxword int64_t
ElfN_Xword uint64_t
(Note: the *BSD terminology is a bit different. There, Elf64_Half is
twice as large as Elf32_Half, and Elf64Quarter is used for uint16_t.
In order to avoid confusion these types are replaced by explicit ones
in the below.)
All data structures that the file format defines follow the "natural"
size and alignment guidelines for the relevant class. If necessary,
data structures contain explicit padding to ensure 4-byte alignment
for 4-byte objects, to force structure sizes to a multiple of 4, and
so on.
ELF header (Ehdr)
The ELF header is described by the type Elf32_Ehdr or Elf64_Ehdr:
#define EI_NIDENT 16
typedef struct {
unsigned char e_ident[EI_NIDENT];
uint16_t e_type;
uint16_t e_machine;
uint32_t e_version;
ElfN_Addr e_entry;
ElfN_Off e_phoff;
ElfN_Off e_shoff;
uint32_t e_flags;
uint16_t e_ehsize;
uint16_t e_phentsize;
uint16_t e_phnum;
uint16_t e_shentsize;
uint16_t e_shnum;
uint16_t e_shstrndx;
} ElfN_Ehdr;
The fields have the following meanings:
e_ident
This array of bytes specifies how to interpret the file,
independent of the processor or the file's remaining contents.
Within this array everything is named by macros, which start
with the prefix EI_ and may contain values which start with
the prefix ELF. The following macros are defined:
EI_MAG0
The first byte of the magic number. It must be filled
with ELFMAG0. (0: 0x7f)
EI_MAG1
The second byte of the magic number. It must be filled
with ELFMAG1. (1: 'E')
EI_MAG2
The third byte of the magic number. It must be filled
with ELFMAG2. (2: 'L')
EI_MAG3
The fourth byte of the magic number. It must be filled
with ELFMAG3. (3: 'F')
EI_CLASS
The fifth byte identifies the architecture for this
binary:
ELFCLASSNONE This class is invalid.
ELFCLASS32 This defines the 32-bit architecture. It
supports machines with files and virtual
address spaces up to 4 Gigabytes.
ELFCLASS64 This defines the 64-bit architecture.
EI_DATA
The sixth byte specifies the data encoding of the
processor-specific data in the file. Currently, these
encodings are supported:
ELFDATANONE Unknown data format.
ELFDATA2LSB Two's complement, little-endian.
ELFDATA2MSB Two's complement, big-endian.
EI_VERSION
The seventh byte is the version number of the ELF
specification:
EV_NONE Invalid version.
EV_CURRENT Current version.
EI_OSABI
The eighth byte identifies the operating system and ABI
to which the object is targeted. Some fields in other
ELF structures have flags and values that have
platform-specific meanings; the interpretation of those
fields is determined by the value of this byte. For
example:
ELFOSABI_NONE Same as ELFOSABI_SYSV
ELFOSABI_SYSV UNIX System V ABI
ELFOSABI_HPUX HP-UX ABI
ELFOSABI_NETBSD NetBSD ABI
ELFOSABI_LINUX Linux ABI
ELFOSABI_SOLARIS Solaris ABI
ELFOSABI_IRIX IRIX ABI
ELFOSABI_FREEBSD FreeBSD ABI
ELFOSABI_TRU64 TRU64 UNIX ABI
ELFOSABI_ARM ARM architecture ABI
ELFOSABI_STANDALONE Stand-alone (embedded) ABI
EI_ABIVERSION
The ninth byte identifies the version of the ABI to
which the object is targeted. This field is used to
distinguish among incompatible versions of an ABI. The
interpretation of this version number is dependent on
the ABI identified by the EI_OSABI field. Applications
conforming to this specification use the value 0.
EI_PAD Start of padding. These bytes are reserved and set to
zero. Programs which read them should ignore them.
The value for EI_PAD will change in the future if
currently unused bytes are given meanings.
EI_NIDENT
The size of the e_ident array.
e_type This member of the structure identifies the object file type:
ET_NONE An unknown type.
ET_REL A relocatable file.
ET_EXEC An executable file.
ET_DYN A shared object.
ET_CORE A core file.
e_machine
This member specifies the required architecture for an
individual file. For example:
EM_NONE An unknown machine
EM_M32 AT&T WE 32100
EM_SPARC Sun Microsystems SPARC
EM_386 Intel 80386
EM_68K Motorola 68000
EM_88K Motorola 88000
EM_860 Intel 80860
EM_MIPS MIPS RS3000 (big-endian only)
EM_PARISC HP/PA
EM_SPARC32PLUS SPARC with enhanced instruction set
EM_PPC PowerPC
EM_PPC64 PowerPC 64-bit
EM_S390 IBM S/390
EM_ARM Advanced RISC Machines
EM_SH Renesas SuperH
EM_SPARCV9 SPARC v9 64-bit
EM_IA_64 Intel Itanium
EM_X86_64 AMD x86-64
EM_VAX DEC Vax
e_version
This member identifies the file version:
EV_NONE Invalid version
EV_CURRENT Current version
e_entry
This member gives the virtual address to which the system
first transfers control, thus starting the process. If the
file has no associated entry point, this member holds zero.
e_phoff
This member holds the program header table's file offset in
bytes. If the file has no program header table, this member
holds zero.
e_shoff
This member holds the section header table's file offset in
bytes. If the file has no section header table, this member
holds zero.
e_flags
This member holds processor-specific flags associated with the
file. Flag names take the form EF_`machine_flag'. Currently,
no flags have been defined.
e_ehsize
This member holds the ELF header's size in bytes.
e_phentsize
This member holds the size in bytes of one entry in the file's
program header table; all entries are the same size.
e_phnum
This member holds the number of entries in the program header
table. Thus the product of e_phentsize and e_phnum gives the
table's size in bytes. If a file has no program header,
e_phnum holds the value zero.
If the number of entries in the program header table is larger
than or equal to PN_XNUM (0xffff), this member holds PN_XNUM
(0xffff) and the real number of entries in the program header
table is held in the sh_info member of the initial entry in
section header table. Otherwise, the sh_info member of the
initial entry contains the value zero.
PN_XNUM
This is defined as 0xffff, the largest number e_phnum
can have, specifying where the actual number of program
headers is assigned.
e_shentsize
This member holds a section header's size in bytes. A section
header is one entry in the section header table; all
entries are the same size.
e_shnum
This member holds the number of entries in the section header
table. Thus the product of e_shentsize and e_shnum gives the
section header table's size in bytes. If a file has no section
header table, e_shnum holds the value of zero.
If the number of entries in the section header table is larger
than or equal to SHN_LORESERVE (0xff00), e_shnum holds the
value zero and the real number of entries in the section
header table is held in the sh_size member of the initial
entry in section header table. Otherwise, the sh_size member
of the initial entry in the section header table holds the
value zero.
e_shstrndx
This member holds the section header table index of the entry
associated with the section name string table. If the file
has no section name string table, this member holds the value
SHN_UNDEF.
If the index of section name string table section is larger
than or equal to SHN_LORESERVE (0xff00), this member holds
SHN_XINDEX (0xffff) and the real index of the section name
string table section is held in the sh_link member of the initial
entry in section header table. Otherwise, the sh_link
member of the initial entry in section header table contains
the value zero.
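The escape rules for e_phnum, e_shnum and e_shstrndx described above can be sketched in C as follows. This is a minimal sketch, assuming a 64-bit file and the glibc `<elf.h>` definitions; the helper names `real_shnum`, `real_phnum` and `real_shstrndx` are hypothetical, and `shdr0` stands for the initial entry of the section header table, which the caller is assumed to have read already.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Real number of section headers. When e_shnum is zero but a section
 * header table exists, the count lives in sh_size of the initial entry. */
static uint64_t real_shnum(const Elf64_Ehdr *eh, const Elf64_Shdr *shdr0)
{
    if (eh->e_shnum != 0 || eh->e_shoff == 0 || shdr0 == NULL)
        return eh->e_shnum;
    return shdr0->sh_size;
}

/* Real number of program headers. PN_XNUM (0xffff) is the escape value;
 * the count then lives in sh_info of the initial entry. */
static uint32_t real_phnum(const Elf64_Ehdr *eh, const Elf64_Shdr *shdr0)
{
    if (eh->e_phnum != PN_XNUM || shdr0 == NULL)
        return eh->e_phnum;
    return shdr0->sh_info;
}

/* Real index of the section name string table. SHN_XINDEX (0xffff) is
 * the escape value; the index then lives in sh_link of the initial entry. */
static uint32_t real_shstrndx(const Elf64_Ehdr *eh, const Elf64_Shdr *shdr0)
{
    if (eh->e_shstrndx != SHN_XINDEX || shdr0 == NULL)
        return eh->e_shstrndx;
    return shdr0->sh_link;
}
```

A reader would typically call these right after loading the ELF header and the first section header, before sizing any tables.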
Program header (Phdr)
An executable or shared object file's program header table is an array of structures, each describing a segment or other information the system needs to prepare the program for execution. An object file segment contains one or more sections. Program headers are meaningful only for executable and shared object files. A file specifies its own program header size with the ELF header's e_phentsize and e_phnum members. The ELF program header is described by the type Elf32_Phdr or Elf64_Phdr depending on the architecture:
typedef struct {
uint32_t p_type;
Elf32_Off p_offset;
Elf32_Addr p_vaddr;
Elf32_Addr p_paddr;
uint32_t p_filesz;
uint32_t p_memsz;
uint32_t p_flags;
uint32_t p_align;
} Elf32_Phdr;
typedef struct {
uint32_t p_type;
uint32_t p_flags;
Elf64_Off p_offset;
Elf64_Addr p_vaddr;
Elf64_Addr p_paddr;
uint64_t p_filesz;
uint64_t p_memsz;
uint64_t p_align;
} Elf64_Phdr;
The main difference between the 32-bit and the 64-bit program header
lies in the location of the p_flags member in the total struct.
p_type This member of the structure indicates what kind of segment
this array element describes or how to interpret the array
element's information.
PT_NULL
The array element is unused and the other members'
values are undefined. This lets the program header
have ignored entries.
PT_LOAD
The array element specifies a loadable segment,
described by p_filesz and p_memsz. The bytes from
the file are mapped to the beginning of the memory
segment. If the segment's memory size p_memsz is
larger than the file size p_filesz, the "extra"
bytes are defined to hold the value 0 and to follow
the segment's initialized area. The file size may
not be larger than the memory size. Loadable segment
entries in the program header table appear in
ascending order, sorted on the p_vaddr member.
PT_DYNAMIC
The array element specifies dynamic linking information.
PT_INTERP
The array element specifies the location and size of
a null-terminated pathname to invoke as an interpreter.
This segment type is meaningful only for
executable files (though it may occur for shared
objects). However it may not occur more than once
in a file. If it is present, it must precede any
loadable segment entry.
PT_NOTE
The array element specifies the location of notes
(ElfN_Nhdr).
PT_SHLIB
This segment type is reserved but has unspecified
semantics. Programs that contain an array element
of this type do not conform to the ABI.
PT_PHDR
The array element, if present, specifies the location
and size of the program header table itself,
both in the file and in the memory image of the program.
This segment type may not occur more than
once in a file. Moreover, it may occur only if the
program header table is part of the memory image of
the program. If it is present, it must precede any
loadable segment entry.
PT_LOPROC, PT_HIPROC
Values in the inclusive range [PT_LOPROC, PT_HIPROC]
are reserved for processor-specific semantics.
PT_GNU_STACK
GNU extension which is used by the Linux kernel to
control the state of the stack via the flags set in
the p_flags member.
p_offset
This member holds the offset from the beginning of the file at
which the first byte of the segment resides.
p_vaddr
This member holds the virtual address at which the first byte
of the segment resides in memory.
p_paddr
On systems for which physical addressing is relevant, this
member is reserved for the segment's physical address. Under
BSD this member is not used and must be zero.
p_filesz
This member holds the number of bytes in the file image of the
segment. It may be zero.
p_memsz
This member holds the number of bytes in the memory image of
the segment. It may be zero.
p_flags
This member holds a bit mask of flags relevant to the segment:
PF_X An executable segment.
PF_W A writable segment.
PF_R A readable segment.
A text segment commonly has the flags PF_X and PF_R. A data
segment commonly has PF_W and PF_R.
p_align
This member holds the value to which the segments are aligned
in memory and in the file. Loadable process segments must
have congruent values for p_vaddr and p_offset, modulo the
page size. Values of zero and one mean no alignment is
required. Otherwise, p_align should be a positive, integral
power of two, and p_vaddr should equal p_offset, modulo
p_align.
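The PT_LOAD and p_align rules above lend themselves to a few small checks. The following is a minimal sketch, assuming a 64-bit file and the glibc `<elf.h>` definitions; the helper names `zero_fill_size`, `load_segment_ok` and `flags_to_str` are hypothetical and not part of any library.

```c
#include <elf.h>
#include <stdbool.h>
#include <stdint.h>

/* Number of bytes after the file-backed part of a PT_LOAD segment that
 * are defined to hold zero (p_memsz minus p_filesz). */
static uint64_t zero_fill_size(const Elf64_Phdr *ph)
{
    return ph->p_memsz - ph->p_filesz;
}

/* Check the constraints stated for loadable segments. */
static bool load_segment_ok(const Elf64_Phdr *ph)
{
    if (ph->p_type != PT_LOAD)
        return false;
    if (ph->p_filesz > ph->p_memsz)          /* file size may not exceed memory size */
        return false;
    if (ph->p_align > 1) {                   /* 0 and 1 mean no alignment required */
        if (ph->p_align & (ph->p_align - 1)) /* must be a power of two */
            return false;
        if (ph->p_vaddr % ph->p_align != ph->p_offset % ph->p_align)
            return false;                    /* p_vaddr == p_offset modulo p_align */
    }
    return true;
}

/* Render p_flags in "rwx" form; out must hold at least 4 bytes. */
static void flags_to_str(uint32_t p_flags, char out[4])
{
    out[0] = (p_flags & PF_R) ? 'r' : '-';
    out[1] = (p_flags & PF_W) ? 'w' : '-';
    out[2] = (p_flags & PF_X) ? 'x' : '-';
    out[3] = '\0';
}
```

A loader would map p_filesz bytes from p_offset and then clear the `zero_fill_size` bytes that follow, which is how a .bss area usually comes into existence.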
Section header (Shdr)
A file's section header table lets one locate all the file's sections. The section header table is an array of Elf32_Shdr or Elf64_Shdr structures. The ELF header's e_shoff member gives the byte offset from the beginning of the file to the section header table. e_shnum holds the number of entries the section header table contains. e_shentsize holds the size in bytes of each entry.
A section header table index is a subscript into this array. Some
section header table indices are reserved: the initial entry and the
indices between SHN_LORESERVE and SHN_HIRESERVE. The initial entry
is used in ELF extensions for e_phnum, e_shnum and e_shstrndx; in
other cases, each field in the initial entry is set to zero. An
object file does not have sections for these special indices:
SHN_UNDEF
This value marks an undefined, missing, irrelevant, or otherwise
meaningless section reference.
SHN_LORESERVE
This value specifies the lower bound of the range of reserved
indices.
SHN_LOPROC, SHN_HIPROC
Values in the inclusive range [SHN_LOPROC, SHN_HIPROC]
are reserved for processor-specific semantics.
SHN_ABS
This value specifies the absolute value for the corresponding
reference. For example, a symbol defined relative to section
number SHN_ABS has an absolute value and is not affected by
relocation.
SHN_COMMON
Symbols defined relative to this section are common symbols,
such as FORTRAN COMMON or unallocated C external variables.
SHN_HIRESERVE
This value specifies the upper bound of the range of reserved
indices. The system reserves indices between SHN_LORESERVE
and SHN_HIRESERVE, inclusive. The section header table does
not contain entries for the reserved indices.
The section header has the following structure:
typedef struct {
uint32_t sh_name;
uint32_t sh_type;
uint32_t sh_flags;
Elf32_Addr sh_addr;
Elf32_Off sh_offset;
uint32_t sh_size;
uint32_t sh_link;
uint32_t sh_info;
uint32_t sh_addralign;
uint32_t sh_entsize;
} Elf32_Shdr;
typedef struct {
uint32_t sh_name;
uint32_t sh_type;
uint64_t sh_flags;
Elf64_Addr sh_addr;
Elf64_Off sh_offset;
uint64_t sh_size;
uint32_t sh_link;
uint32_t sh_info;
uint64_t sh_addralign;
uint64_t sh_entsize;
} Elf64_Shdr;
No real differences exist between the 32-bit and 64-bit section head‐
ers.
sh_name
This member specifies the name of the section. Its value is
an index into the section header string table section, giving
the location of a null-terminated string.
sh_type
This member categorizes the section's contents and semantics.
SHT_NULL
This value marks the section header as inactive. It
does not have an associated section. Other members of
the section header have undefined values.
SHT_PROGBITS
This section holds information defined by the program,
whose format and meaning are determined solely by the
program.
SHT_SYMTAB
This section holds a symbol table. Typically,
SHT_SYMTAB provides symbols for link editing, though it
may also be used for dynamic linking. As a complete
symbol table, it may contain many symbols unnecessary
for dynamic linking. An object file can also contain a
SHT_DYNSYM section.
SHT_STRTAB
This section holds a string table. An object file may
have multiple string table sections.
SHT_RELA
This section holds relocation entries with explicit
addends, such as type Elf32_Rela for the 32-bit class
of object files. An object may have multiple relocation
sections.
SHT_HASH
This section holds a symbol hash table. An object participating
in dynamic linking must contain a symbol
hash table. An object file may have only one hash table.
SHT_DYNAMIC
This section holds information for dynamic linking. An
object file may have only one dynamic section.
SHT_NOTE
This section holds notes (ElfN_Nhdr).
SHT_NOBITS
A section of this type occupies no space in the file
but otherwise resembles SHT_PROGBITS. Although this
section contains no bytes, the sh_offset member contains
the conceptual file offset.
SHT_REL
This section holds relocation offsets without explicit
addends, such as type Elf32_Rel for the 32-bit class of
object files. An object file may have multiple relocation
sections.
SHT_SHLIB
This section is reserved but has unspecified semantics.
SHT_DYNSYM
This section holds a minimal set of dynamic linking
symbols. An object file can also contain a SHT_SYMTAB
section.
SHT_LOPROC, SHT_HIPROC
Values in the inclusive range [SHT_LOPROC, SHT_HIPROC]
are reserved for processor-specific semantics.
SHT_LOUSER
This value specifies the lower bound of the range of
indices reserved for application programs.
SHT_HIUSER
This value specifies the upper bound of the range of
indices reserved for application programs. Section
types between SHT_LOUSER and SHT_HIUSER may be used by
the application, without conflicting with current or
future system-defined section types.
sh_flags
Sections support one-bit flags that describe miscellaneous
attributes. If a flag bit is set in sh_flags, the attribute
is "on" for the section. Otherwise, the attribute is "off" or
does not apply. Undefined attributes are set to zero.
SHF_WRITE
This section contains data that should be writable during
process execution.
SHF_ALLOC
This section occupies memory during process execution.
Some control sections do not reside in the memory image
of an object file. This attribute is off for those
sections.
SHF_EXECINSTR
This section contains executable machine instructions.
SHF_MASKPROC
All bits included in this mask are reserved for processor-specific
semantics.
sh_addr
If this section appears in the memory image of a process, this
member holds the address at which the section's first byte
should reside. Otherwise, the member contains zero.
sh_offset
This member's value holds the byte offset from the beginning
of the file to the first byte in the section. One section
type, SHT_NOBITS, occupies no space in the file, and its
sh_offset member locates the conceptual placement in the file.
sh_size
This member holds the section's size in bytes. Unless the
section type is SHT_NOBITS, the section occupies sh_size bytes
in the file. A section of type SHT_NOBITS may have a nonzero
size, but it occupies no space in the file.
sh_link
This member holds a section header table index link, whose
interpretation depends on the section type.
sh_info
This member holds extra information, whose interpretation
depends on the section type.
sh_addralign
Some sections have address alignment constraints. If a section
holds a doubleword, the system must ensure doubleword
alignment for the entire section. That is, the value of
sh_addr must be congruent to zero, modulo the value of
sh_addralign. Only zero and positive integral powers of two
are allowed. The value 0 or 1 means that the section has no
alignment constraints.
sh_entsize
Some sections hold a table of fixed-sized entries, such as a
symbol table. For such a section, this member gives the size
in bytes for each entry. This member contains zero if the
section does not hold a table of fixed-size entries.
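Two of the rules above, the SHT_NOBITS size rule and the sh_addralign constraint, can be expressed as small predicates. This is a minimal sketch, assuming a 64-bit file and the glibc `<elf.h>` definitions; `section_file_size` and `section_addr_aligned` are hypothetical helper names.

```c
#include <elf.h>
#include <stdint.h>

/* Bytes the section occupies in the file. A SHT_NOBITS section (such as
 * .bss) may have a nonzero sh_size but occupies no space in the file. */
static uint64_t section_file_size(const Elf64_Shdr *sh)
{
    return sh->sh_type == SHT_NOBITS ? 0 : sh->sh_size;
}

/* sh_addr must be congruent to zero modulo sh_addralign; the values
 * 0 and 1 mean the section has no alignment constraint. */
static int section_addr_aligned(const Elf64_Shdr *sh)
{
    if (sh->sh_addralign <= 1)
        return 1;
    return sh->sh_addr % sh->sh_addralign == 0;
}
```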
Various sections hold program and control information:
.bss This section holds uninitialized data that contributes to the
program's memory image. By definition, the system initializes
the data with zeros when the program begins to run. This section
is of type SHT_NOBITS. The attribute types are SHF_ALLOC
and SHF_WRITE.
.comment
This section holds version control information. This section
is of type SHT_PROGBITS. No attribute types are used.
.ctors This section holds initialized pointers to the C++ constructor
functions. This section is of type SHT_PROGBITS. The
attribute types are SHF_ALLOC and SHF_WRITE.
.data This section holds initialized data that contribute to the
program's memory image. This section is of type SHT_PROGBITS.
The attribute types are SHF_ALLOC and SHF_WRITE.
.data1 This section holds initialized data that contribute to the
program's memory image. This section is of type SHT_PROGBITS.
The attribute types are SHF_ALLOC and SHF_WRITE.
.debug This section holds information for symbolic debugging. The
contents are unspecified. This section is of type SHT_PROGBITS.
No attribute types are used.
.dtors This section holds initialized pointers to the C++ destructor
functions. This section is of type SHT_PROGBITS. The
attribute types are SHF_ALLOC and SHF_WRITE.
.dynamic
This section holds dynamic linking information. The section's
attributes will include the SHF_ALLOC bit. Whether the
SHF_WRITE bit is set is processor-specific. This section is
of type SHT_DYNAMIC. See the attributes above.
.dynstr
This section holds strings needed for dynamic linking, most
commonly the strings that represent the names associated with
symbol table entries. This section is of type SHT_STRTAB.
The attribute type used is SHF_ALLOC.
.dynsym
This section holds the dynamic linking symbol table. This
section is of type SHT_DYNSYM. The attribute used is
SHF_ALLOC.
.fini This section holds executable instructions that contribute to
the process termination code. When a program exits normally
the system arranges to execute the code in this section. This
section is of type SHT_PROGBITS. The attributes used are
SHF_ALLOC and SHF_EXECINSTR.
.gnu.version
This section holds the version symbol table, an array of
ElfN_Half elements. This section is of type SHT_GNU_versym.
The attribute type used is SHF_ALLOC.
.gnu.version_d
This section holds the version symbol definitions, a table of
ElfN_Verdef structures. This section is of type
SHT_GNU_verdef. The attribute type used is SHF_ALLOC.
.gnu.version_r
This section holds the version symbol needed elements, a table
of ElfN_Verneed structures. This section is of type
SHT_GNU_verneed. The attribute type used is SHF_ALLOC.
.got This section holds the global offset table. This section is
of type SHT_PROGBITS. The attributes are processor-specific.
.hash This section holds a symbol hash table. This section is of
type SHT_HASH. The attribute used is SHF_ALLOC.
.init This section holds executable instructions that contribute to
the process initialization code. When a program starts to run
the system arranges to execute the code in this section before
calling the main program entry point. This section is of type
SHT_PROGBITS. The attributes used are SHF_ALLOC and
SHF_EXECINSTR.
.interp
This section holds the pathname of a program interpreter. If
the file has a loadable segment that includes the section, the
section's attributes will include the SHF_ALLOC bit. Otherwise,
that bit will be off. This section is of type SHT_PROGBITS.
.line This section holds line number information for symbolic debugging,
which describes the correspondence between the program
source and the machine code. The contents are unspecified.
This section is of type SHT_PROGBITS. No attribute types are
used.
.note This section holds various notes. This section is of type
SHT_NOTE. No attribute types are used.
.note.ABI-tag
This section is used to declare the expected run-time ABI of
the ELF image. It may include the operating system name and
its run-time versions. This section is of type SHT_NOTE. The
only attribute used is SHF_ALLOC.
.note.gnu.build-id
This section is used to hold an ID that uniquely identifies
the contents of the ELF image. Different files with the same
build ID should contain the same executable content. See the
--build-id option to the GNU linker (ld (1)) for more details.
This section is of type SHT_NOTE. The only attribute used is
SHF_ALLOC.
.note.GNU-stack
This section is used in Linux object files for declaring stack
attributes. This section is of type SHT_PROGBITS. The only
attribute used is SHF_EXECINSTR. This indicates to the GNU
linker that the object file requires an executable stack.
.note.openbsd.ident
OpenBSD native executables usually contain this section to
identify themselves so the kernel can bypass any compatibility
ELF binary emulation tests when loading the file.
.plt This section holds the procedure linkage table. This section
is of type SHT_PROGBITS. The attributes are processor-specific.
.relNAME
This section holds relocation information as described below.
If the file has a loadable segment that includes relocation,
the section's attributes will include the SHF_ALLOC bit. Otherwise,
the bit will be off. By convention, "NAME" is supplied
by the section to which the relocations apply. Thus a
relocation section for .text normally would have the name
.rel.text. This section is of type SHT_REL.
.relaNAME
This section holds relocation information as described below.
If the file has a loadable segment that includes relocation,
the section's attributes will include the SHF_ALLOC bit. Otherwise,
the bit will be off. By convention, "NAME" is supplied
by the section to which the relocations apply. Thus a
relocation section for .text normally would have the name
.rela.text. This section is of type SHT_RELA.
.rodata
This section holds read-only data that typically contributes
to a nonwritable segment in the process image. This section
is of type SHT_PROGBITS. The attribute used is SHF_ALLOC.
.rodata1
This section holds read-only data that typically contributes
to a nonwritable segment in the process image. This section
is of type SHT_PROGBITS. The attribute used is SHF_ALLOC.
.shstrtab
This section holds section names. This section is of type
SHT_STRTAB. No attribute types are used.
.strtab
This section holds strings, most commonly the strings that
represent the names associated with symbol table entries. If
the file has a loadable segment that includes the symbol
string table, the section's attributes will include the
SHF_ALLOC bit. Otherwise, the bit will be off. This section
is of type SHT_STRTAB.
.symtab
This section holds a symbol table. If the file has a loadable
segment that includes the symbol table, the section's
attributes will include the SHF_ALLOC bit. Otherwise, the bit
will be off. This section is of type SHT_SYMTAB.
.text This section holds the "text", or executable instructions, of
a program. This section is of type SHT_PROGBITS. The
attributes used are SHF_ALLOC and SHF_EXECINSTR.
String and symbol tables
String table sections hold null-terminated character sequences, commonly called strings. The object file uses these strings to represent symbol and section names. One references a string as an index into the string table section. The first byte, which is index zero, is defined to hold a null byte ('\0'). Similarly, a string table's last byte is defined to hold a null byte, ensuring null termination for all strings.
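A string table lookup can be sketched as below. This is a minimal sketch, assuming the table contents are already in memory; `strtab_get` is a hypothetical helper, and the same lookup would serve both sh_name (against .shstrtab) and st_name (against .strtab or .dynstr). It relies on the first-byte and last-byte rules just described to reject malformed tables.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the string at `index` in a string table of `size` bytes,
 * or NULL if the table is malformed or the index is out of range. */
static const char *strtab_get(const char *tab, size_t size, uint32_t index)
{
    /* The first and the last byte are both defined to be null bytes. */
    if (size == 0 || tab[0] != '\0' || tab[size - 1] != '\0')
        return NULL;
    if (index >= size)
        return NULL;
    /* The trailing null byte guarantees termination inside the table. */
    return tab + index;
}
```

Note that an index may point into the middle of a longer string; linkers sometimes share a suffix between two names this way.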
An object file's symbol table holds information needed to locate and
relocate a program's symbolic definitions and references. A symbol
table index is a subscript into this array.
typedef struct {
uint32_t st_name;
Elf32_Addr st_value;
uint32_t st_size;
unsigned char st_info;
unsigned char st_other;
uint16_t st_shndx;
} Elf32_Sym;
typedef struct {
uint32_t st_name;
unsigned char st_info;
unsigned char st_other;
uint16_t st_shndx;
Elf64_Addr st_value;
uint64_t st_size;
} Elf64_Sym;
The 32-bit and 64-bit versions have the same members, just in a different
order.
st_name
This member holds an index into the object file's symbol
string table, which holds character representations of the
symbol names. If the value is nonzero, it represents a string
table index that gives the symbol name. Otherwise, the symbol
has no name.
st_value
This member gives the value of the associated symbol.
st_size
Many symbols have associated sizes. This member holds zero if
the symbol has no size or an unknown size.
st_info
This member specifies the symbol's type and binding
attributes:
STT_NOTYPE
The symbol's type is not defined.
STT_OBJECT
The symbol is associated with a data object.
STT_FUNC
The symbol is associated with a function or other executable
code.
STT_SECTION
The symbol is associated with a section. Symbol table
entries of this type exist primarily for relocation and
normally have STB_LOCAL bindings.
STT_FILE
By convention, the symbol's name gives the name of the
source file associated with the object file. A file
symbol has STB_LOCAL bindings, its section index is
SHN_ABS, and it precedes the other STB_LOCAL symbols of
the file, if it is present.
STT_LOPROC, STT_HIPROC
Values in the inclusive range [STT_LOPROC, STT_HIPROC]
are reserved for processor-specific semantics.
STB_LOCAL
Local symbols are not visible outside the object file
containing their definition. Local symbols of the same
name may exist in multiple files without interfering
with each other.
STB_GLOBAL
Global symbols are visible to all object files being
combined. One file's definition of a global symbol
will satisfy another file's undefined reference to the
same symbol.
STB_WEAK
Weak symbols resemble global symbols, but their definitions
have lower precedence.
STB_LOPROC, STB_HIPROC
Values in the inclusive range [STB_LOPROC, STB_HIPROC]
are reserved for processor-specific semantics.
There are macros for packing and unpacking the binding and
type fields:
ELF32_ST_BIND(info), ELF64_ST_BIND(info)
Extract a binding from an st_info value.
ELF32_ST_TYPE(info), ELF64_ST_TYPE(info)
Extract a type from an st_info value.
ELF32_ST_INFO(bind, type), ELF64_ST_INFO(bind, type)
Convert a binding and a type into an st_info value.
st_other
This member defines the symbol visibility.
STV_DEFAULT
Default symbol visibility rules. Global and weak symbols
are available to other modules; references in the
local module can be interposed by definitions in other
modules.
STV_INTERNAL
Processor-specific hidden class.
STV_HIDDEN
Symbol is unavailable to other modules; references in
the local module always resolve to the local symbol
(i.e., the symbol can't be interposed by definitions in
other modules).
STV_PROTECTED
Symbol is available to other modules, but references in
the local module always resolve to the local symbol.
There are macros for extracting the visibility type:
ELF32_ST_VISIBILITY(other) or ELF64_ST_VISIBILITY(other)
st_shndx
Every symbol table entry is "defined" in relation to some section.
This member holds the relevant section header table
index.
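The st_info and st_other macros described above can be combined into a simple classifier. This is a minimal sketch, assuming a 64-bit symbol table and the glibc `<elf.h>` macros; `is_exported_function` is a hypothetical helper, and its notion of "exported" (defined, global or weak, default or protected visibility) is one reasonable reading rather than an official definition.

```c
#include <elf.h>
#include <stdbool.h>

/* True if the symbol is a defined function that other modules can see. */
static bool is_exported_function(const Elf64_Sym *sym)
{
    unsigned bind = ELF64_ST_BIND(sym->st_info);        /* upper 4 bits */
    unsigned type = ELF64_ST_TYPE(sym->st_info);        /* lower 4 bits */
    unsigned vis  = ELF64_ST_VISIBILITY(sym->st_other); /* lower 2 bits */

    return type == STT_FUNC
        && (bind == STB_GLOBAL || bind == STB_WEAK)
        && (vis == STV_DEFAULT || vis == STV_PROTECTED)
        && sym->st_shndx != SHN_UNDEF;                  /* defined here */
}
```

For example, `ELF64_ST_INFO(STB_GLOBAL, STT_FUNC)` packs binding 1 and type 2 into the byte 0x12.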
Relocation entries (Rel & Rela)
Relocation is the process of connecting symbolic references with symbolic definitions. Relocatable files must have information that describes how to modify their section contents, thus allowing executable and shared object files to hold the right information for a process's program image. Relocation entries are these data.
Relocation structures that do not need an addend:
typedef struct {
Elf32_Addr r_offset;
uint32_t r_info;
} Elf32_Rel;
typedef struct {
Elf64_Addr r_offset;
uint64_t r_info;
} Elf64_Rel;
Relocation structures that need an addend:
typedef struct {
Elf32_Addr r_offset;
uint32_t r_info;
int32_t r_addend;
} Elf32_Rela;
typedef struct {
Elf64_Addr r_offset;
uint64_t r_info;
int64_t r_addend;
} Elf64_Rela;
r_offset
This member gives the location at which to apply the relocation
action. For a relocatable file, the value is the byte
offset from the beginning of the section to the storage unit
affected by the relocation. For an executable file or shared
object, the value is the virtual address of the storage unit
affected by the relocation.
r_info This member gives both the symbol table index with respect to
which the relocation must be made and the type of relocation
to apply. Relocation types are processor-specific. When the
text refers to a relocation entry's relocation type or symbol
table index, it means the result of applying ELF[32|64]_R_TYPE
or ELF[32|64]_R_SYM, respectively, to the entry's r_info member.
r_addend
This member specifies a constant addend used to compute the
value to be stored into the relocatable field.
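Splitting r_info into its symbol index and relocation type can be sketched as follows. This is a minimal sketch, assuming the glibc `<elf.h>` macros; `struct reloc_ref`, `decode_rela64` and `decode_rel32` are hypothetical names. The two classes pack the fields differently: the 64-bit form keeps the symbol index in the upper 32 bits, while the 32-bit form keeps it in the upper 24 bits.

```c
#include <elf.h>
#include <stdint.h>

/* Decoded view of r_info. */
struct reloc_ref {
    uint32_t sym;   /* symbol table index */
    uint32_t type;  /* processor-specific relocation type */
};

static struct reloc_ref decode_rela64(const Elf64_Rela *r)
{
    struct reloc_ref ref;
    ref.sym  = (uint32_t)ELF64_R_SYM(r->r_info);   /* r_info >> 32 */
    ref.type = (uint32_t)ELF64_R_TYPE(r->r_info);  /* low 32 bits */
    return ref;
}

static struct reloc_ref decode_rel32(const Elf32_Rel *r)
{
    struct reloc_ref ref;
    ref.sym  = ELF32_R_SYM(r->r_info);             /* r_info >> 8 */
    ref.type = ELF32_R_TYPE(r->r_info);            /* low 8 bits */
    return ref;
}
```

How the type is then applied (which combination of symbol value, addend and place is stored) depends on the processor supplement and is not covered here.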
Dynamic tags (Dyn)
The .dynamic section contains a series of structures that hold relevant dynamic linking information. The d_tag member controls the interpretation of d_un.
typedef struct {
Elf32_Sword d_tag;
union {
Elf32_Word d_val;
Elf32_Addr d_ptr;
} d_un;
} Elf32_Dyn;
extern Elf32_Dyn _DYNAMIC[];
typedef struct {
Elf64_Sxword d_tag;
union {
Elf64_Xword d_val;
Elf64_Addr d_ptr;
} d_un;
} Elf64_Dyn;
extern Elf64_Dyn _DYNAMIC[];
d_tag This member may have any of the following values:
DT_NULL Marks end of dynamic section
DT_NEEDED String table offset to name of a needed library
DT_PLTRELSZ Size in bytes of PLT relocation entries
DT_PLTGOT Address of PLT and/or GOT
DT_HASH Address of symbol hash table
DT_STRTAB Address of string table
DT_SYMTAB Address of symbol table
DT_RELA Address of Rela relocation table
DT_RELASZ Size in bytes of the Rela relocation table
DT_RELAENT Size in bytes of a Rela relocation table entry
DT_STRSZ Size in bytes of string table
DT_SYMENT Size in bytes of a symbol table entry
DT_INIT Address of the initialization function
DT_FINI Address of the termination function
DT_SONAME String table offset to name of shared object
DT_RPATH String table offset to library search path (deprecated)
DT_SYMBOLIC Alert linker to search this shared object before
the executable for symbols
DT_REL Address of Rel relocation table
DT_RELSZ Size in bytes of Rel relocation table
DT_RELENT Size in bytes of a Rel table entry
DT_PLTREL Type of relocation entry to which the PLT refers
(Rela or Rel)
DT_DEBUG Undefined use for debugging
DT_TEXTREL Absence of this entry indicates that no relocation
entries should apply to a nonwritable segment
DT_JMPREL Address of relocation entries associated solely
with the PLT
DT_BIND_NOW Instruct dynamic linker to process all relocations
before transferring control to the executable
DT_RUNPATH String table offset to library search path
DT_LOPROC, DT_HIPROC
Values in the inclusive range [DT_LOPROC,
DT_HIPROC] are reserved for processor-specific
semantics
d_val This member represents integer values with various interpretations.
d_ptr This member represents program virtual addresses. When interpreting
these addresses, the actual address should be computed
based on the original file value and memory base address.
Files do not contain relocation entries to fixup these
addresses.
_DYNAMIC
Array containing all the dynamic structures in the .dynamic
section. This is automatically populated by the linker.
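Walking the dynamic array up to its DT_NULL terminator can be sketched like this. It is a minimal sketch, assuming a 64-bit file and the glibc `<elf.h>` definitions; `find_dyn` and `count_needed` are hypothetical helpers, and the `max_entries` bound is an added safety measure for untrusted input rather than something the format requires.

```c
#include <elf.h>
#include <stddef.h>

/* Return the first entry with the given tag, or NULL if DT_NULL (or the
 * max_entries bound) is reached first. */
static const Elf64_Dyn *find_dyn(const Elf64_Dyn *dyn, size_t max_entries,
                                 Elf64_Sxword tag)
{
    for (size_t i = 0; i < max_entries && dyn[i].d_tag != DT_NULL; i++)
        if (dyn[i].d_tag == tag)
            return &dyn[i];
    return NULL;
}

/* Count DT_NEEDED entries, i.e. the number of needed libraries. */
static size_t count_needed(const Elf64_Dyn *dyn, size_t max_entries)
{
    size_t n = 0;
    for (size_t i = 0; i < max_entries && dyn[i].d_tag != DT_NULL; i++)
        if (dyn[i].d_tag == DT_NEEDED)
            n++;
    return n;
}
```

The d_val of each DT_NEEDED entry is an offset into the string table located by DT_STRTAB, so the two helpers are usually used together.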
Notes (Nhdr)
ELF notes allow for appending arbitrary information for the system to use. They are largely used by core files (e_type of ET_CORE), but many projects define their own set of extensions. For example, the GNU tool chain uses ELF notes to pass information from the linker to the C library.
Note sections contain a series of notes (see the struct definitions
below). Each note is followed by the name field (whose length is
defined in n_namesz) and then by the descriptor field (whose length
is defined in n_descsz) and whose starting address has a 4 byte
alignment. Neither field is defined in the note struct due to their
arbitrary lengths.
An example for parsing out two consecutive notes should clarify their
layout in memory:
    /* ALIGN_UP is not provided by <elf.h>; one possible definition: */
    #define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((size_t)(a) - 1))

    /* char * rather than void * so the pointer arithmetic is standard C */
    char *memory, *name, *desc;
    Elf64_Nhdr *note, *next_note;

    /* The buffer is pointing to the start of the section/segment */
    note = (Elf64_Nhdr *)memory;

    /* If the name is defined, it follows the note */
    name = note->n_namesz == 0 ? NULL : memory + sizeof(*note);

    /* If the descriptor is defined, it follows the name
       (with alignment) */
    desc = note->n_descsz == 0 ? NULL :
           memory + sizeof(*note) + ALIGN_UP(note->n_namesz, 4);

    /* The next note follows both (with alignment) */
    next_note = (Elf64_Nhdr *)(memory + sizeof(*note) +
                               ALIGN_UP(note->n_namesz, 4) +
                               ALIGN_UP(note->n_descsz, 4));
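The fragment above handles two notes; a bounds-checked loop over a whole note section can be sketched as follows. This is a minimal sketch, assuming the glibc `<elf.h>` definitions, 4-byte alignment of the name and descriptor fields as stated in this section, and a buffer that holds the raw section contents; `count_notes` and `NOTE_ALIGN4` are hypothetical names.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round x up to a multiple of 4. */
#define NOTE_ALIGN4(x) (((size_t)(x) + 3u) & ~(size_t)3u)

/* Count the notes in buf[0..size) whose name and type both match. */
static size_t count_notes(const unsigned char *buf, size_t size,
                          const char *name, uint32_t type)
{
    size_t off = 0, hits = 0;

    while (size - off >= sizeof(Elf64_Nhdr)) {
        Elf64_Nhdr nh;
        memcpy(&nh, buf + off, sizeof nh);   /* avoids unaligned access */

        size_t name_off = off + sizeof nh;
        size_t desc_off = name_off + NOTE_ALIGN4(nh.n_namesz);
        size_t next     = desc_off + NOTE_ALIGN4(nh.n_descsz);
        if (next > size)
            break;                           /* truncated note: stop */

        /* n_namesz counts the terminating null byte of the name. */
        if (nh.n_type == type &&
            nh.n_namesz == strlen(name) + 1 &&
            memcmp(buf + name_off, name, nh.n_namesz) == 0)
            hits++;

        off = next;
    }
    return hits;
}
```

Calling `count_notes(buf, size, "GNU", NT_GNU_BUILD_ID)` on the contents of a .note.gnu.build-id section would normally be expected to find exactly one note.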
Keep in mind that the interpretation of n_type depends on the namespace
defined by the name field (whose length is given by n_namesz). If the
name is not set (i.e., n_namesz is 0), then there are two sets of notes:
one for core files and one for all other ELF types. If the namespace is
unknown, then tools will usually fall back to these sets of notes as well.
typedef struct {
Elf32_Word n_namesz;
Elf32_Word n_descsz;
Elf32_Word n_type;
} Elf32_Nhdr;
typedef struct {
Elf64_Word n_namesz;
Elf64_Word n_descsz;
Elf64_Word n_type;
} Elf64_Nhdr;
n_namesz
The length of the name field in bytes. The contents will
immediately follow this note in memory. The name is null terminated.
For example, if the name is "GNU", then n_namesz
will be set to 4.
n_descsz
The length of the descriptor field in bytes. The contents
will immediately follow the name field in memory.
n_type Depending on the value of the name field, this member may have
any of the following values:
Core files (e_type = ET_CORE)
Notes used by all core files. These are highly operating
system or architecture specific and often require close
coordination with kernels, C libraries, and debuggers.
These are used when the namespace is the default (i.e.,
n_namesz will be set to 0), or a fallback when the namespace
is unknown.
NT_PRSTATUS prstatus struct
NT_FPREGSET fpregset struct
NT_PRPSINFO prpsinfo struct
NT_PRXREG prxregset struct
NT_TASKSTRUCT task structure
NT_PLATFORM String from sysinfo(SI_PLATFORM)
NT_AUXV auxv array
NT_GWINDOWS gwindows struct
NT_ASRS asrset struct
NT_PSTATUS pstatus struct
NT_PSINFO psinfo struct
NT_PRCRED prcred struct
NT_UTSNAME utsname struct
NT_LWPSTATUS lwpstatus struct
NT_LWPSINFO lwpinfo struct
NT_PRFPXREG fprxregset struct
NT_SIGINFO siginfo_t (size might increase over
time)
NT_FILE Contains information about mapped
files
NT_PRXFPREG user_fxsr_struct
NT_PPC_VMX PowerPC Altivec/VMX registers
NT_PPC_SPE PowerPC SPE/EVR registers
NT_PPC_VSX PowerPC VSX registers
NT_386_TLS i386 TLS slots (struct user_desc)
NT_386_IOPERM x86 io permission bitmap (1=deny)
NT_X86_XSTATE x86 extended state using xsave
NT_S390_HIGH_GPRS s390 upper register halves
NT_S390_TIMER s390 timer register
NT_S390_TODCMP s390 time-of-day (TOD) clock comparator
register
NT_S390_TODPREG s390 time-of-day (TOD) programmable
register
NT_S390_CTRS s390 control registers
NT_S390_PREFIX s390 prefix register
NT_S390_LAST_BREAK s390 breaking event address
NT_S390_SYSTEM_CALL s390 system call restart data
NT_S390_TDB s390 transaction diagnostic block
NT_ARM_VFP ARM VFP/NEON registers
NT_ARM_TLS ARM TLS register
NT_ARM_HW_BREAK ARM hardware breakpoint registers
NT_ARM_HW_WATCH ARM hardware watchpoint registers
NT_ARM_SYSTEM_CALL ARM system call number
n_name = GNU
Extensions used by the GNU tool chain.
NT_GNU_ABI_TAG
Operating system (OS) ABI information. The desc
field will be 4 words:
· word 0: OS descriptor (ELF_NOTE_OS_LINUX,
ELF_NOTE_OS_GNU, and so on)
· word 1: major version of the ABI
· word 2: minor version of the ABI
· word 3: subminor version of the ABI
NT_GNU_HWCAP
Synthetic hwcap information. The desc field
begins with two words:
· word 0: number of entries
· word 1: bit mask of enabled entries
Then follow variable-length entries, each consisting of
one byte followed by a null-terminated hwcap name string.
The byte gives the bit number; the entry is enabled if
(1U << bit) & bit mask is nonzero.
NT_GNU_BUILD_ID
Unique build ID as generated by the GNU ld(1)
--build-id option. The desc consists of any
nonzero number of bytes.
NT_GNU_GOLD_VERSION
The desc contains the GNU Gold linker version
used.
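The NT_GNU_ABI_TAG and NT_GNU_HWCAP layouts described above can be sketched in C roughly as follows. This is only an illustration: struct abi_tag, parse_abi_tag() and hwcap_entry_enabled() are hypothetical names, not part of glibc or <elf.h>, and the sketch assumes the descriptor bytes are already in host byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical holder for the four words of an NT_GNU_ABI_TAG descriptor. */
struct abi_tag {
    uint32_t os;        /* word 0: OS descriptor (ELF_NOTE_OS_LINUX, ...) */
    uint32_t major;     /* word 1: major version of the ABI */
    uint32_t minor;     /* word 2: minor version of the ABI */
    uint32_t subminor;  /* word 3: subminor version of the ABI */
};

/* Copies the descriptor into *out.
 * Returns 0 on success, -1 if the descriptor is not exactly 4 words.
 * Assumption: desc is in host byte order. */
int parse_abi_tag(const void *desc, size_t descsz, struct abi_tag *out)
{
    uint32_t w[4];

    if (descsz != sizeof w)
        return -1;
    memcpy(w, desc, sizeof w);  /* memcpy avoids unaligned reads */
    out->os = w[0];
    out->major = w[1];
    out->minor = w[2];
    out->subminor = w[3];
    return 0;
}

/* NT_GNU_HWCAP: tests whether the entry with the given bit number is
 * enabled in the bit mask (word 1 of the descriptor). */
int hwcap_entry_enabled(uint32_t mask, unsigned char bit)
{
    /* Guard the shift: shifting a 32-bit value by 32 or more is undefined. */
    return bit < 32 && ((1U << bit) & mask) != 0;
}
```

A caller would pass the desc pointer and n_descsz of a note whose name is "GNU" and whose n_type is NT_GNU_ABI_TAG.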
Default/unknown namespace (e_type != ET_CORE)
These are used when the namespace is the default (i.e.,
n_namesz will be set to 0), or a fallback when the
namespace is unknown.
NT_VERSION A version string of some sort.
NT_ARCH Architecture information.
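To make the note layout concrete (header, then name, then descriptor), here is a minimal C sketch that searches a buffer of notes for a given name and type. It is not taken from the man page. struct note_hdr is a local mirror of the three Elf64_Nhdr fields so that the code compiles without <elf.h>, find_note() is a hypothetical helper, and the sketch assumes that name and descriptor are each padded to a 4-byte boundary, which is the usual GNU convention but is not stated above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of Elf64_Nhdr (three 32-bit words). */
struct note_hdr {
    uint32_t n_namesz;
    uint32_t n_descsz;
    uint32_t n_type;
};

/* Assumption: name and desc are each padded to a multiple of 4 bytes. */
#define NOTE_ALIGN4(x) (((size_t)(x) + 3u) & ~(size_t)3u)

/* Returns a pointer to the descriptor of the first note in buf whose name
 * and type match, and stores its length in *descsz.
 * Returns NULL if there is no such note or the buffer is truncated. */
const unsigned char *find_note(const unsigned char *buf, size_t len,
                               const char *name, uint32_t type,
                               uint32_t *descsz)
{
    size_t want = strlen(name) + 1;  /* n_namesz counts the null byte */
    size_t off = 0;

    while (len - off >= sizeof(struct note_hdr)) {
        struct note_hdr h;
        size_t rest, name_len, desc_len;

        memcpy(&h, buf + off, sizeof h);
        off += sizeof h;
        rest = len - off;

        /* The name (with padding) and the descriptor must fit. */
        if (h.n_namesz > rest)
            break;
        name_len = NOTE_ALIGN4(h.n_namesz);
        if (name_len > rest || h.n_descsz > rest - name_len)
            break;

        if (h.n_type == type && h.n_namesz == want &&
            memcmp(buf + off, name, want) == 0) {
            *descsz = h.n_descsz;
            return buf + off + name_len;
        }

        /* Skip to the next note header. */
        desc_len = NOTE_ALIGN4(h.n_descsz);
        if (desc_len > rest - name_len)
            break;
        off += name_len + desc_len;
    }
    return NULL;
}
```

For instance, calling find_note(buf, len, "GNU", 3, &sz) on the contents of a PT_NOTE segment would look for the build ID, since NT_GNU_BUILD_ID has the value 3 in <elf.h>.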
NOTES
ELF first appeared in System V. The ELF format is an adopted standard.
The extensions for e_phnum, e_shnum, and e_shstrndx are Linux
extensions. Sun, BSD, and AMD64 also support them; for further
information, look under SEE ALSO.
SEE ALSO
as(1), elfedit(1), gdb(1), ld(1), nm(1), objdump(1), patchelf(1), readelf(1), size(1), strings(1), strip(1), execve(2), dl_iterate_phdr(3), core(5), ld.so(8)
Hewlett-Packard, Elf-64 Object File Format.
Santa Cruz Operation, System V Application Binary Interface.
UNIX System Laboratories, "Object Files", Executable and Linking
Format (ELF).
Sun Microsystems, Linker and Libraries Guide.
AMD64 ABI Draft, System V Application Binary Interface AMD64
Architecture Processor Supplement.
COLOPHON
This page is part of release 5.06 of the Linux man-pages project. A description of the project, information about reporting bugs, and the latest version of this page, can be found at https://www.kernel.org/doc/man-pages/.
Linux 2020-04-11 ELF(5)