[C++] Use of tcmalloc leads to assertion failure

Open PierreRamoin-1A opened this issue 5 years ago • 4 comments

Hi, I have a binary that uses tcmalloc (from gperftools). tcmalloc creates some new sections for itself (google_malloc and malloc_hook) with AX flags.

When I apply BOLT to this binary, it hits an assertion failure when it encounters the __stop_google_malloc symbol, in

llvm::bolt::RewriteInstance::discoverFileObjects(): Assertion `Section && "section for functions must be registered."' failed.

If I force tcmalloc not to create this section, the assertion no longer fires.

I assume this is related to #6, but since that issue has drifted a long way from its original subject, I preferred to open a new one.

Here is an extract of a readelf -e of libtcmalloc.a around the concerned sections:

  [...]
  [481] .gnu.lto_.symtab. PROGBITS         0000000000000000  000560d6
       0000000000000d72  0000000000000000   E       0     0     1
  [482] .gnu.lto_.opts    PROGBITS         0000000000000000  00056e48
       00000000000000bd  0000000000000000   E       0     0     1
  [483] .rodata           PROGBITS         0000000000000000  00056f20
       00000000000008a9  0000000000000000   A       0     0     32
  [484] malloc_hook       PROGBITS         0000000000000000  000577ca
       0000000000000211  0000000000000000  AX       0     0     2
  [485] .relamalloc_hook  RELA             0000000000000000  000a1010
       0000000000000228  0000000000000018   I      802   484     8
  [486] .gcc_except_table PROGBITS         0000000000000000  000579db
       000000000000003b  0000000000000000   A       0     0     1
  [487] .text._ZN4base6su PROGBITS         0000000000000000  00057a16
       0000000000000024  0000000000000000 AXG       0     0     1
  [488] .text._ZN4base6su PROGBITS         0000000000000000  00057a3a
       000000000000001c  0000000000000000 AXG       0     0     1
  [489] .text._ZN4base6su PROGBITS         0000000000000000  00057a56
       0000000000000023  0000000000000000 AXG       0     0     1
  [490] .rela.text._ZN4ba RELA             0000000000000000  000a1238
       0000000000000018  0000000000000018  IG      802   489     8
  [491] .text._ZN4base6su PROGBITS         0000000000000000  00057a79
       000000000000002f  0000000000000000 AXG       0     0     1
  [492] .rela.text._ZN4ba RELA             0000000000000000  000a1250
       0000000000000018  0000000000000018  IG      802   491     8
  [493] .text._ZN4base6su PROGBITS         0000000000000000  00057aa8
       0000000000000011  0000000000000000 AXG       0     0     1
  [...]
  [991] .text._ZN8tcmallo PROGBITS         0000000000000000  000d26a2
       000000000000002f  0000000000000000 AXG       0     0     2
  [992] .rela.text._ZN8tc RELA             0000000000000000  0014c168
       0000000000000018  0000000000000018  IG      1513   991     8
  [993] .text._ZN8tcmallo PROGBITS         0000000000000000  000d26d2
       000000000000002c  0000000000000000 AXG       0     0     2
  [994] .rela.text._ZN8tc RELA             0000000000000000  0014c180
       0000000000000018  0000000000000018  IG      1513   993     8
  [995] google_malloc     PROGBITS         0000000000000000  000d2700
       00000000000043f5  0000000000000000  AX       0     0     64
  [996] .relagoogle_mallo RELA             0000000000000000  0014c198
       0000000000002088  0000000000000018   I      1513   995     8
  [997] .gcc_except_table PROGBITS         0000000000000000  000d6af8
       000000000000014c  0000000000000000   A       0     0     4
  [998] .rela.gcc_except_ RELA             0000000000000000  0014e220
       0000000000000018  0000000000000018   I      1513   997     8
  [999] .text._ZN13TCMall PROGBITS         0000000000000000  000d6c44
       000000000000001f  0000000000000000 AXG       0     0     2
  [1000] .rela.text._ZN13T RELA             0000000000000000  0014e238
       0000000000000018  0000000000000018  IG      1513   999     8
  [...]

and here is an extract of the final binary:

  [...]
  [13] .init             PROGBITS         00000000004314f8  000314f8
       000000000000001f  0000000000000000  AX       0     0     4
  [14] .plt              PROGBITS         0000000000431520  00031520
       0000000000000810  0000000000000010  AX       0     0     16
  [15] .text             PROGBITS         0000000000432000  00032000
       00000000018c80f0  0000000000000000  AX       0     0     4096
  [16] google_malloc     PROGBITS         0000000001cfa100  018fa100
       00000000000043f5  0000000000000000  AX       0     0     64
  [17] malloc_hook       PROGBITS         0000000001cfe4f6  018fe4f6
       00000000000005a1  0000000000000000  AX       0     0     2
  [18] .fini             PROGBITS         0000000001cfea98  018fea98
       0000000000000009  0000000000000000  AX       0     0     4
  [19] .rodata           PROGBITS         0000000001cfeac0  018feac0
       0000000000466e0b  0000000000000000   A       0     0     64
  [20] .gcc_except_table PROGBITS         00000000021658cc  01d658cc
       00000000000ed9f6  0000000000000000   A       0     0     4
  [21] .eh_frame         X86_64_UNWIND    00000000022532c8  01e532c8
       0000000000603fac  0000000000000000   A       0     0     8
  [22] .eh_frame_hdr     X86_64_UNWIND    0000000002857274  02457274
       00000000001619cc  0000000000000000   A       0     0     4
  [...]

PierreRamoin-1A avatar Apr 11 '19 07:04 PierreRamoin-1A

Here is some update about this issue:

The problem comes from the symbols ld generates when a custom section is used (__start_<section_name> and __stop_<section_name>), and more precisely from the placement of the __stop_ symbol.

Indeed, this symbol is placed at the end address of the section, with a size of 0, and the BinaryContext::getSectionForAddress function does not match symbols placed at the end address of a section.

If the section is contiguous with another one, the __stop_ symbol is attributed to the wrong section and BOLT still proceeds (possibly dangerous?), but if there is a gap between the two sections, BOLT aborts because it treats this symbol as an orphan.

Here is a simple program that reproduces the issue once compiled:
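
The two outcomes described above can be sketched with a half-open address lookup. This is only an illustration: `Section`, `sectionForAddress`, and `checkEndAddressLookup` are hypothetical names, not BOLT's actual code. The gap case uses the addresses from the final-binary extract in the first comment, and the contiguous case uses made-up addresses.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical section record for illustration; not BOLT's real data structure.
struct Section {
  std::string name;
  uint64_t address;
  uint64_t size;
};

// Half-open lookup: a section owns addr only when start <= addr < start + size.
std::optional<std::string> sectionForAddress(const std::vector<Section> &sections,
                                             uint64_t addr) {
  for (const Section &s : sections)
    if (addr >= s.address && addr < s.address + s.size)
      return s.name;
  return std::nullopt;
}

// Exercises both layouts: a gap after the section, and a contiguous neighbor.
void checkEndAddressLookup() {
  // Gap case: google_malloc ends at 0x1cfa100 + 0x43f5 = 0x1cfe4f5,
  // while malloc_hook only starts at 0x1cfe4f6.
  const std::vector<Section> withGap = {{"google_malloc", 0x1cfa100, 0x43f5},
                                        {"malloc_hook", 0x1cfe4f6, 0x5a1}};
  // A zero-size symbol at the end address matches no section (an "orphan").
  assert(!sectionForAddress(withGap, 0x1cfe4f5).has_value());

  // Contiguous case (made-up addresses): the end of "first" is the start of "second".
  const std::vector<Section> contiguous = {{"first", 0x1000, 0x100},
                                           {"second", 0x1100, 0x80}};
  // The end-of-section symbol is silently attributed to the next section.
  assert(*sectionForAddress(contiguous, 0x1100) == "second");
}
```

With the one-byte gap between google_malloc and malloc_hook in that extract, a symbol at 0x1cfe4f5 belongs to no section under this rule, which is consistent with the orphan behavior reported here.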

#include <iostream>

extern char __start_custom;
extern char __stop_custom;


void po()  __attribute__((section("custom")));

void po()
{
  std::cout << "in po()" << std::endl;
}

int main(int argc, char *argv[])
{
  po();
  std::cout << "Begin of section:" << reinterpret_cast<void *>(&__start_custom) << ", end of section:" << reinterpret_cast<void *>(&__stop_custom) << std::endl;
  return 0;
}

I attach a modified version of the ld linker script that changes the .text section alignment and places the custom section just above it, to help reproduce the issue (pass it with -T when compiling with g++).

script2link.txt

Although my problem is quite specific (but tied to a standard feature), this can happen in various cases, not only with custom sections. For example, g++ 8.2.1 on RHEL8-beta generates some symbols for annobin by default, and the symbol .annobin_elf_init.c._end, for instance, seems to be placed at the end of the .text section with a size of 0.

A workaround would be to let getSectionForAddress accept symbols at the end address of a section (as #19 suggested). After adapting the code accordingly, BOLT works, and the symbol is attributed to the section it ends only if there is no contiguous section.

But actually I'm not sure that is enough to avoid problems. As I said, if there is a contiguous section, a symbol like __stop_* is attributed to the wrong section, and since getSectionForAddress only takes an address as a parameter, we can't really determine whether the symbol is at the upper bound of the previous section or the lower bound of the next one.

Does BOLT change section sizes and positions? Even if it doesn't, could this misattribution lead to problems at execution?

From what I saw, a symbol's section is determined by this function and not by st_shndx. Is there a specific reason for that?

One last thing: after applying the workaround I described, I saw that some of these empty symbols now get a real size:
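
To make the st_shndx question concrete, here is a minimal sketch of a lookup that prefers the symbol's recorded section index and only falls back to the address. It is an assumption about how such a lookup could work, not BOLT's implementation: `Section`, `Symbol`, `sectionIndexForSymbol`, and the `k...` constants are hypothetical names, although the two constant values are the standard ELF SHN_UNDEF and SHN_LORESERVE.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical records; the vector position plays the role of the section
// header index, so element 0 stands for the null section.
struct Section {
  uint64_t address;
  uint64_t size;
};
struct Symbol {
  uint64_t value;  // st_value
  uint16_t shndx;  // st_shndx
};

constexpr uint16_t kShnUndef = 0;           // SHN_UNDEF
constexpr uint16_t kShnLoReserve = 0xff00;  // SHN_LORESERVE: start of special indices

std::optional<std::size_t> sectionIndexForSymbol(const std::vector<Section> &sections,
                                                 const Symbol &sym) {
  // Trust st_shndx when it names a real section and the value lies in the
  // closed range [start, end], so an end-of-section marker stays with its section.
  if (sym.shndx != kShnUndef && sym.shndx < kShnLoReserve &&
      sym.shndx < sections.size()) {
    const Section &s = sections[sym.shndx];
    if (sym.value >= s.address && sym.value <= s.address + s.size)
      return static_cast<std::size_t>(sym.shndx);
  }
  // Otherwise fall back to the half-open, address-only lookup.
  for (std::size_t i = 0; i < sections.size(); ++i)
    if (sym.value >= sections[i].address &&
        sym.value < sections[i].address + sections[i].size)
      return i;
  return std::nullopt;
}
```

With two contiguous sections at indices 1 and 2, a symbol sitting on their shared boundary with st_shndx set to 1 resolves to section 1, whereas the address-only fallback would pick section 2.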

$ eu-nm ./libtcmalloc.so | grep _google_malloc
__start_google_malloc    |000000000004a0c0|GLOBAL|NOTYPE  |0000000000000000|    |google_malloc
__stop_google_malloc     |000000000004e99e|GLOBAL|NOTYPE  |0000000000000000|    |google_malloc
$ eu-nm ./libtcmalloc.so.bolt  | grep _google_malloc
__start_google_malloc    |000000000004a0c0|GLOBAL|NOTYPE  |00000000000003ce|    |google_malloc
__stop_google_malloc     |000000000004e99e|GLOBAL|NOTYPE  |00000000000000cf|    |malloc_hook

Seems strange to me but I don't know if there is a justification for it.

PierreRamoin-1A avatar Apr 29 '19 07:04 PierreRamoin-1A

Thanks for taking the time to investigate the issue with tcmalloc. We can add special handling for __start_*/__stop_* symbols and use st_shndx to better track other symbols placed at the end of a section when no section can be found by address. However, I suspect that could be just the tip of the iceberg. There's a good chance that those marked section boundaries are being used for some sort of code discovery. As BOLT rearranges code and shuffles functions, assumptions such as the continuity of the code between __start_* and __stop_* will be broken. A better option is to preserve the code between such boundaries and not optimize it.
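
As a rough sketch of the "preserve the code between such boundaries" idea, the snippet below pairs __start_*/__stop_* symbols by section name and reports whether an address falls inside a marked range. All names (`Symbol`, `MarkedRange`, `findMarkedRanges`, `isInMarkedRange`) are hypothetical and only illustrate the approach; how BOLT would actually skip such code is not shown here.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical symbol record: just a name and an address.
struct Symbol {
  std::string name;
  uint64_t value;
};

// Half-open address range [start, stop) delimited by a __start_/__stop_ pair.
struct MarkedRange {
  uint64_t start;
  uint64_t stop;
};

// Pairs up __start_<name> and __stop_<name> symbols, keyed by <name>.
std::map<std::string, MarkedRange> findMarkedRanges(const std::vector<Symbol> &symbols) {
  const std::string startPrefix = "__start_";
  const std::string stopPrefix = "__stop_";
  std::map<std::string, uint64_t> starts, stops;
  for (const Symbol &sym : symbols) {
    if (sym.name.compare(0, startPrefix.size(), startPrefix) == 0)
      starts[sym.name.substr(startPrefix.size())] = sym.value;
    else if (sym.name.compare(0, stopPrefix.size(), stopPrefix) == 0)
      stops[sym.name.substr(stopPrefix.size())] = sym.value;
  }
  std::map<std::string, MarkedRange> ranges;
  for (const auto &entry : starts) {
    auto stop = stops.find(entry.first);
    // Keep only complete, well-ordered pairs.
    if (stop != stops.end() && stop->second >= entry.second)
      ranges.emplace(entry.first, MarkedRange{entry.second, stop->second});
  }
  return ranges;
}

// True when addr lies inside any marked range and should be left untouched.
bool isInMarkedRange(const std::map<std::string, MarkedRange> &ranges, uint64_t addr) {
  for (const auto &entry : ranges)
    if (addr >= entry.second.start && addr < entry.second.stop)
      return true;
  return false;
}
```

Fed the two google_malloc symbols from the eu-nm output above (0x4a0c0 and 0x4e99e), this yields a single range named google_malloc covering the addresses in between.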

The reason you are seeing sizes being assigned to __start_google_malloc and __stop_google_malloc is that currently they are treated by BOLT as entries to functions, and on the output they are being assigned sizes corresponding to those functions.

maksfb avatar Apr 30 '19 06:04 maksfb

OK, thanks for the explanation. I agree with you that it's safer to leave those sections as is.

I can work on a fix on my side to ignore symbols between __start_* and __stop_*, if you're OK with that?

PierreRamoin-1A avatar Apr 30 '19 10:04 PierreRamoin-1A

Sounds good to me.

maksfb avatar Apr 30 '19 17:04 maksfb