
bcc-lua produced on Debian Buster (gcc 6.3, llvm 3.9, cmake 3.7.2) doesn't export bcc_* functions

Open aktau opened this issue 8 years ago • 3 comments

See the title. This causes the following:

$ sudo env LUA_CPATH="/home/aktau/opt/lib/?.so" /home/aktau/opt/bin/bcc-lua -- /home/aktau/opt/share/bcc/examples/lua/bashreadline.lua
[sudo] password for aktau:
[ERROR] bcc.lua:2269: /home/aktau/opt/bin/bcc-lua: undefined symbol: bcc_resolve_symname
stack traceback:
	[C]: in function '__index'
	bcc.lua:2269: in function 'check_path_symbol'
	bcc.lua:2129: in function 'attach_uprobe'
	...e/home/aktau/opt/share/bcc/examples/lua/bashreadline.lua:21: in function <...e/home/aktau/opt/share/bcc/examples/lua/bashreadline.lua:19>
	[C]: in function 'xpcall'
	bcc.lua:5896: in function <bcc.lua:5845>
	[C]: at 0x007e3760

So the static bcc-lua doesn't work. bcc-probe does, though:

$ sudo env LD_LIBRARY_PATH=$HOME/opt/lib ./bcc/src/lua/bcc-probe /home/aktau/github/bcc/examples/lua/task_switch.lua
Press any key...
task_switch[0 -> 11288] = 204
task_switch[1160 -> 0] = 11
task_switch[0 -> 1664] = 2
task_switch[15843 -> 0] = 62
task_switch[0 -> 3509] = 1
task_switch[0 -> 1538] = 22
task_switch[0 -> 2297] = 1
task_switch[0 -> 2263] = 1

This indicates that my libbcc.so is functional. After trawling the internet a bit and reading the code in src/lua/bcc/run.lua, it seems like this could happen because the symbols are not exported when they're included in a (static) executable. (Here's the commit that introduced the static build: https://github.com/iovisor/bcc/commit/02f66f4bbc51a401e2da7f98fefeb2dcba71db2e.)

Checking the number and type of symbols in the binary:

nm -D $(which bcc-lua) | grep -oP '\b[A-Z]\b' | sort | uniq -c | sort -n
     98 R
    164 B
    198 D
    385 U
   3331 V
   4818 W
  20809 T

Together with information from man nm:

           "B"
           "b" The symbol is in the uninitialized data section (known as BSS).

           "D"
           "d" The symbol is in the initialized data section.

           "R"
           "r" The symbol is in a read only data section.

           "T"
           "t" The symbol is in the text (code) section.

           "U" The symbol is undefined.

           "u" The symbol is a unique global symbol.  This is a GNU extension to the standard set of ELF symbol bindings.  For such a symbol the dynamic linker will make sure that in the entire process there is just one symbol with this name and
               type in use.

           "V"
           "v" The symbol is a weak object.  When a weak defined symbol is linked with a normal defined symbol, the normal defined symbol is used with no error.  When a weak undefined symbol is linked and the symbol is not defined, the value of the
               weak symbol becomes zero with no error.  On some systems, uppercase indicates that a default value has been specified.

           "W"
           "w" The symbol is a weak symbol that has not been specifically tagged as a weak object symbol.  When a weak defined symbol is linked with a normal defined symbol, the normal defined symbol is used with no error.  When a weak undefined
               symbol is linked and the symbol is not defined, the value of the symbol is determined in a system-specific manner without error.  On some systems, uppercase indicates that a default value has been specified.
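The same tally can be done in a few lines of Python, which also makes the letter conventions explicit. This is only a sketch: the sample text is made up in the format that nm -D prints, and the helper names are hypothetical. Per the man page, an uppercase letter usually means a global symbol and a lowercase one a local symbol, with U meaning undefined.

```python
from collections import Counter

# Made-up sample in the format printed by `nm -D`; NOT real bcc-lua output.
SAMPLE = """\
                 U malloc
0000000000401000 T main
0000000000401020 W _ZdlPv
0000000000601040 B stdout
0000000000401050 T lua_pcall
"""

def tally_symbol_types(nm_output):
    """Count symbols per type letter in `nm` text output."""
    counts = Counter()
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) == 3:
            # Defined symbol: address, type letter, name.
            counts[fields[1]] += 1
        elif len(fields) == 2:
            # Undefined symbol: type letter, name (no address).
            counts[fields[0]] += 1
    return counts

def is_defined_global(type_letter):
    """Uppercase letters other than U usually mark defined, global symbols."""
    return type_letter.isupper() and type_letter != "U"

counts = tally_symbol_types(SAMPLE)
for letter, n in sorted(counts.items(), key=lambda kv: kv[1]):
    print(f"{n:7d} {letter}  defined+global={is_defined_global(letter)}")
```

Note that nm -D reads only the dynamic symbol table, which normally holds no local symbols, so lowercase letters are not expected in its output.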

This SO post mentions that T means exported while t means unexported. I wonder why I can't see unexported symbols. Is bcc_resolve_symname even present?

$ strings /home/aktau/opt/lib/libbcc.so | grep bcc_resolve_symname
bcc_resolve_symname
_ZZ19bcc_resolve_symnameE14default_option
bcc_resolve_symname

$ strings /home/aktau/opt/bin/bcc-lua | grep bcc_resolve_symname
bcc_resolve_symname
int bcc_resolve_symname(const char *module, const char *symname, const uint64_t addr,

There seems to be one more reference in libbcc.so. The first one in bcc-lua may be a red herring: it may be a literal string. Should we trust nm when it says that there are no exported symbols? As an added check, let's see whether nm can see the exported symbol in libbcc.so:

$ nm -D /home/aktau/opt/lib/libbcc.so.0.3.0 | grep bcc_resolve_symname
000000000041dad0 T bcc_resolve_symname

OK. There has to be some sort of linker flag that just exports everything.
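For reference, the kind of lookup that fails here can be reproduced outside of Lua. The sketch below uses Python's ctypes on Linux to ask the dynamic loader for a symbol in the running process. It is an analogy, not what bcc-lua does internally (that presumably goes through LuaJIT's FFI), and the helper name is hypothetical. With the GNU toolchain, the flag that exports all of an executable's symbols is -rdynamic (passed to the linker as --export-dynamic), but it only helps if the function is linked into the executable in the first place.

```python
import ctypes

def process_exports(symbol):
    """Return True if `symbol` can be resolved in the running process."""
    # CDLL(None) wraps dlopen(NULL): a handle to the main program plus
    # the shared libraries already loaded into it.
    process = ctypes.CDLL(None)
    # ctypes raises AttributeError when the name cannot be found,
    # which hasattr() turns into False.
    return hasattr(process, symbol)

for name in ("printf", "bcc_resolve_symname"):
    print(name, process_exports(name))
```

A name that is missing from the dynamic symbol tables of the program and its loaded libraries fails in the same way as the "undefined symbol" error above.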

aktau avatar Sep 19 '17 09:09 aktau

Mentioning @vmg, who is the author of the standalone build.

A colleague of mine helped me get a working executable by fiddling around with the CMakeLists.txt files. After bisecting which of the changes actually made a difference, this single-line diff made it work:

diff --git a/src/cc/CMakeLists.txt b/src/cc/CMakeLists.txt
index fd6037a..4d6e35b 100644
--- a/src/cc/CMakeLists.txt
+++ b/src/cc/CMakeLists.txt
@@ -42,7 +42,7 @@ set_target_properties(bcc-shared PROPERTIES OUTPUT_NAME bcc)
 add_library(bcc-loader-static STATIC ${bcc_sym_sources} ${bcc_util_sources})
 target_link_libraries(bcc-loader-static elf)
 add_library(bcc-static STATIC
-  ${bcc_common_sources} ${bcc_table_sources} ${bcc_util_sources})
+  ${bcc_common_sources} ${bcc_table_sources} ${bcc_util_sources} ${bcc_sym_sources})
 set_target_properties(bcc-static PROPERTIES OUTPUT_NAME bcc)
 
 include(clang_libs)

TL;DR: libbcc.a, which is linked into bcc-lua(1), did not include the bcc_sym_sources, defined as:

set(bcc_sym_sources bcc_syms.cc bcc_elf.c bcc_perf_map.c bcc_proc.c)

bcc_resolve_symname() is defined in bcc_syms.cc. Now my invocations of bcc-lua work (at least bashreadline.lua). I'm not sure if I'm missing other libraries, but it's much better than before.
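A quicker way to catch this kind of problem is to check which member of the static archive, if any, defines the symbol. Running nm on an archive prints a "member:" header before each member's symbols. The sketch below parses that format; the sample text and the member name other.cc.o are made up for illustration, and the helper name is hypothetical.

```python
# Made-up sample in the format `nm` prints for a static archive;
# NOT real output from libbcc.a.
SAMPLE = """\

other.cc.o:
0000000000000000 T some_other_function
                 U bcc_resolve_symname

bcc_syms.cc.o:
0000000000000000 T bcc_resolve_symname
"""

def members_defining(nm_output, symbol):
    """Return the archive members whose symbol table defines `symbol`."""
    member = None
    found = []
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) == 1 and fields[0].endswith(":"):
            # Header line that starts a new archive member.
            member = fields[0][:-1]
        elif len(fields) == 3 and fields[2] == symbol:
            # Three fields (address, type, name) means the symbol is defined.
            found.append(member)
    return found

print(members_defining(SAMPLE, "bcc_resolve_symname"))
```

An empty result for the real archive would mean that no object file in it defines the function, which is the situation the diff above fixes.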

As stated in the last post, this doesn't seem to be a problem for libbcc.so, so perhaps we should just unify the source lists for the shared and static builds of the library.

This bug gives me the Heisenbug feeling. How could this ever have worked? If not, what's wrong with my system?

aktau avatar Sep 19 '17 12:09 aktau

Thanks for the bug report and the extensive research, @aktau!

I'm not exactly sure, but it does give me some Heisenbug feelings too. I think the underlying cause is the bcc loader finding and loading bcc_resolve_symname off the system's libbcc -- which wouldn't happen in a fresh build on your machine.
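To illustrate the hypothesis (this is not the bcc loader's code, only a sketch of a two-step lookup with a hypothetical helper name, assuming Linux): a symbol missing from the executable can still resolve if a system-wide library provides it, which would hide the problem on machines that have one installed.

```python
import ctypes
import ctypes.util

def where_resolved(symbol, library="bcc"):
    """Report where `symbol` is found: the process, a system library, or nowhere."""
    # Step 1: the running process and the libraries already loaded into it.
    if hasattr(ctypes.CDLL(None), symbol):
        return "process"
    # Step 2: a system-wide shared library; find_library returns None
    # when no such library is installed.
    path = ctypes.util.find_library(library)
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return None
    return path if hasattr(lib, symbol) else None

print(where_resolved("bcc_resolve_symname"))
```

On a machine without a system libbcc this prints None, while on one with libbcc installed it prints the library name.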

Either way, I'm pretty sure your fix is correct and we should be unifying the source lists. Would you care to open a PR for that?

vmg avatar Sep 21 '17 13:09 vmg
