pprof icon indicating copy to clipboard operation
pprof copied to clipboard

Strange mapping at 0 on Mach-O

Open Timmmm opened this issue 7 years ago • 14 comments

This is with master as of now on OSX. If you create a test as described here. The code will go to symbolizer.go newMapping(). It will first call obj.Open("myexe", 0x100000000, ...) which is correct because the __TEXT segment is mapped there. However it then also tries to call obj.Open("myexe", 0, ... which fails because the code assumes you are trying to symbolise things in the __TEXT segment. As far as I can see the only thing mapped to 0 is the __PAGEZERO segment.

I'm not really sure what is going on here, but something isn't right. It also doesn't seem to cause any obvious issues but maybe I am overlooking something.

Timmmm avatar Aug 14 '18 15:08 Timmmm

Maybe that call is from https://github.com/google/pprof/blob/cf6d271c80826ef4d30d55512b5a77b115bd4b9c/internal/driver/cli.go#L98 where it sniffs the first argument to decide whether it's a binary or a profile?

aalexand avatar Aug 14 '18 15:08 aalexand

If so, perhaps it might make sense to have a separate method to call like o.Obj.Sniff that would cheaply return a boolean whether the file looks like a supported binary instead of actually trying to open it.

aalexand avatar Aug 14 '18 15:08 aalexand

It definitely isn't that - it comes from newMapping(). The error is:

Local symbolization failed for resnet: error reading Mach-O file /path/to/./exe: could not identify base for /path/to/./exe: __TEXT segment address (0x100000000) > mapping start address (0x0)

That message is only generated there. Also I added extra printf debugging and it is definitely an entry in prof.Mapping.

I guess the cli call also generates this error but it doesn't print it, so I think that's fine. And yeah I agree it should use magic numbers rather than trying to completely open the file.

Timmmm avatar Aug 14 '18 16:08 Timmmm

My other guess would be that it could be from https://github.com/google/pprof/blob/cf6d271c80826ef4d30d55512b5a77b115bd4b9c/internal/driver/fetch.go#L437.

aalexand avatar Aug 14 '18 16:08 aalexand

I added a printf there to check and it doesn't seem to be. Makes sense because there is another mapping in that file. I guess this could be a problem with gperftools itself?

Timmmm avatar Aug 15 '18 11:08 Timmmm

Could be. The profiles generated by gperftools should be in text format, can you take a look at what the mapping record section looks like in the input file?

aalexand avatar Aug 15 '18 15:08 aalexand

My profile is in binary format. :-/

Timmmm avatar Aug 15 '18 16:08 Timmmm

What do the mappings look like in pprof -raw <path-to-profile-file>?

aalexand avatar Aug 16 '18 19:08 aalexand

Aha!

Mappings
1: 0x7fffa519d000/0x7fffac3ac000/0x200b9000 /usr/lib/libc++abi.dylib  
2: 0x104855000/0x1048e2000/0x0 /path/to/a.dylib  [FN][FL][IN]
3: 0x0/0xffffffffffffffff/0x0   
4: 0x10496a000/0x104972000/0x0 /path/to/a.dylib  [FN][FL][IN]
5: 0x10001e000/0x100091000/0x0 /path/to/a.dylib  [FN][FL][IN]
6: 0x100000000/0x10000d000/0x0 /path/to/myprogram [FN][FL][IN]
7: 0x7fff97c66000/0x7fff97c6c000/0x0 /usr/lib/system/libdyld.dylib  
8: 0x0/0x100000000/0x0 /path/to/myprogram 
9: 0x100120000/0x102c12000/0x0 /path/to/a.dylib  [FN][FL][IN]
10: 0x7fff97e78000/0x7fff97e81000/0x0 /usr/lib/system/libsystem_platform.dylib  
11: 0x104a39000/0x104a49000/0x0 /path/to/a.dylib  [FN][FL][IN]
12: 0x7fff97deb000/0x7fff97e0a000/0x0 /usr/lib/system/libsystem_malloc.dylib  
13: 0x7fff96853000/0x7fff9687d000/0x0 /usr/lib/libc++abi.dylib  
14: 0x1047c9000/0x104800000/0x0 /path/to/a.dylib  [FN][FL][IN]
15: 0x10497a000/0x1049e4000/0x0 /path/to/a.dylib  [FN][FL][IN]
16: 0x10482a000/0x10483b000/0x0 /path/to/a.dylib  [FN][FL][IN]
17: 0x104a5f000/0x104a83000/0x0 /path/to/a.dylib  [FN][FL][IN]
18: 0x7fff97e81000/0x7fff97e8c000/0x0 /usr/lib/system/libsystem_pthread.dylib  
19: 0x7fff97d80000/0x7fff97da3000/0x0 /usr/lib/system/libsystem_kernel.dylib  
20: 0x104acb000/0x104af0000/0x12000 /path/to/a.dylib  [FN][FL][IN]
21: 0x104801000/0x10482a000/0x38000 /path/to/a.dylib  [FL][IN]
22: 0x100113000/0x100118000/0x8000 /usr/local/lib/libprofiler.0.dylib  [FL][IN]
23: 0x7fff97ca0000/0x7fff97d2e000/0x0 /usr/lib/system/libsystem_c.dylib  
24: 0x7fff967fc000/0x7fff96853000/0x0 /usr/lib/libc++.1.dylib  
25: 0x10010b000/0x100113000/0x0 /usr/local/lib/libprofiler.0.dylib  [FN][FL][IN]
26: 0x104ab9000/0x104acb000/0x0 /path/to/a.dylib  [FN][FL][IN]
27: 0x7fffa0b7c000/0x7fffa0b7e1ce/0x1fa98000 /usr/lib/system/libsystem_malloc.dylib  
28: 0x1048e5000/0x10496a000/0x90000 /path/to/a.dylib  [FN][FL][IN]
29: 0x100096000/0x10010b000/0x78000 /path/to/a.dylib  [FN][FL][IN]
30: 0x102f51000/0x1047c9000/0x2d57000 /path/to/a.dylib  [FN][FL][IN]
31: 0x104b1c000/0x104b21000/0x16000 /path/to/a.1.dylib  [FN][FL][IN]
32: 0x7fffa0b78000/0x7fffa0b7a950/0x1fa94000 /usr/lib/system/libsystem_kernel.dylib  
33: 0x102c12000/0x102f51000/0x2af2000 /path/to/a.dylib  [FN][FL][IN]

(Had to redact the names sorry; all the a.dylibs are different internal libraries that myprogram links to.)

Anyway the problem must be number 8. What do those [FN][FL][IN] things mean? Perhaps the fact that 8 doesn't have them means it shouldn't be loaded or something? 3 also looks suspicious.

Timmmm avatar Aug 17 '18 12:08 Timmmm

Ah apparently [FN] means hasFunctions, [FL] is hasFilenames and [IN] is hasInlineFrames.

Timmmm avatar Aug 17 '18 12:08 Timmmm

@Timmmm Which tool did you use to collect this profile? Is it from gperftools?

aalexand avatar Aug 31 '18 17:08 aalexand

Yeah gperftools.

Timmmm avatar Sep 03 '18 11:09 Timmmm

I am still getting this same "error" using a relatively new version of pprof (commit b9804c9) on macOS Big Sur (darwin 11.4). Edit: Same result on most recent commit as well. I am also collecting profiling data from gperftools (version 2.8) and symbolizing using llvm-symbolizer (llvm11). Testing on the file @Timmmm provided above, and also on other c/cpp files. Full output:

$ clang++ -Wall -g -O0 -fno-pie -c main.cpp -o main.o
$ clang++ -o main main.o -lprofiler -fno-pie
$ CPUPROFILE=main.prof ./main
k = 1:   2   3   5   7  11  13  17  19  23  29
k = 2:   4   6   9  10  14  15  21  22  25  26
k = 3:   8  12  18  20  27  28  30  42  44  45
k = 4:  16  24  36  40  54  56  60  81  84  88
k = 5:  32  48  72  80 108 112 120 162 168 176
k = 6:  64  96 144 160 216 224 240 324 336 352
k = 7: 128 192 288 320 432 448 480 648 672 704
k = 8: 256 384 576 640 864 896 960129613441408
k = 9: 512 76811521280172817921920259226882816
k = 10:1024153623042560345635843840518453765632
k = 11:2048307246085120691271687680103681075211264
k = 12:40966144921610240138241433615360207362150422528
k = 13:8192122881843220480276482867230720414724300845056
k = 14:16384245763686440960552965734461440829448601690112
k = 15:32768491527372881920110592114688122880165888172032180224
k = 16:6553698304147456163840221184229376245760331776344064360448
k = 17:131072196608294912327680442368458752491520663552688128720896
k = 18:262144393216589824655360884736917504983040132710413762561441792
k = 19:52428878643211796481310720176947218350081966080265420827525122883584
k = 20:1048576157286423592962621440353894436700163932160530841655050245767168
PROFILE: interrupts/evictions/bytes = 812/334/22864
$ PPROF_BINARY_PATH=./main
$ pprof -text -trim=false main.prof
Local symbolization failed for libdyld.dylib: stat /usr/lib/system/libdyld.dylib: no such file or directory
Local symbolization failed for main: error reading Mach-O file /<path/to/root/folder>/./main: could not identify base for /<path/to/root/folder>/./main: __TEXT segment address (0x100000000) > mapping start address (0x0)
Some binary filenames not available. Symbolization may be incomplete.
Try setting PPROF_BINARY_PATH to the search path for local binaries.
File: libc++.1.dylib
Type: cpu
Showing nodes accounting for 8.12s, 100% of 8.12s total
      flat  flat%   sum%        cum   cum%
     7.99s 98.40% 98.40%      7.99s 98.40%  k_prime
     0.03s  0.37% 98.77%      7.14s 87.93%  primes
     0.03s  0.37% 99.14%      0.03s  0.37%  std::__1::__compressed_pair_elem::__get
     0.02s  0.25% 99.38%      0.02s  0.25%  [libsystem_kernel.dylib]
     0.02s  0.25% 99.63%      0.05s  0.62%  std::__1::__list_imp::__sz
     0.02s  0.25% 99.88%      0.05s  0.62%  std::__1::list::size
     0.01s  0.12%   100%      0.01s  0.12%  std::__1::locale::use_facet
         0     0%   100%      7.15s 88.05%  <unknown>
         0     0%   100%      0.03s  0.37%  ProfileHandler::SignalHandler
         0     0%   100%      7.15s 88.05%  [libdyld.dylib]
         0     0%   100%      5.47s 67.36%  [libsystem_platform.dylib]
         0     0%   100%      0.22s  2.71%  [main]
         0     0%   100%      7.15s 88.05%  main
         0     0%   100%      0.03s  0.37%  std::__1::__compressed_pair::first
         0     0%   100%      0.01s  0.12%  std::__1::basic_ostream::operator<<

We can ignore the warning for setting PPROF_BINARY_PATH (note this was already done for the local executable). This warning is just reiterating that we do not have symbol information for some binaries (e.g libdyld.dylib). This occurs in Big Sur because Apple changed the filesystem such that *.dylib files can no longer be searched as files within the system. Regardless, symbolizing those libraries won't really matter as we will likely not be optimizing them anyways.

@Timmmm @aalexand Have you guys figured out if the problem is, in fact, with gperftools, and not pprof? If so, are there other profiling tools that work with pprof that do not result in weird symbolization errors. If not, then I might try attempting to see if this persists using other profiling tools.

ayushkarnawat avatar Sep 14 '21 16:09 ayushkarnawat

Ok, by compiling with PIE and ASLR enabled (as suggested in https://github.com/google/pprof/issues/311#issuecomment-363398950), I no longer get the warning that symbolization failed on the local binary ./main. This work on both latest HEAD or commit b9804c9 of pprof. However, I am still seeing <unknown>'s in the pprof output:

Local symbolization failed for libdyld.dylib: stat /usr/lib/system/libdyld.dylib: no such file or directory
Local symbolization failed for libsystem_c.dylib: stat /usr/lib/system/libsystem_c.dylib: no such file or directory
Some binary filenames not available. Symbolization may be incomplete.
Try setting PPROF_BINARY_PATH to the search path for local binaries.
File: libc++.1.dylib
Type: cpu
Showing nodes accounting for 9.13s, 100% of 9.13s total
      flat  flat%   sum%        cum   cum%
     9.01s 98.69% 98.69%      9.01s 98.69%  k_prime
     0.04s  0.44% 99.12%      7.47s 81.82%  primes
     0.03s  0.33% 99.45%      0.03s  0.33%  std::__1::locale::use_facet
     0.02s  0.22% 99.67%      0.04s  0.44%  std::__1::list::size
     0.01s  0.11% 99.78%      0.01s  0.11%  [libsystem_kernel.dylib]
     0.01s  0.11% 99.89%      0.01s  0.11%  std::__1::__compressed_pair::first
     0.01s  0.11%   100%      0.01s  0.11%  std::__1::__compressed_pair_elem::__get
         0     0%   100%      7.50s 82.15%  <unknown>
         0     0%   100%      7.50s 82.15%  [libdyld.dylib]
         0     0%   100%      0.01s  0.11%  [libsystem_c.dylib]
         0     0%   100%      3.65s 39.98%  [libsystem_platform.dylib]
         0     0%   100%      7.50s 82.15%  main
         0     0%   100%      0.02s  0.22%  std::__1::__list_imp::__sz
         0     0%   100%      0.01s  0.11%  std::__1::__stdoutbuf::overflow
         0     0%   100%      0.04s  0.44%  std::__1::basic_ostream::operator<<
         0     0%   100%      0.01s  0.11%  std::__1::basic_ostream::put
         0     0%   100%      0.01s  0.11%  std::__1::endl

Are these <unknown> part of the *.dylib's or something else? How can one check? Since these shared libraries are no longer present as files in MacOS, could it be that llvm-symbolizer/addr2line cannot recognize them and therefore returns <unknown>.

ayushkarnawat avatar Sep 15 '21 18:09 ayushkarnawat