coz fails silently on my rust program

Open asg0451 opened this issue 3 years ago • 7 comments

When I run my program with coz, it exits without doing anything:

$ cargo b --release && coz run --- ./target/release/cov-breaker >/dev/null

[libcoz.cpp:100] bootstrapping coz
[libcoz.cpp:128] Including MAIN, which is /home/miles/rust/cov-breaker/target/release/cov-breaker
[inspect.cpp:325] /usr/lib/coz-profiler/libcoz.so is not in scope
[inspect.cpp:325] /usr/lib/x86_64-linux-gnu/ld-2.31.so is not in scope
[inspect.cpp:325] /usr/lib/x86_64-linux-gnu/libm-2.31.so is not in scope
[inspect.cpp:325] /usr/lib/x86_64-linux-gnu/libdl-2.31.so is not in scope
[inspect.cpp:325] /usr/lib/x86_64-linux-gnu/libpthread-2.31.so is not in scope
[inspect.cpp:325] /usr/lib/x86_64-linux-gnu/libdwarf++.so.0 is not in scope
[inspect.cpp:325] /usr/lib/x86_64-linux-gnu/libgcc_s.so.1 is not in scope
[inspect.cpp:325] /usr/lib/x86_64-linux-gnu/libelf++.so.0 is not in scope
[inspect.cpp:325] /usr/lib/x86_64-linux-gnu/libc-2.31.so is not in scope
[inspect.cpp:325] /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28 is not in scope
[inspect.cpp:325] /usr/lib/x86_64-linux-gnu/librt-2.31.so is not in scope
[inspect.cpp:509] Included source file /home/miles/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/cmp.rs
[inspect.cpp:509] Included source file /home/miles/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/intrinsics.rs
... like a thousand of these ...
[inspect.cpp:317] Including lines from executable /home/miles/rust/cov-breaker/target/release/cov-breaker
[profiler.cpp:75] Starting profiler thread
(exits immediately)
$ echo $?
245

(somewhat) minimal repro: https://github.com/asg0451/rust-coz-breaker

Anecdotally, coz worked fine in my actual program, until I added rayon & channels & parallelism.

Ubuntu 20.04, coz from apt (I don't see a --version flag)

Requires cargo to build, as with all (most) Rust programs

EDIT: It seems to me that rayon is the issue. Replacing

lines.par_iter().for_each_with(tx, |tx, &j| {

with

lines.iter().for_each(|&j| {

(that is, switching from rayon to a regular single-threaded iterator) makes coz work.

EDIT 2: It turns out that removing rayon still results in an abrupt exit with code 245; it's just not immediate.

asg0451 avatar Mar 29 '21 18:03 asg0451
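
The thread never establishes where the 245 comes from. One arithmetic observation that may help when debugging: on Linux only the low 8 bits of an exit status reach the parent, so 245 is what a shell reports when a process exits with -11. Whether something in the coz toolchain actually exits with -11 here is an assumption, not something the posts confirm. The helper names below are hypothetical.

```rust
/// Status a parent process observes when a child exits with `code`:
/// only the low 8 bits survive.
fn observed_status(code: i32) -> u8 {
    (code & 0xff) as u8
}

/// Negative exit code that would be truncated to `status`, if there is one.
/// Statuses below 128 are returned as `None` because they need no sign fix-up.
fn as_negative_code(status: u8) -> Option<i32> {
    if status >= 128 {
        Some(status as i32 - 256)
    } else {
        None
    }
}

fn main() {
    println!("exit(-11) is observed as {}", observed_status(-11));
    println!("245 read as a negative code: {:?}", as_negative_code(245));
}
```

If the negative reading is right, the number 11 would be worth checking against the signal table (`kill -l`), since wrappers sometimes report "killed by signal N" as `-N`.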

+1, I see this too.

[profiler.cpp:75] Starting profiler thread --> abrupt exit with code 245

colinwm avatar Apr 23 '21 18:04 colinwm

I had this same issue, but was able to fix it with the coz::thread_init() fix mentioned in the README.

rafibaum avatar May 05 '21 13:05 rafibaum

I'm not sure if it's the same issue, but I had a similar issue that would trigger the following output:

[libcoz/profiler.h:123] Thread state not found
Aborted!
  0: /usr/bin/../lib64/libcoz.so(_ZN8profiler8on_errorEiP9siginfo_tPv+0x69) [0x7f45682e84a9]
  1: /usr/lib/libc.so.6(+0x42560) [0x7f45680c4560]
  2: /usr/lib/libc.so.6(+0x8f34c) [0x7f456811134c]
  3: /usr/lib/libc.so.6(raise+0x18) [0x7f45680c44b8]
  4: /usr/lib/libc.so.6(abort+0xd3) [0x7f45680ae534]
  5: /usr/bin/../lib64/libcoz.so(pthread_create+0x199) [0x7f45682e5479]
  6: /usr/bin/../lib64/libcoz.so(_ZN8profiler7startupERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEP4lineib+0x213) [0x7f45682e9553]
  7: /usr/bin/../lib64/libcoz.so(_Z8init_cozv+0xdbb) [0x7f45682e491b]
  8: /usr/bin/../lib64/libcoz.so(+0x185dc) [0x7f45682e55dc]
  9: /usr/lib/libc.so.6(+0x2d310) [0x7f45680af310]
  10: /usr/lib/libc.so.6(__libc_start_main+0x81) [0x7f45680af3c1]
  11: target/release/examples/toy(+0x7b85) [0x5619ab95db85]

And I remember having seen this 245 error code in strace or something. I debugged it, and it turned out this happens with Rust programs that are not linked against pthread. I'm not sure how Rust programs that use threads work, but it seems that sometimes pthread is not linked (it doesn't show up in ldd).

The solution I found was to remove | RTLD_NOLOAD from this line.

@llogiq Pinging you since you wrote an article about this issue.

antoyo avatar Mar 09 '22 17:03 antoyo
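
The ldd check above looks at link-time dependencies. A runtime counterpart is to look at what a process actually has mapped, which on Linux is listed in /proc/<pid>/maps. The following is a minimal sketch of that check, not something from the coz sources; `mapped_libs` is a hypothetical helper, and the program inspects itself unless a PID is passed as the first argument.

```rust
use std::collections::BTreeSet;
use std::fs;

/// Return the distinct file paths in a /proc/<pid>/maps listing whose
/// file name contains `needle` (for example "libpthread").
fn mapped_libs(maps: &str, needle: &str) -> BTreeSet<String> {
    maps.lines()
        // When a mapping is file-backed, the path is the sixth field.
        .filter_map(|line| line.split_whitespace().nth(5))
        // Skip pseudo-entries such as [heap] and [stack].
        .filter(|path| path.starts_with('/'))
        .filter(|path| {
            path.rsplit('/')
                .next()
                .map_or(false, |file| file.contains(needle))
        })
        .map(str::to_owned)
        .collect()
}

fn main() {
    // Optional first argument: PID of the process to inspect.
    let pid = std::env::args().nth(1).unwrap_or_else(|| "self".into());
    let maps = fs::read_to_string(format!("/proc/{}/maps", pid)).unwrap_or_default();

    let hits = mapped_libs(&maps, "libpthread");
    if hits.is_empty() {
        println!("no libpthread mapping found for pid {}", pid);
    } else {
        for path in hits {
            println!("mapped: {}", path);
        }
    }
}
```

An empty result only says that no separate libpthread file is mapped; it does not by itself say the program has no threading support.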

I was having a similar issue:

[profiler.cpp:75] Starting profiler thread                                                
[libcoz.cpp:96] init_coz in progress, do not recurse
[profiler.h:123] Thread state not found

I was able to fix the issue by removing the RTLD_NOLOAD flag, as @antoyo also noted. Another workaround that doesn't involve recompiling is to use LD_PRELOAD, for example LD_PRELOAD=/usr/lib/libpthread.so.0.

kalcutter avatar Mar 20 '22 19:03 kalcutter
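
The library path in the LD_PRELOAD workaround differs between distributions, so a small launcher can look for it before starting coz. This is a sketch under stated assumptions: `find_libpthread` and `coz_command` are hypothetical helpers, the first candidate path is the one quoted above, and the second is a guess based on the directory shown in the first post's log. The `coz run --- <target>` invocation is the one used throughout this thread.

```rust
use std::path::Path;
use std::process::Command;

/// Candidate locations (assumption): the first is quoted in this thread,
/// the second is guessed from the Ubuntu library directory in the first log.
const CANDIDATES: [&str; 2] = [
    "/usr/lib/libpthread.so.0",
    "/usr/lib/x86_64-linux-gnu/libpthread.so.0",
];

/// First candidate that exists on this machine, if any.
fn find_libpthread<'a>(candidates: &[&'a str]) -> Option<&'a str> {
    candidates.iter().copied().find(|p| Path::new(p).exists())
}

/// Build the equivalent of `LD_PRELOAD=<lib> coz run --- <target>`
/// without running it.
fn coz_command(lib: &str, target: &str) -> Command {
    let mut cmd = Command::new("coz");
    cmd.args(["run", "---", target]).env("LD_PRELOAD", lib);
    cmd
}

fn main() {
    // Target binary: first argument, defaulting to the repro from the first post.
    let target = std::env::args()
        .nth(1)
        .unwrap_or_else(|| "./target/release/cov-breaker".into());

    match find_libpthread(&CANDIDATES) {
        Some(lib) => {
            let status = coz_command(lib, &target).status();
            println!("coz finished with {:?}", status);
        }
        None => eprintln!("no libpthread.so.0 found in the candidate paths"),
    }
}
```

Keeping command construction separate from execution makes it easy to print or inspect the exact invocation before running it.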

(Quoting @antoyo's comment above about the "Thread state not found" abort and the fix of removing | RTLD_NOLOAD.)

Trying out coz for the first time, this solved my issue with Thread state not found running the benchmark tests. Thanks :D

hugolm84 avatar Mar 25 '22 10:03 hugolm84

coz fails even more silently for my Rust program, quitting with the 245 error code with no output whatsoever.

EDIT: What I've found out so far:

  • I suspected the mold linker may be the culprit, but using Rust defaults didn't change anything.
  • coz::thread_init() in main() and in background threads didn't help.
  • The LD_PRELOAD hack helped: running LD_PRELOAD=/usr/lib/libpthread.so.0 coz run --- target/release/afx produced
    [libcoz.cpp:100] bootstrapping coz
    [libcoz.cpp:128] Including MAIN, which is /home/viluon/Projects/afx/target/release/afx
    terminate called after throwing an instance of 'dwarf::format_error'
      what():  unknown compilation unit version 5
    
  • @hugolm84's workaround described in https://github.com/plasma-umass/coz/issues/107#issuecomment-1078865405 helped me get coz working with my application. Unfortunately, most (all?) of the debug info from my own code (as opposed to dependencies) seems to be ignored by coz. A 24 MB profile from a 20+ minute run with three progress points doesn't even seem to mention my crate's source code. (The main.rs reference in the screenshot points to #[derive(...)] in my code; I think coz is attributing another crate's source, whose debug info it understands, to a location in my code.)

viluon avatar Nov 26 '22 15:11 viluon

I have a similar problem with a C executable:

$ coz run --- ./test
[ some lines I cut out]
[inspect.cpp:316] Including lines from executable /tmp/build/test
[profiler.cpp:75] Starting profiler thread
[libcoz.cpp:96] init_coz in progress, do not recurse
[profiler.h:123] Thread state not found
Aborted!
  0: /usr/bin/../lib64/coz-profiler/libcoz.so(_ZN8profiler8on_errorEiP9siginfo_tPv+0x6c) [0x7fbde5f040ec]
  1: /usr/lib/libc.so.6(+0x389e0) [0x7fbde5b739e0]
  2: /usr/lib/libc.so.6(+0x8864c) [0x7fbde5bc364c]
  3: /usr/lib/libc.so.6(gsignal+0x18) [0x7fbde5b73938]
  4: /usr/lib/libc.so.6(abort+0xd7) [0x7fbde5b5d53d]
  5: /usr/bin/../lib64/coz-profiler/libcoz.so(pthread_create+0x18e) [0x7fbde5f015fe]
  6: /usr/bin/../lib64/coz-profiler/libcoz.so(_ZN8profiler7startupERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEP4lineib+0x20a) [0x7fbde5f0498a]
  7: /usr/bin/../lib64/coz-profiler/libcoz.so(_Z8init_cozv+0xf68) [0x7fbde5f00aa8]
  8: /usr/bin/../lib64/coz-profiler/libcoz.so(+0x1944b) [0x7fbde5f0144b]
  9: /usr/lib/libc.so.6(+0x23290) [0x7fbde5b5e290]
  10: /usr/lib/libc.so.6(__libc_start_main+0x8a) [0x7fbde5b5e34a]
  11: ./test(+0x85d5) [0x563d593a85d5]

So far I haven't been able to get coz to work on anything. My compile flags are: -Wall -Wextra -Wold-style-declaration -Wuninitialized -Wmaybe-uninitialized -Wunused-parameter -g1 -gdwarf-3

Are there any suggestions on how to get this working with a plain C program?

0mhu avatar Jan 22 '23 20:01 0mhu