hermit-rs
hermit-rs copied to clipboard
add examples to instrument RustyHermit applications
Add an example to show, how we are able to trace RustyHermit applications. For instance, the following command builds the demo application with tracing support:
RUSTFLAGS="-Z instrument-mcount" cargo build -p rusty_demo --release --features=instrument
Afterwards, we have to create an output directory trace
in the current working directory and run the application within uhyve
.
mkdir trace uhyve -m 512M target/x86_64-unknown-hermit/debug/pi_sequential
The outputs in the trace folder is compatible to the uftrace's Data Format. However, the file with the symbol names and their addresses are missing. In the current stage, we have to create this file by hand. nm
can be used to create a file with all symbol addresses.
nm -n target/x86_64-unknown-hermit/release/pi_sequential > trace/pi_sequential-nm.sym
Our binary, is relocatable and starts at address 0x0
. But the loader move the binary to a valid start address. Consequently, we have to add to all symbol addresses the start address of the kernel. awk
helps us to modify the start addresses:
awk '{$0="0x"$0; printf "%x", $0+4194304; print substr($0, index($0, " "))}' trace/pi_sequential-nm.sym > trace/pi_sequential.sym
In this case, the kernel messages show that the start address is located at 0x400000
(= 4194304
) and the script adds the start address to all symbol addresses.
To visualize the results, we are able to use Perfetto. In this case, we have to convert the trace files to the Chrome trace viewer format as follows:
uftrace dump --chrome > trace.json
Afterwards the json file can be be uploaded to Perfetto and anaylzed.
Is compiling in release mode required for rftrace?
Is compiling in release mode required for rftrace?
No, you have to enable the feature "instrument". If you do it rftrace will be included.
If it's not required for performance reasons, I'd suggest to compile in debug mode instead. This way commit c476784 "remove lto support [...]" is not needed and can be reverted.
Hm, performance analysis in debug mode isn't really an option. In debug mode is the performance always bad. It is difficult to find possible performance issues.
@stlankes In that case I'd recommend adding a new custom profile for rftrace, so that we don't have to disable lto in release mode for an optional feature.
Can we add a custom profile for LTO? I like that cargo build --release
works without strange error messages that force new users to lookup what is broken and how to set the appropriated flags. Or has that behavior changed and cargo build --release
works fine out of the box now?
@stlankes Do we really need instrumentation for all these applications? How about add one example application and put an explanation on how to use the instrument feature in the README of that application?
@jbreitbart That's a good idea! --release
still needs the correct RUSTFLAG to be set, and until cargo implements per profile rustflags, adding a new profile for lto (e,g, release-lto
) is probably the better solution.
@stlankes Do we really need instrumentation for all these applications? How about add one example application and put an explanation on how to use the instrument feature in the README of that application?
All benchmarks don't use the instrument feature. Only hello_world and rusty_demo use it.
@stlankes Do we really need instrumentation for all these applications? How about add one example application and put an explanation on how to use the instrument feature in the README of that application?
All benchmarks don't use the instrument feature. Only hello_world and rusty_demo use it.
Then why did you change pi_sequential as well? ;) But regardless: why do you want to add it to these benchmarks?
I added a new profile to test LTO support. With following command, we optimize rusty_demo
with LTO support.
RUSTFLAGS="-Clinker-plugin-lto" cargo build -p rusty_demo --profile release-lto -Z unstable-options
However, this PR based on a unstable feature of cargo (see https://doc.rust-lang.org/nightly/cargo/reference/unstable.html#custom-named-profiles).
Then why did you change pi_sequential as well? ;) But regardless: why do you want to add it to these benchmarks?
Ok, I added it to the whole package "rusty_demo". :-)
These aren't benchmarks, just a few examples. I want to show it by a few small examples.
How about add one example application and put an explanation on how to use the instrument feature in the README of that application?
I want to add a description into the wiki. I think that only the major points should be published in the README.
bors try
try
Build succeeded:
How about add one example application and put an explanation on how to use the instrument feature in the README of that application?
I want to add a description into the wiki. I think that only the major points should be published in the README.
I was referring to the README of the one instrument example, not the top level README, but if you prefer to have it all three feel free. I'll remind you once you have to update them in case of a change of the tracing. :smile:
Are the examples build with tracing enabled? Can we do that as part of the CI pipeline?
Per default tracing is disabled. But I can provide an example for the pipeline
bors try
bors try
bors try
bors try
bors try
I need to revise the PR. I currently I revise @tlambertz crate rftrace. It doesn't work with the latest nightly compiler.
Compiling on latest nightly is a bit annoying. I haven't tested rftrace recently, but have been using uftrace some.
Upstream LLVM changed how the mcount symbol is added, and you now have to manually specify LLVM passes. I also found it helpful to force frame pointers, not sure if relevant here.
The required passes depend on the rust version.
Older versions require:
CARGO_ENCODED_RUSTFLAGS=$'-C\x1fforce-frame-pointers=yes\x1f-Zinstrument-mcount\x1f-C\x1fpasses=ee-instrument post-inline-ee-instrument'
(there are spaces in the passes argument, so we have to use encoded rustflags)
Newer versions:
RUSTFLAGS=$'-C force-frame-pointers=yes -Zinstrument-mcount -Cpasses=ee-instrument<post-inline>'
There might be issues with LTO and mcount as well, where inlined functions still include the mcount call. I haven't investigated anything here, just going from reports in the issue.
Relevant issues are: https://github.com/rust-lang/rust/issues/92109 https://github.com/namhyung/uftrace/issues/1392 https://github.com/llvm/llvm-project/issues/52853 https://github.com/rust-lang/rust/pull/96238