gprof2dot
New input filter for FlameGraph collapse output
FlameGraph is another profiling visualization tool. It groks a simple text-based line-oriented format. It ships with a series of stackcollapse
scripts which generate the expected input from different tools (gdb, instruments, jstack, ljp, perf, GDB based Poor Man's profiler, stap, VTune).
It might be interesting to add a parser for this format:
- Any tool implemented by FlameGraph could be supported directly;
- The format being simple, it is very easy to post-process the stackcollapse data using grep (in order to focus on a given part of the execution), sed, perl, awk and friends. One example of this is the stackcollapse-recursive script, which merges direct recursive calls.
I might try to implement this when I have some time.
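To make the proposal concrete, here is a minimal sketch of what such a parser might compute, assuming each input line has the form `frame1;frame2;...;frameN count` with the root frame first and an integer sample count last. The function name `parse_collapsed` and the three counters are hypothetical; this is not gprof2dot's actual code.

```python
from collections import Counter


def parse_collapsed(lines):
    """Parse collapsed-stack lines into per-function and per-edge sample counts.

    Assumed input format (one stack per line): "root;child;leaf 12".
    """
    total = Counter()  # samples where the function appears anywhere on the stack
    leaf = Counter()   # samples where the function is the innermost frame
    edges = Counter()  # samples per (caller, callee) pair

    for line in lines:
        line = line.strip()
        if not line:
            continue
        # The count is the last space-separated field; frame names may contain spaces.
        stack, _, count = line.rpartition(" ")
        samples = int(count)
        frames = stack.split(";")

        leaf[frames[-1]] += samples
        # Use sets so a recursive function or edge is counted once per stack.
        for name in set(frames):
            total[name] += samples
        for pair in set(zip(frames, frames[1:])):
            edges[pair] += samples

    return total, leaf, edges
```

The `total` and `edges` counters correspond roughly to the node and arrow weights of a call graph, while `leaf` gives the self time of each function.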
I'm not sure how practical it is to import data through a third tool, vs import it directly.
But FlameGraph does look like an interesting tool, so I've added a link to it from the wiki.
One benefit is that if someone anywhere writes some code that emits flamegraph.pl data, they would automatically be able to send it to gprof2dot. For example, I could send the output of this script for the Poor man's profiler.
But another great thing about the line-oriented format is that it's very easy to create processing pipelines from the shell (although it would probably be simpler if the value were before the symbol):
Some examples:
$ perf record -t $tid
$ perf script | sed 's/.*cycles: *[0-9a-f]* *//' |
python Tools/symbolicate-ppc.py ~/.dolphin-emu/Maps/${map}.map |
rankor -r | head
10.05% JIT_Loop (/tmp/perf-15936.map)
3.73% [unknown] (/tmp/perf-15936.map)
1.91% VideoBackendHardware::Video_GatherPipeBursted (/opt/dolphin-2015-05-06/bin/dolphin-emu)
1.39% JIT_PPC_PSMTXConcat (/tmp/perf-15936.map)
1.00% JIT_PPC_zz_051754c_ (/tmp/perf-15936.map)
0.90% JIT_PPC_zz_051751c_ (/tmp/perf-15936.map)
0.71% JIT_PPC_zz_04339d4_ (/tmp/perf-15936.map)
0.59% JIT_PPC_zz_05173e0_ (/tmp/perf-15936.map)
0.57% JIT_PPC_zz_044141c_ (/tmp/perf-15936.map)
0.54% JIT_PPC_zz_01839cc_ (/tmp/perf-15936.map)
$ perf record --call-graph dwarf -t $tid
$ perf script | stackcollapse-perf.pl | sed 's/^CPU;//' |
python Tools/symbolicate-ppc.py ~/.dolphin-emu/Maps/${map}.map |
perl -pe 's/^([^; ]*).*? ([0-9]+?)$/\1 \2/' | stackcollapse-recursive.pl |
awk '{printf "%s %s\n", $2, $1}' | sort -rn | head
5811 JIT_Loop
2396 [unknown]
577 JIT_PPC_PSMTXConcat
464 JIT_PPC___restore_gpr
396 JIT_PPC_zz_0517514_
313 JIT_PPC_zz_04339d4_
290 JIT_PPC_zz_05173e0_
285 JIT_PPC_zz_01839cc_
277 JIT_PPC_zz_04335ac_
269 JIT_PPC_zz_0420b58_
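The last stages of the second pipeline exist only because the count comes after the symbol: awk moves it to the front so that `sort -rn` can order by it. As an illustration, the same step could be sketched in Python; `count_first` is a hypothetical helper, and it assumes each line ends with an integer count.

```python
def count_first(lines, limit=10):
    """Turn "symbol count" lines into "count symbol", largest count first."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the last space so symbols containing spaces stay intact.
        symbol, _, count = line.rpartition(" ")
        rows.append((int(count), symbol))

    # Sort by count only, descending; ties keep their input order.
    rows.sort(key=lambda row: row[0], reverse=True)
    return [f"{count} {symbol}" for count, symbol in rows[:limit]]
```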
I could do:
# I'm only interested in what's happening in MAIN():
perf script | stackcollapse-perf.pl | grep MAIN | gprof2dot.py -f stackcollapse | dot -Tpng -o output.png
# I'm not interested in what's happening in init():
perf script | stackcollapse-perf.pl | grep -v init | gprof2dot.py -f stackcollapse | dot -Tpng -o output.png
# Let's pretend that realloc() is the same thing as malloc():
perf script | stackcollapse-perf.pl | sed 's/realloc/malloc/' | gprof2dot.py -f stackcollapse | dot -Tpng -o output.png
# I want to merge recursive calls:
perf script | stackcollapse-perf.pl | stackcollapse-recursive.pl | grep MAIN | gprof2dot.py -f stackcollapse | dot -Tpng -o output.png
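Filters like these are easy to write because each line is a self-contained stack. As a rough sketch of the last case (merging direct recursive calls), the following Python function collapses consecutive identical frames and re-aggregates the counts. It is written in the spirit of the recursive-merge script, not as a port of it; the name `merge_direct_recursion` is hypothetical, and it assumes `frame;frame;... count` lines with integer counts.

```python
from collections import Counter


def merge_direct_recursion(lines):
    """Collapse runs of identical adjacent frames, then sum counts per stack."""
    merged = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        stack, _, count = line.rpartition(" ")

        frames = []
        for frame in stack.split(";"):
            # Drop a frame only when it repeats its immediate caller.
            if not frames or frames[-1] != frame:
                frames.append(frame)

        merged[";".join(frames)] += int(count)

    # Emit in the same "stack count" format, sorted by stack for stable output.
    return [f"{stack} {samples}" for stack, samples in sorted(merged.items())]
```

Only direct recursion is merged: a stack such as `f;g;f` is left unchanged, because the two `f` frames are not adjacent.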
Another solution would be to write a separate tool, stackcollapse2gprof2dot :)
I've wanted this forever, so I hacked it up over at https://github.com/richvdh/gprof2dot/tree/collapsed_format. I haven't given it much testing yet, but if anyone else finds it useful I could put up a PR.
Almost 4 years later, thank you for this - it helped me a lot recently :)
Given @richvdh did the work, the code changes look alright, and many found this useful, I've merged it. Thanks.