FR: Input Folded Stacks Format

Open mhansen opened this issue 4 years ago • 3 comments

@felixge's pprofutils folded utility converts the Folded Stacks format to pprof: https://github.com/felixge/pprofutils#folded

Folded Stacks are output by, and convertible from, a truly incredible number of profilers and profile data formats: see the Folded Stacks entry in Profilerpedia: https://docs.google.com/spreadsheets/d/1cVcHofphkQqk1yGeuBPVTit8HQ0oa5SlRM6gkHIagtw/edit#gid=0&range=I25. It seems to be such an easy format to output that many profilers emit it directly, and the profilers that don't emit it end up getting a converter so people can use Brendan Gregg's FlameGraph toolkit.

Bridging this gap between pprof and these profilers and these other data formats might allow pprof to be used to profile many more systems.

I'm thinking that if we have a folded stacks output format, then having a folded stacks input format would allow some powerful roundtripping use cases for ad-hoc analysis (things like using sed to rename stack frames).
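To make the roundtripping idea concrete, here is a minimal sketch of the sed-style rename done in Python instead. It assumes the common folded layout of one stack per line, frames joined by `;` (root first), and a sample count after the last space. The function name `rename_frames` is made up for illustration and is not part of pprof or pprofutils.

```python
import re
from collections import Counter


def rename_frames(folded_text, pattern, replacement):
    """Apply a regex substitution to every frame name, then re-merge stacks.

    Hypothetical helper for illustration; assumes '<f1>;<f2>;... <count>' lines.
    """
    totals = Counter()
    for line in folded_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The count follows the last space; frame names may themselves contain spaces.
        stack, _, count = line.rpartition(" ")
        frames = [re.sub(pattern, replacement, frame) for frame in stack.split(";")]
        # Stacks that become identical after renaming are summed together.
        totals[";".join(frames)] += int(count)
    return "\n".join(f"{stack} {count}" for stack, count in sorted(totals.items()))


folded = "main;foo_v1;bar 3\nmain;foo_v2;bar 2\nmain;baz 1\n"
print(rename_frames(folded, r"foo_v\d+", "foo"))
```

The renamed text could then be fed back through a folded-to-pprof converter, which is the roundtrip described above.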

I wonder if there's interest in supporting this in pprof core, or if you feel this should best live in other binaries, which output profile protos. But I could also argue that pprof has a great protobuf format, which allows external tools to handle this. Thoughts?

mhansen • Oct 09 '21 23:10

#658 is a related feature request for supporting the folded stack format as an output format. While that one seems to make sense, I think we should stop there and leave the "folded stacks" -> profile.proto conversion to external utilities.

Supporting a format as an input format is a stronger coupling than supporting it as an output format.

  • Need to support future format changes at least in some way (i.e. do not crash).
  • Need to validate the input well etc.
  • Parsing input is more code than formatting an output. Compare writing XML vs. parsing XML.

Supporting non-profile.proto output formats has a number of precedents in pprof, so I think adding one more is fine. For input formats, there is only profile.proto, plus support for the perf.data format via invoking an external converter. I'd prefer to leave it at that, keeping profile.proto as the ~only input format for pprof. (Oh, there is also a parser for the legacy text-based input format produced by the https://github.com/gperftools/gperftools CPU profiler, but that should go away one day, I hope.)

Of course, the folded stack format is simple and is unlikely to change etc., so some of the costs listed above are arguably not that high, but they are still there and the decision is long-term.

aalexand • Oct 10 '21 18:10

@mhansen The folded stack format doesn't support indicating inlined frames or filenames, correct? In #649 you used the <root_fn>(filename);<fun2>(filename);<leaf_fn>(filename) <value> format in a test, but the (filename) part seems to be an "unofficial" extension?

aalexand • Oct 10 '21 19:10

That's right. I'm not aware of any way to represent files consistently; my (filename) thing was just an extension for debugging. If we did implement folded stack importing, I'd probably argue we should only support stack frame names (function names) and nothing else, to make the parsing trivial to maintain.
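As a rough sketch of what "function names only" parsing could look like, the snippet below treats each frame as an opaque name and validates just the line shape and the count. The function name `parse_folded` and the error messages are illustrative assumptions, not anything pprof or pprofutils defines.

```python
def parse_folded(text):
    """Parse folded stacks into (frames, count) pairs, root frame first.

    Hypothetical sketch: frames are opaque names, so no filename or
    inlining information is interpreted.
    """
    samples = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue
        # Split on the LAST space so frame names containing spaces survive.
        stack, sep, count_text = line.rpartition(" ")
        if not sep or not stack:
            raise ValueError(f"line {lineno}: expected '<frames> <count>'")
        try:
            count = int(count_text)
        except ValueError:
            raise ValueError(f"line {lineno}: invalid count {count_text!r}") from None
        if count < 0:
            raise ValueError(f"line {lineno}: negative count {count}")
        frames = stack.split(";")
        if any(not frame for frame in frames):
            raise ValueError(f"line {lineno}: empty frame name")
        samples.append((frames, count))
    return samples


print(parse_folded("main;foo;bar 3\nmain;operator new 2\n"))
```

Even this small version needs several validation branches, which illustrates the input-validation cost raised earlier in the thread.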

The closest thing to a standard for inlined frames is adding _[i] as a suffix to their names: https://github.com/brendangregg/FlameGraph/blob/810687f180f3c4929b5d965f54817a5218c9d89b/stackcollapse-perf.pl#L377

But that's only used for coloring the stack frames when generating flamegraphs AFAIK. I'm not sure that's "standardised" enough to support in other parsing contexts.
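For illustration only, a parser that did choose to honor the suffix might separate it from the function name like this. The helper name `split_inline_marker` is hypothetical, and treating `_[i]` as an inlining marker is exactly the convention whose standardisation is in doubt above.

```python
INLINE_SUFFIX = "_[i]"  # suffix convention from stackcollapse-perf.pl; not a formal standard


def split_inline_marker(frame):
    """Return (function_name, is_inlined) for a single folded frame name."""
    if frame.endswith(INLINE_SUFFIX):
        return frame[: -len(INLINE_SUFFIX)], True
    return frame, False


for name in ["main", "do_work_[i]"]:
    print(split_inline_marker(name))
```

A function whose real name happens to end in `_[i]` would be misclassified, which is one reason not to rely on the suffix.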

mhansen • Oct 10 '21 21:10

We don't have plans to support the folded format natively in pprof, closing. When needed, pprofutils should be used.

aalexand • May 02 '23 17:05