V2: Support DurationNanos
Currently DurationNanos is not set during merge
https://github.com/simonswine/pyroscope/blob/9fdea3b017b7335acf39b36df81e3b6ac8f3de1b/pkg/pprof/merge.go#L160-L162
Just using the grafana/pyroscope link in case your fork is deleted: https://github.com/grafana/pyroscope/blob/9fdea3b017b7335acf39b36df81e3b6ac8f3de1b/pkg/pprof/merge.go#L160-L162
I don't think the problem is in the merge but the input profile has 0 duration somehow. I tried debugging but all I could find is that in v1 the duration is set here, and in V2 I couldn't find any place where it's set in the query_backend
I think you just found where the issue is :)
I think there is a bigger question about if duration should be determined like in v1 where it is basically end - start of the query parameters or if it should be driven by the actual data found.
Also if its driven by the data found, there multiple ways we could determine that.
I guess the simplest would be take min(start) and max(end) of the data and subtract them.
The DurationNanos field does not represent the time span over which profiles were collected. Instead, it indicates the time the profiler was actively collecting samples (i.e., the profile duration), and it is 0 in many cases.
References:
- https://github.com/google/pprof/blob/main/proto/profile.proto#L80
- https://github.com/google/pprof/blob/033d6d78b36af29b927e155f6d56cd62b9f949ac/profile/merge.go#L485
- https://github.com/grafana/pyroscope/blob/2c7e5971dc682cb442da4a5e234d01a2366a42c5/pkg/pprof/merge.go#L187
The proper solution is to read the actual values from the table and sum them up. For that reason, I've left the field blank for now, with the intention to implement this correctly in the future.
Leaving it blank is valid in some cases, whereas setting it to the query time range is never valid (from the pprof's perspective).
Also, we need DurationNanos (along with Period, as sample rate is not fixed in all profilers) at the query time if we want to express CPU time via CPU cores (rate).
The main challenge with DurationNanos is that, when a profile is split by sample labels, multiple rows can reference the same underlying value. If we sum these naively, we'll end up significantly overcounting.
To address this (and related issues), we need to reconstruct the original "source" profile from multiple rows. This can be done reliably using the profile ID field. While this is a bit tricky in v1, it's relatively straightforward in v2. This reconstruction is also essential for accurate aggregations (e.g., average, rate), visualizations like heatmaps, and for listing profiles.
However, with the current representation in the Parquet table, this process isn't very ergonomic – it likely requires loading a large portion of profile entries for a block/dataset into memory and many round-trips to object storage. That's manageable, given the relatively small number of profiles per block (thousands, tens of thousands).
Because Parquet's column-major access pattern doesn't align well with our use case – we need most of the column values of the row, it might be worth considering an alternative format or an alternative schema. Samples should still be stored separately from "profile entries" (or "profile headers").
One final note: the DurationNanos field is dropped during downsampling (along with some others), so any logic depending on it needs to be aware of that.