Partial profile ingestion
A user may want to ingest only a subset of a pprof profile's sample types and drop the rest.
One use case is to "reduce cost": regular Go memory profiles are known to be large because of the cumulative nature of the alloc_* sample types. If we drop the alloc values, the number of resulting samples would be drastically lower (usually an on-disk size of hundreds of kilobytes instead of multiple megabytes). An alternative way to reduce the size of the memory profile is to use godeltaprof (https://github.com/grafana/pyroscope-go/tree/main/godeltaprof). This was requested by one of Grafana Labs' customers. (Note: they said they want only inuse_space, and not even inuse_objects.)
Another use case is to drop "corrupted" values: for example, when ingesting delta memory profiles we may want to drop the delta-ed inuse_* values, and when ingesting non-delta memory profiles we may want to drop the alloc_* values, as they are cumulative. This has been drafted in the past: https://github.com/grafana/pyroscope/issues/3979 https://github.com/grafana/alloy/pull/2929
Dumping some ideas on how we could achieve dropping values per sample type:
- Add and expose configuration options in godeltaprof to let users select which sample types they want.
- Modify alloy to allow dropping alloc_* values instead of computing deltas on them - does not work for godeltaprof.
- Modify alloy to allow dropping alloc_* values for godeltaprof profiles.
- Revive https://github.com/grafana/pyroscope/issues/3979 - modify alloy to include a scaler configuration (allowing some values to be multiplied by zero).
- Introduce a per tenant-service configuration to perform the dropping on the server.
- ? I secretly hope I missed something and it is possible to achieve sample type dropping somehow without code modification, but I could not find a way yet.
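To make it concrete what "dropping a sample type" would mean for any of the options above, here is a minimal sketch. It assumes a heavily simplified, hypothetical profile model (`Profile`, `Sample`, `KeepSampleTypes` are made up for illustration and are not the pprof, alloy, or Pyroscope API): every sample holds one value per sample type, and a sample disappears once all of its remaining values are zero.

```go
package main

import "fmt"

// SampleType, Sample and Profile are a simplified, hypothetical model of a
// pprof profile: each sample carries one value per sample type.
type SampleType struct {
	Type string
	Unit string
}

type Sample struct {
	Stack  []string // simplified: function names instead of location IDs
	Values []int64
}

type Profile struct {
	SampleTypes []SampleType
	Samples     []Sample
}

// KeepSampleTypes returns a copy of p that contains only the sample types
// listed in keep. Samples whose remaining values are all zero are removed.
func KeepSampleTypes(p Profile, keep map[string]bool) Profile {
	var idx []int // indexes of the value columns that survive
	out := Profile{}
	for i, st := range p.SampleTypes {
		if keep[st.Type] {
			idx = append(idx, i)
			out.SampleTypes = append(out.SampleTypes, st)
		}
	}
	for _, s := range p.Samples {
		values := make([]int64, 0, len(idx))
		nonZero := false
		for _, i := range idx {
			values = append(values, s.Values[i])
			if s.Values[i] != 0 {
				nonZero = true
			}
		}
		if nonZero {
			out.Samples = append(out.Samples, Sample{Stack: s.Stack, Values: values})
		}
	}
	return out
}

// exampleHeapProfile builds made-up data shaped like a Go heap profile.
func exampleHeapProfile() Profile {
	return Profile{
		SampleTypes: []SampleType{
			{"alloc_objects", "count"}, {"alloc_space", "bytes"},
			{"inuse_objects", "count"}, {"inuse_space", "bytes"},
		},
		Samples: []Sample{
			{Stack: []string{"main", "handler"}, Values: []int64{10, 1024, 2, 256}},
			{Stack: []string{"main", "parse"}, Values: []int64{500, 64000, 0, 0}},
			{Stack: []string{"main", "init"}, Values: []int64{3, 300, 0, 0}},
		},
	}
}

func main() {
	p := KeepSampleTypes(exampleHeapProfile(), map[string]bool{"inuse_space": true})
	// Prints the number of remaining sample types and samples.
	fmt.Println(len(p.SampleTypes), len(p.Samples))
}
```

In this made-up data two of the three stacks only have alloc_* values, so keeping inuse_space alone leaves a single sample. That is the effect the cost argument relies on.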
I agree with your assessment that this is not possible today. I wonder if we should do this in two places:
- On the collection side (either alloy or godeltaprof), which will take a while to land everywhere
- On the server side (something that is available to everyone who upgrades Pyroscope)
I do think this use case is a good candidate for relabelling rules, which could also be something that we expose to alloy's pyroscope.relabel.
On the server side the main problem is that we do not split out the different types early on: we wait until we reach the ingester/segment writer to do that (e.g. see here for v2). If we did this earlier (before we do the relabeling), we could simply drop them using relabeling rules.
Very similar labels __unit__/__type__ could also be added in the alloy relabelling pipeline (at least for profiles in pprof format/using push.v1 style APIs).
Wdyt about that?
> Very similar labels __unit__/__type__ could also be added in the alloy relabelling pipeline (at least for profiles in pprof format/using push.v1 style APIs).
It would require introducing a decompress - unmarshal - marshal - compress procedure, right? If so, I would prefer that we solve this without increasing the load on alloy.
On the server side: while the early profile type split seems powerful, I think it makes sense to clarify some implementation details.
- Does it mean we split 1 memory profile into 4 profiles and then each of them has a copy of the symbols? I am slightly worried we would quadruple the segment writer load (unmarshaling profiles, deduplicating symbols).
- Would the introduction of new labels put them into different shards? I guess we can drop the labels after relabeling.
- Currently distributors send multiple profiles concurrently to segment writers, but in practice this is rarely used and usually there is just one profile in flight. How do you think it would affect our latency if we introduce 4 requests there instead of 1?
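To illustrate the first concern, here is a rough sketch of a naive split. It assumes a hypothetical, simplified `Profile` type with a flat symbol table, not the real pprof or Pyroscope data structures; a real implementation could prune unused symbols, which this sketch deliberately does not.

```go
package main

import "fmt"

// Profile is a hypothetical, simplified profile with a flat symbol table.
type Profile struct {
	SampleTypes []string
	Symbols     []string
	Samples     [][]int64 // Samples[i][j] is the value of sample i for sample type j
}

// SplitBySampleType naively produces one single-type profile per sample type.
// Every output profile carries its own full copy of the symbol table.
func SplitBySampleType(p Profile) []Profile {
	out := make([]Profile, 0, len(p.SampleTypes))
	for j, st := range p.SampleTypes {
		split := Profile{
			SampleTypes: []string{st},
			Symbols:     append([]string(nil), p.Symbols...),
		}
		for _, values := range p.Samples {
			if values[j] != 0 {
				split.Samples = append(split.Samples, []int64{values[j]})
			}
		}
		out = append(out, split)
	}
	return out
}

func main() {
	p := Profile{
		SampleTypes: []string{"alloc_objects", "alloc_space", "inuse_objects", "inuse_space"},
		Symbols:     []string{"main", "handler", "parse"},
		Samples:     [][]int64{{10, 1024, 2, 256}, {500, 64000, 0, 0}},
	}
	total := 0
	for _, s := range SplitBySampleType(p) {
		total += len(s.Symbols)
	}
	// Symbols in the original profile vs. symbols summed over all splits.
	fmt.Println(len(p.Symbols), total)
}
```

With 4 sample types, the symbols that downstream components have to unmarshal and deduplicate add up to 4 times the original table in this naive form.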
Another idea is to employ relabeling rules without actually splitting the pprof into multiple profiles.
- Introduce an extra relabeling config, call it sample_type_relabeling_rules.
- Invoke the new relabeling N times, where N is the number of profile types, including the profile type along with the series labels as input.
- If the result of the relabeling is drop, set the corresponding value to zero.
- Compact the profile: remove samples with zero values.
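The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Pyroscope implementation: `DropRule`, `newDropRule`, `ApplySampleTypeRules` and the simplified `Profile` type are hypothetical, and the rule only models a fully anchored regex "drop" action, similar in spirit to Prometheus-style relabeling. The `__type__` and `__unit__` label names are the ones mentioned earlier in this thread.

```go
package main

import (
	"fmt"
	"regexp"
)

type Labels map[string]string

// DropRule is a hypothetical, much-reduced stand-in for a relabeling rule:
// if the source label fully matches Regex, the sample type is dropped.
type DropRule struct {
	SourceLabel string
	Regex       *regexp.Regexp
}

// newDropRule anchors the pattern so that it has to match the whole value.
func newDropRule(sourceLabel, pattern string) DropRule {
	return DropRule{sourceLabel, regexp.MustCompile("^(?:" + pattern + ")$")}
}

type SampleType struct{ Type, Unit string }

// Profile is a simplified model: Samples[i][j] is the value of sample i for
// sample type j.
type Profile struct {
	SampleTypes []SampleType
	Samples     [][]int64
}

// ApplySampleTypeRules runs the rules once per sample type, zeroes the values
// of dropped sample types and then removes samples that are all zero.
func ApplySampleTypeRules(p *Profile, series Labels, rules []DropRule) {
	// Run the rules N times, once per sample type.
	dropped := make([]bool, len(p.SampleTypes))
	for j, st := range p.SampleTypes {
		lbls := Labels{}
		for k, v := range series {
			lbls[k] = v
		}
		lbls["__type__"] = st.Type
		lbls["__unit__"] = st.Unit
		for _, r := range rules {
			if r.Regex.MatchString(lbls[r.SourceLabel]) {
				dropped[j] = true
				break
			}
		}
	}
	// Zero the dropped values and compact the profile in place.
	kept := p.Samples[:0]
	for _, values := range p.Samples {
		nonZero := false
		for j := range values {
			if dropped[j] {
				values[j] = 0
			}
			if values[j] != 0 {
				nonZero = true
			}
		}
		if nonZero {
			kept = append(kept, values)
		}
	}
	p.Samples = kept
}

// exampleProfile builds made-up data shaped like a Go heap profile.
func exampleProfile() Profile {
	return Profile{
		SampleTypes: []SampleType{
			{"alloc_objects", "count"}, {"alloc_space", "bytes"},
			{"inuse_objects", "count"}, {"inuse_space", "bytes"},
		},
		Samples: [][]int64{{10, 1024, 2, 256}, {500, 64000, 0, 0}},
	}
}

func main() {
	p := exampleProfile()
	rules := []DropRule{newDropRule("__type__", "alloc_.*")}
	ApplySampleTypeRules(&p, Labels{"service_name": "checkout"}, rules)
	fmt.Println(p.Samples)
}
```

The sample types stay in the profile and only their values are zeroed, so nothing has to be split or re-symbolized; the second sample disappears because it had alloc_* values only.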
> Another idea is to employ relabeling rules without actually splitting the pprof into multiple profiles.
This is probably the most balanced approach. I think for the N invocations you could easily provide the same labels as the ones added later (__unit__/__type__).
An explicit sample_type relabeling step / setting makes things very clear and it could be done before we do metering of the profile type.