Scalable Long-Term Storage Support Without Custom Export Scripts

Open vyshakprojects opened this issue 9 months ago • 0 comments

Description: Currently, Pixie requires users to write explicit export scripts to send data to an OpenTelemetry (OTel) collector or any other external backend. This model makes long-term storage difficult to scale, especially when different datasets are needed on the fly or across multiple teams.

Expected Behavior: Ideally, Pixie should provide a way to export all or a configurable subset of data automatically, similar to how Prometheus scrapes and exports metrics by default. This would allow users to store and query Pixie data in downstream systems without maintaining custom export scripts for every use case.

Current Behavior: Only data specified in a manually written export script is sent to the OTel collector. This limits observability unless users keep updating or creating export scripts, which adds operational overhead and hinders broader adoption.


Use Case: As part of enabling scalable observability, I want to:

Persist most of the telemetry data Pixie collects for historical analysis.

Avoid writing/changing scripts every time a new data source is needed.

Integrate the data easily into tools like Grafana, with pre-built or auto-generated dashboards.


Possible Solutions/Ideas:

A configuration to auto-export a defined set of common tables (e.g., HTTP events, DNS, CPU metrics, etc.).

A toggle or policy to enable "firehose mode" for full data export.

Enhanced integration with OTel collector to support more plug-and-play pipelines.
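
To make the first two ideas concrete, here is a rough sketch of how an auto-export policy could be resolved into a list of tables. Everything in it is hypothetical: `ExportPolicy`, `resolve_tables`, and the `firehose`/`tables`/`exclude` fields are names invented for illustration and are not existing Pixie settings or APIs, and the table catalog is just an assumed sample.

```python
from dataclasses import dataclass

# Assumed sample of table names for illustration; a real implementation
# would discover the available tables from the cluster.
AVAILABLE_TABLES = ("http_events", "dns_events", "process_stats", "conn_stats")


@dataclass(frozen=True)
class ExportPolicy:
    """Hypothetical auto-export policy (not an existing Pixie setting)."""
    firehose: bool = False   # export every available table
    tables: tuple = ()       # explicit subset, used when firehose is off
    exclude: tuple = ()      # tables to drop in either mode


def resolve_tables(policy, available=AVAILABLE_TABLES):
    """Return the sorted list of tables to export under the policy."""
    unknown = set(policy.tables) - set(available)
    if unknown:
        # Fail early on typos instead of silently exporting nothing.
        raise ValueError(f"unknown tables: {sorted(unknown)}")
    selected = set(available) if policy.firehose else set(policy.tables)
    return sorted(selected - set(policy.exclude))


if __name__ == "__main__":
    subset = ExportPolicy(tables=("http_events", "dns_events"))
    print(resolve_tables(subset))
    firehose = ExportPolicy(firehose=True, exclude=("conn_stats",))
    print(resolve_tables(firehose))
```

With something like this, adding a new data source would mean editing one policy entry rather than writing another export script; the per-table export itself could still be generated from a shared template.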


Additional Context:

Happy to contribute ideas or help test if something is already being planned in this area. Also open to discussing whether this is better solved via a standardized Pixie script library maintained by the community.

vyshakprojects, Apr 08 '25 12:04