
API: add Histogram metric type

Open stevenzwu opened this issue 3 years ago • 1 comments

@rdblue here is the metrics PR. As we discussed, Flink Iceberg sink would need Gauge (e.g. last commit duration) and Histogram (e.g. file size distribution) metrics.

cc @danielcweeks @nastra for additional reviews

stevenzwu, Jul 24 '22 04:07

Had a discussion with @rdblue.

Iceberg MetricsContext should be the way for iceberg-core to expose metrics. Engine-specific metrics (like those of the Flink reader or writer) don't need an extra indirection that translates from Iceberg MetricsContext to Flink metrics; the indirection brings little benefit.

For that reason, I will remove the Gauge metric from this PR. I will also send another PR for the Flink FLIP-27 source, which currently translates from Iceberg MetricsContext to Flink metrics.

For simplicity, we will remove the reservoirSize param from MetricsContext#histogram(name) and hard-code the reservoirSize (probably to 1,024) for now. In the future, when we want to provide more flexibility, maybe we can add a new interface (like ObservationsTracker) that lets users specify the data structure for tracking observations (e.g. FixedReservoirTracker, SketchTracker, etc.).
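To make the idea concrete, here is a rough sketch of what a pluggable tracker could look like. Nothing here is an existing Iceberg API: the `ObservationsTracker` interface, its method names, the `FixedReservoirTracker` constructor, and the seed parameter are all assumptions for illustration. The reservoir uses plain uniform reservoir sampling (Algorithm R), which is one possible choice and not necessarily what the PR ends up with.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical interface, named after the idea above; not an actual Iceberg API.
interface ObservationsTracker {
  // Record one observed value (e.g. a data file size in bytes).
  void update(long value);

  // Total number of values observed so far, including ones no longer retained.
  long count();

  // Sorted copy of the values currently retained by the tracker.
  long[] snapshot();
}

// Hypothetical fixed-size tracker using uniform reservoir sampling (Algorithm R).
class FixedReservoirTracker implements ObservationsTracker {
  private final long[] reservoir;
  private final Random random;
  private long count = 0;

  FixedReservoirTracker(int reservoirSize, long seed) {
    if (reservoirSize <= 0) {
      throw new IllegalArgumentException("reservoirSize must be positive");
    }
    this.reservoir = new long[reservoirSize];
    this.random = new Random(seed);
  }

  @Override
  public synchronized void update(long value) {
    if (count < reservoir.length) {
      // Still filling the reservoir: keep every value.
      reservoir[(int) count] = value;
    } else {
      // Pick a slot in [0, count]; the value is kept only if the slot falls
      // inside the reservoir, i.e. with probability reservoirSize / (count + 1).
      long slot = (long) (random.nextDouble() * (count + 1));
      if (slot < reservoir.length) {
        reservoir[(int) slot] = value;
      }
    }
    count++;
  }

  @Override
  public synchronized long count() {
    return count;
  }

  @Override
  public synchronized long[] snapshot() {
    int size = (int) Math.min(count, reservoir.length);
    long[] copy = Arrays.copyOf(reservoir, size);
    Arrays.sort(copy);
    return copy;
  }
}
```

With something like this, the hard-coded default would amount to `new FixedReservoirTracker(1024, seed)`, and a sketch-based tracker could later be swapped in behind the same interface without changing the histogram's callers.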

stevenzwu, Jul 29 '22 17:07

Thanks, @stevenzwu! Looks great.

rdblue, Aug 19 '22 18:08