pg_prometheus
Unique metrics
In production, we have processes uploading metrics through an API I wrote, which results in duplicated data.
As a local workaround (applied through database migrations in the API), I altered the structure slightly:
- Adding a unique constraint:
ALTER TABLE metrics_values ADD CONSTRAINT unique_time_value_labels_id UNIQUE (time, value, labels_id);
- Adding an "ON CONFLICT" clause to the function prometheus.insert_view_normal(), at line 149 (https://github.com/timescale/pg_prometheus/blob/06b4c838a65ec096a236ba38f0046d03111dc1c7/sql/prometheus.sql#L149), changing that line to:
EXECUTE format('INSERT INTO %I (time, value, labels_id) VALUES (%L, %L, %L) ON CONFLICT DO NOTHING',
With both changes, metrics can still be pushed, and no duplicates are stored afterwards.
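A minimal sketch of how the two changes interact, run against a hypothetical stand-in table rather than the real pg_prometheus schema. The table name metrics_values_demo, the constraint name, the column types, and the sample rows are all assumptions made for illustration:

```sql
-- Hypothetical stand-in for the normalized values table; the real table
-- is created by pg_prometheus and may differ in types and indexes.
CREATE TABLE metrics_values_demo (
    time      TIMESTAMPTZ      NOT NULL,
    value     DOUBLE PRECISION NOT NULL,
    labels_id INTEGER          NOT NULL
);

-- Same shape as the constraint proposed above.
ALTER TABLE metrics_values_demo
    ADD CONSTRAINT unique_time_value_labels_id_demo
    UNIQUE (time, value, labels_id);

-- First insert: the row is stored.
INSERT INTO metrics_values_demo (time, value, labels_id)
VALUES ('2019-01-01 00:00:00+00', 1.5, 1)
ON CONFLICT DO NOTHING;

-- Identical row: skipped silently instead of raising a unique violation.
INSERT INTO metrics_values_demo (time, value, labels_id)
VALUES ('2019-01-01 00:00:00+00', 1.5, 1)
ON CONFLICT DO NOTHING;

-- Same time and labels_id but a different value: still stored,
-- because value is part of the unique key.
INSERT INTO metrics_values_demo (time, value, labels_id)
VALUES ('2019-01-01 00:00:00+00', 2.5, 1)
ON CONFLICT DO NOTHING;

-- Returns 2: the exact duplicate was dropped, the differing value was kept.
SELECT count(*) FROM metrics_values_demo;
```

Note that with value in the key, only exact repeats of a sample are eliminated; two different values reported for the same series at the same timestamp are both kept.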
Would this qualify to be proposed as a PR?
@Knudian Thank you for your proposal. If you want to submit it as a PR, you can prepare a proper PR from your fork. It will be important to add a test for duplicate elimination. For now I will treat this issue as a feature request and will check with the development team about considering it and your solution.
For consistency, it might be good to implement duplicate elimination for the raw schema too.
I did the modifications on my fork (https://github.com/Knudian/pg_prometheus/tree/feature/unique-metrics).
For the tests, I have to admit that I'm totally lost. I can write SQL queries and count the results to check that they match what the unique constraint should produce, but what comes after that?
The test can be a continuation of the existing tests. For example, after the existing test, it inserts new data containing duplicates, then selects the data from the table and compares it with the expected result, which should not contain duplicates. The new inserts can go at https://github.com/timescale/pg_prometheus/blob/master/test/sql/normalized.sql#L45 and https://github.com/timescale/pg_prometheus/blob/master/test/sql/raw.sql#L42, if I'm not missing anything.
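A rough sketch of what such an addition to the normalized test might look like. It assumes the test file has already created a normalized table set whose view is named metrics and whose values table is metrics_values (the names used in the existing test may differ), and that samples are inserted in the Prometheus text format shown in the project README. The sample line itself is made up for illustration:

```sql
-- Assumption: an earlier statement in the test file has run something like
--   SELECT create_prometheus_table('metrics');
-- so that the view "metrics" and the table "metrics_values" exist.

-- Insert the same sample twice through the view.
INSERT INTO metrics VALUES ('cpu_usage{service="nginx",host="machine1"} 34.6 1494595898000');
INSERT INTO metrics VALUES ('cpu_usage{service="nginx",host="machine1"} 34.6 1494595898000');

-- List any (time, value, labels_id) combination stored more than once.
-- With duplicate elimination in place, this should return no rows.
SELECT time, value, labels_id, count(*) AS copies
FROM metrics_values
GROUP BY time, value, labels_id
HAVING count(*) > 1;
```

If the project follows the usual PostgreSQL regression-test layout, the output of these statements would also need to be added to the matching expected-output file, so that the test runner can compare the actual result against it.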
@spolcyn see above! Not sure if you'll have time to get to it in addition to the other things you're doing, but worth looking at.