promtool: add dump-series

Open smallfish opened this issue 1 year ago • 10 comments

Modify the `promtool tsdb dump` code so that it prints only series data (ignoring samples).

example:

promtool tsdb dump --format seriesjson | jq -r '.xxx'
...

Does this PR introduce a user-facing change?

[FEATURE] Promtool tsdb dump: add `--format seriesjson` option to output just series labels in JSON format.

smallfish avatar Jan 16 '24 12:01 smallfish

Hello from the bug-scrub! Sorry your PR got no attention till now.

However, I think the task got simpler: the `dumpSamples` function now takes a formatting function. I think you can make a formatter which just prints the labels, i.e. based on these lines: https://github.com/prometheus/prometheus/blob/6f1fd4be96e5096a604f44b9ad6b62c68fafac3d/cmd/promtool/tsdb.go#L756-L759

bboreham avatar Aug 13 '24 11:08 bboreham

Thanks for the reply :) dump-series prints the series string and deduplicates the result (by adding a cache map).

smallfish avatar Aug 13 '24 11:08 smallfish

Yes, I’m proposing you write a function which prints the string, right after the lines I quoted. How can you get duplicate series?

bboreham avatar Aug 13 '24 12:08 bboreham

Hi @bboreham, the PR has new commits, and the docs and test cases are updated too :)

smallfish avatar Aug 15 '24 09:08 smallfish

Just my 2 cents.

Note that if the JSON formatting isn't required, the series can also be fetched with something like `promtool tsdb dump | sed 's/} .*/}/' | sort -u`

Maybe we can go further and add `tsdb dump-json` to dump samples in JSON (folks were asking for this in the past) and some scripting ~~like above~~ could be used to only get the series

machine424 avatar Feb 19 '25 15:02 machine424
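
The pipeline suggested above can be tried without a TSDB by feeding it sample text. This is only a sketch: it assumes each dump line has the form "labels, value, timestamp", and the `dump_sample` and `series_only` helper names and the sample series are made up for illustration.

```shell
# Hypothetical stand-in for `promtool tsdb dump` output (assumed line format:
# label set, then sample value, then timestamp).
dump_sample() {
  cat <<'EOF'
{__name__="up", job="node"} 1 1700000000000
{__name__="up", job="node"} 1 1700000015000
{__name__="up", job="prometheus"} 1 1700000000000
EOF
}

# Drop everything after the first "} " and deduplicate the remaining label sets.
series_only() {
  sed 's/} .*/}/' | sort -u
}

dump_sample | series_only
```

One caveat with this approach: the `sed` expression cuts at the first `} ` on the line, so a label value that itself contains that sequence would be truncated early.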

> Just my 2 cents.
>
> Note that if the JSON formatting isn't required, the series can also be fetched with something like `promtool tsdb dump | sed 's/} .*/}/' | sort -u`
>
> Maybe we can go further and add `tsdb dump-json` to dump samples in JSON (folks were asking for this in the past) and some scripting ~~like above~~ could be used to only get the series

Yes, JSON formatting is required (to facilitate subsequent filtering and selection using the jq tool, as well as for importing into DuckDB for SQL-based statistical sorting and analysis).

smallfish avatar Feb 21 '25 06:02 smallfish

> Thanks for continuing to work on the PR. I think it can be simplified.
>
> Note also that several commits are missing 'DCO signoff', which we need to accept your contribution.

Thank you for your response. I have resubmitted the content of this PR and included a signature.

smallfish avatar Feb 21 '25 06:02 smallfish

> Maybe we can go further and add tsdb dump-json

I don't really like the potential for the proliferation of dump-* options. I think we should do `promtool tsdb dump --format=ndjson-labels` (this should simplify the code too, as the options won't need to be duplicated then).

It also leaves the door open to later have an ndjson or other option, which as @machine424 points out is a desire. (I think it makes sense to call this explicitly ndjson just in case one day it makes sense to output a whole JSON document, maybe based on an opentelemetry proto or similar.)

dgl avatar Feb 24 '25 01:02 dgl
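
For context, NDJSON is newline-delimited JSON: one complete JSON object per line, rather than one enclosing document. The sketch below shows why that suits line-oriented tools. The field layout of the sample records is an assumption, not confirmed promtool output, and the helper names are hypothetical.

```shell
# Hypothetical NDJSON series output: one JSON object of labels per line.
# The exact field layout is an assumption for illustration only.
ndjson_sample() {
  cat <<'EOF'
{"__name__":"up","job":"node"}
{"__name__":"up","job":"prometheus"}
{"__name__":"go_goroutines","job":"prometheus"}
EOF
}

# Each record is a single line, so counting series is just counting lines.
count_series() {
  wc -l | tr -d ' '
}

# Crude line-level filter on one label pair; a JSON-aware tool such as jq
# would be more robust against spacing or key-order differences.
series_for_job() {
  grep -F "\"job\":\"$1\""
}

ndjson_sample | count_series
ndjson_sample | series_for_job prometheus
```

A single enclosing JSON document would need to be parsed in full before any of this could be done, which is the distinction the explicit "ndjson" name is meant to keep visible.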

Hello from the bug scrub!

Seems like this comment by @dgl https://github.com/prometheus/prometheus/pull/13409#discussion_r1966963159 isn't addressed yet. Is this something you're coming back to, @smallfish? And there are some more comments as well.

krajorama avatar Aug 05 '25 11:08 krajorama

I squashed the existing commits into one, then added more commits implementing the review suggestions. @dgl I went with --format=seriesjson because I found ndjson too cryptic. I guess it meant "no data" ?

bboreham avatar Nov 10 '25 18:11 bboreham