promtool: add dump-series
Modify the promtool tsdb dump code to print series data only (samples are ignored).
Example:
promtool tsdb dump --format seriesjson | jq -r '.xxx'
...
Does this PR introduce a user-facing change?
[FEATURE] Promtool tsdb dump: add `--format seriesjson` option to output just series labels in JSON format.
Hello from the bug-scrub! Sorry your PR got no attention till now.
However, I think the task got simpler: the dumpSamples function now takes a formatting function.
I think you can make a formatter which just prints the labels, i.e. based on these lines:
https://github.com/prometheus/prometheus/blob/6f1fd4be96e5096a604f44b9ad6b62c68fafac3d/cmd/promtool/tsdb.go#L756-L759
Thanks for the reply :) dump-series prints each series string and deduplicates the result (using a cache map).
Yes I’m proposing you write a function which prints the string, right after the lines I quoted. How can you get duplicate series?
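For illustration, a labels-only formatter with the "cache map" deduplication mentioned above could look roughly like the sketch below. This is not the code in tsdb.go: `seriesPrinter`, `printSeries` and `dumpSeries` are hypothetical names, and labels are modelled as a plain `map[string]string` rather than the Prometheus labels type.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// seriesPrinter is a hypothetical helper, not part of promtool.
// It writes each distinct label set once, as one JSON object per line.
type seriesPrinter struct {
	w    io.Writer
	seen map[string]struct{} // the "cache map" used for deduplication
}

func newSeriesPrinter(w io.Writer) *seriesPrinter {
	return &seriesPrinter{w: w, seen: make(map[string]struct{})}
}

// printSeries looks only at the labels; sample values never reach it.
func (p *seriesPrinter) printSeries(lbls map[string]string) error {
	// encoding/json sorts map keys, so equal label sets encode identically.
	b, err := json.Marshal(lbls)
	if err != nil {
		return err
	}
	key := string(b)
	if _, dup := p.seen[key]; dup {
		return nil // already printed this series
	}
	p.seen[key] = struct{}{}
	_, err = fmt.Fprintln(p.w, key)
	return err
}

// dumpSeries runs the printer over a slice of label sets and returns the output.
func dumpSeries(series []map[string]string) (string, error) {
	var sb strings.Builder
	p := newSeriesPrinter(&sb)
	for _, s := range series {
		if err := p.printSeries(s); err != nil {
			return "", err
		}
	}
	return sb.String(), nil
}
```

If the caller already yields each series exactly once, the `seen` map is unnecessary and the formatter reduces to the marshal-and-print step.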
Hi @bboreham, the PR has new commits, and the docs and test cases are updated too :)
Just my 2 cents.
Note that if the JSON formatting isn't required, the series can also be fetched with something like `promtool tsdb dump | sed 's/} .*/}/' | sort -u`
Maybe we can go further and add tsdb dump-json to dump samples in json (folks were asking for this in the past) and some scripting ~~like above~~ could be used to only get the series
Yes, JSON formatting is required (to facilitate subsequent filtering and selection using the jq tool, as well as for importing into DuckDB for SQL-based statistical sorting and analysis).
Thanks for continuing to work on the PR. I think it can be simplified.
Note also that several commits are missing 'DCO signoff', which we need to accept your contribution.
Thank you for your response. I have resubmitted the content of this PR and included a signature.
> Maybe we can go further and add tsdb dump-json
I don't really like the potential for the proliferation of dump-* options. I think we should do promtool tsdb dump --format=ndjson-labels (this should simplify the code too, as the options won't need to be duplicated then).
It also leaves the door open to later have an ndjson or other option, which as @machine424 points out is a desire. (I think it makes sense to call this explicitly ndjson just in case one day it makes sense to output a whole JSON document, maybe based on an opentelemetry proto or similar.)
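For context, ndjson here means newline-delimited JSON: one JSON value per line rather than a single enclosing document, so a consumer can stream it. A minimal sketch of such a consumer follows; it assumes each line is a flat object of label names to values (the actual output shape of this PR may differ), and `countByName` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"io"
	"strings"
)

// countByName reads one JSON object of labels per line and counts
// series per metric name. The flat name-to-value shape is an assumption.
func countByName(r io.Reader) (map[string]int, error) {
	counts := make(map[string]int)
	dec := json.NewDecoder(r) // decodes a stream of consecutive JSON values
	for {
		var lbls map[string]string
		err := dec.Decode(&lbls)
		if err == io.EOF {
			return counts, nil
		}
		if err != nil {
			return nil, err
		}
		counts[lbls["__name__"]]++
	}
}

// countByNameString is a convenience wrapper for in-memory input.
func countByNameString(s string) (map[string]int, error) {
	return countByName(strings.NewReader(s))
}
```

Because every line is a complete value, no line depends on any other, which is what makes line-oriented tools and streaming decoders work on it.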
Hello from the bug scrub!
Seems like this comment by @dgl https://github.com/prometheus/prometheus/pull/13409#discussion_r1966963159 isn't addressed yet. Is this something you're coming back to, @smallfish? And there are some more comments as well.
I squashed the existing commits into one, then added more commits implementing the review suggestions.
@dgl I went with --format=seriesjson because I found ndjson too cryptic. I guess it meant "no data" ?