go-carbon Proposal discussion of a new implementation for tags

disclaimer: the same idea might have already been discussed or even done and my knowledge of the tag implementation in go-carbon might be outdated.

So while we started testing trie+dfa, we notice that on one of the cluster, the new index performs worse than trigram (±100ms -> ±200ms). And in a discussion with @azhiltsov , he revealed to me a replica of the same cluster but with better organized metric names. Because it turns out that people in our company are trying to filter out metrics by some meta data that's contained in the metric path by globbing, which isn't very performant. In the new replica, metric names contains more information and could serve queries better.

for example:

system.servers_app_id_0001.cpu.idle
->
system.app_id.cpu.idle.servers_app_id_0001

The discussion triggered an idea of having dynamic tag support. The current implementation that I remembered in go-carbon is that it would applied tag names as part of the filename, which is not very ideal, because in this way tags are like metric paths, they can't be changed without data loss or sysadmin toil. So in my proposal, tags would be saved independently.

It would use the same line protocol, tags could be added, updated, removed by sending it in to go-caron along with the metrics and data and need to be continuously included.

And in early implementation, I would suggest uniq values shouldn't be included in tags, they should remain in metric paths.

Another key difference to python implementation is that a new globbing syntax:

system.servers*.cpu.idle@{app_id: foo, dc: bar}

And the tag functions in frontend (carbonapi) could still be supported, but discouraged.

New syntax is both good and bad, the good part is metric globbing and data manipulation functions remains separated. the bad part is frontend changes might be required.

P.S. This is not the first time that I propose an extension of globbing syntax: https://github.com/go-graphite/go-whisper/pull/5

P.S. There were some tag effort from @Civil and many others, but it didn't materialize.

Tagging people who I think might be helpful or interested in this issue, sorry for the noise and welcome more people that I missed to comment or dismiss this idea. :) @azhiltsov @Civil @lomik @deniszh @dgryski

Oct 29 '19 08:10 bom-d-van

About the old effort on tags, I've actually have a sort-of-working version of go-carbon that was based on your old PR (and outcome of the hackaton from Early 2018) and it is even sort-of-working (I've tested that querying works, but I haven't extensively checked correctness of that): https://github.com/go-graphite/go-carbon

I haven't made it a PR because it still need more testing, especially around regex match operators. Plus it doesn't make a lot of sense to have only custom TagDB implementation, without supporting the whole range of upstream Tag DBs (even if they are worse in terms of performance), at least from my point of view.

Plus also please note that the Graphite tagging implementation actually have 2 modes:

Tags are in a path. It's straightforward.
Metric (with all tags appeneded) is hashed and only hash is stored on disk.

In both cases, index is separately updated by a POST HTTP request to /tagMultiSeries handle with all paths that should be added to the index.

Also the way how you query data actually makes it easier to modify the set of tags - with a proper query you will still have a solid line. The only drawback is that you will have a lot of different files on disk with few datapoints, but that could be less burden either with cwisper or if there'll be implementation of ceres. Also potentially it can be worked around by more API Handles that would allow to map different set of tags to same filename. But anyway, with the hashed metric names it would be hard to do globbing on a filesystem level, however it would be great to adapt trie+dfa index to do simple globbing on tag values (name becomes just one of the tags).

As about your proposal for syntax, if you'll decide to go that way, I'd suggest to consider Prometheus syntax, as it's known by other people and actually easy to convert to graphite one (I have a proof of concept carbonapi talking to Prometheus, which works just fine)

Oct 29 '19 09:10 Civil

Plus it doesn't make a lot of sense to have only custom TagDB implementation, without supporting the whole range of upstream Tag DBs (even if they are worse in terms of performance), at least from my point of view.

It could made compatible with the python version, but the implementation would focus on keeping tags dynamic and for a better index implementation (even though I'm not sure how good could it be).

Tags are in a path. It's straightforward.

Metric (with all tags appeneded) is hashed and only hash is stored on disk. In both cases, index is separately updated by a POST HTTP request to /tagMultiSeries handle with all paths that should be added to the index.

So yes, the strong point about this new implementation is that it changed neither filename nor metric path, because when tags are updated or removed, ~~history data is lost in both the solutions mentioned,~~ metrics remains the same.

Oct 29 '19 09:10 bom-d-van

So yes, the strong point about this new implementation is that it changed neither filename nor metric path, because when tags are updated or removed,

That sounds like a two parallel issues:

Make a graphite-web compatible tag implementation with all the handlers, etc (and then it could be extra point for go-carbon to be a replacement for upstream carbon)
Add another TagDB variant that will keep track of the tags.

The problem with having the same name is that the query logic will become more complex as you'll need to keep track of timestamp when the tag <-> metric mapping changed. As far as I remember, that was the main reason why during hackaton back in 2018 we went for new metric per each tag set.

Oct 29 '19 10:10 Civil

The problem with having the same name is that the query logic will become more complex as you'll need to keep track of timestamp when the tag <-> metric mapping changed. As far as I remember, that was the main reason why during hackaton back in 2018 we went for new metric per each tag set.

This is the python doc for tags:

When using a tagged naming scheme it is much easier to add or alter individual tags as needed. It is important however to be aware that changing the number of tags reported for a given metric or the value of a tag will create a new database file on disk, so tags should not be used for data that changes over the lifetime of a particular metric.

In the new design, this doesn't happen. When tags is added/changed/removed, no new metric files is generated, only tags related to the metric being updated.

I think this makes it much more flexible and users could add or remove tags without concern. As long metric path remains the same.

Oct 29 '19 11:10 bom-d-van

As usual, it's always about tradeoffs. Either you need to keep track on what changed when (I think this is what you are talking about now?), or live without history (carbonsearch approach) or have a new file whenever there are changes in the tags (as I've said, on a frontend you can create a query that will hide that fact from the end user - they will always get a single line, even if data is fetched from 10 different files).

Oct 29 '19 12:10 Civil