carbonapi icon indicating copy to clipboard operation
carbonapi copied to clipboard

Cache more aggressively

Open dgryski opened this issue 11 years ago • 10 comments

Graphite munges timestamps to minutely resolution to increase the chance of a cache hit. It also stores the data returned by the stores so future queries on the same data can also be pulled from the cache.

Decide which of these we want to implement.

dgryski avatar Sep 27 '14 07:09 dgryski

@dgryski why not make cacheTimeout := int32(60) duration user configurable?

Graphite-web implements DEFAULT_CACHE_DURATION which is very useful in my use case as we collect metrics at an interval of 10s. When using carbonapi I have to manually override the Cache Timeout value in Grafana, or disable caching altogether - both of which are either impractical or undesirable.

xneo64 avatar Nov 12 '16 11:11 xneo64

@xneo64 Good plan. I'll do that on Monday when I'm back at work. (If you have a PR, I can review and merge that sooner.)

dgryski avatar Nov 12 '16 11:11 dgryski

Just got to know about this project a couple of days ago, and converted my preproduction environments. While very easy to read, unfortunately I'm inexperienced with Go code, so I'll do you a favour by not submitting a barbaric PR.

I'll definitely be able to make proper + tested contributions in the future

xneo64 avatar Nov 12 '16 11:11 xneo64

The munged timestamps to increase cache hit ratio also an approach that should be considered. Would be interesting to see what effect it has on real traffic. We can probably simulate this given logs.

dgryski avatar Apr 26 '17 10:04 dgryski

@dgryski @Civil are there any realistic near-term plans on implementing the timestamp munging? If you consider handling requests from Grafana, right now caching only has a chance to go in effect if two or more identical queries arive within a single second, right? To me it looks like operating with full-minute timestamps would have a night-and-day performance effect for many production set ups. I am troubleshooting one right now, and the bulk of carbonapi requests are popular Grafana dashboards which repeatedly issue the same queries with request time ranges like [-24h,now] etc., which unfortunately do not get cached because the timestamps are resolved differently even if sent 5s or 10s apart.

jaroslawr avatar Jun 09 '20 11:06 jaroslawr

I'm open to a PR or at least to some one who are willing to test such a feature. As the overall problem here is that neither me nor Damian have production environments that uses carbonapi (at least to my knowledge), so it is very hard to test what benefit it would have in real life and if it won't add any new bugs (bugs within caching could be tricky to diagnose).

Civil avatar Jun 09 '20 13:06 Civil

Actually there was another idea on how to improve loading time for dashboards and hit ratio as well: https://github.com/go-graphite/carbonapi/issues/242

But those approaches can co-exist.

Civil avatar Jun 09 '20 13:06 Civil

@Civil Turns out the problems we experience have a different reason, Grafana will send different maxDataPoints for the same graphs on the same dashboard depending on the screen width (a difference of a few pixels is sufficient for a different value) and carbonapi uses maxDataPoints as part of the cache key, which makes sense as it caches the response already after trimming to maxDataPoints.

Do you think it would be OK to cache the results from the backends rather than this final response? This would allow to reuse the cache for different request formats and different maxDataPoints values, as long as the requested targets and time range is the same, at the expense of having to do the consolidation and marshalling even for requests served from cache. I can try to submit a PR, but I would like to work out something that fixes the issue but also has a chance of being merged upstream.

jaroslawr avatar Jun 10 '20 08:06 jaroslawr

@Civil (Or maybe rather then replacing one with the other a possibility to configure a second cache for the backend results would be more acceptable as a PR?)

jaroslawr avatar Jun 10 '20 09:06 jaroslawr

In that case I would actually ensure that result is cached after all the math is done but before maxDataPoints is applied.

As currently maxDataPoints is applied to the result after all the functions are evaluated: https://github.com/go-graphite/carbonapi/blob/ff14e6bd9807da234917017ac817b84dd6dfefdb/cmd/carbonapi/http/render_handler.go#L296-L300

Even if the maxDataPoints will be taken into account by backend (since recently it's got passed to backend, as far as I remember) if you would cache result for target + start + end combination that would save you all the round-trips and math around.

Civil avatar Jun 10 '20 15:06 Civil