
Feature request: max size and/or max objects

Open ncw opened this issue 11 years ago • 12 comments

To make go-cache as useful as memcache, it needs an upper limit on the amount of memory it can use; otherwise it is easy to cache too many things and explode the memory usage.

Ideally it would have a memory limit like memcache does. It could be somewhat approximate as I'm sure it isn't particularly easy to account for memory used in the cache.

A max objects limit as well would probably be useful.

ncw avatar Nov 18 '12 14:11 ncw

I agree it would be useful, but as it stands it would be fairly difficult as you suggest. go-cache, unlike memcached, doesn't process/serialize most of the data, but rather keeps around a pointer to it. Of course, while this makes go-cache orders of magnitude faster, it also means that it isn't distributed in any way (like memcached is), and, unfortunately, that it can't know the size of what it's storing without some serious reflect/unsafe magic/overhead.

The best solution would probably be to make a new cache type that stores items satisfying a Sizeable interface, where foo.Size() represents the item's relative size as an integer. This is how Vitess' LRU cache does it: https://code.google.com/p/vitess/source/browse/go/cache/lru_cache.go This would transfer that responsibility to the user. I'm not totally sure users can easily determine the size in many cases, though, e.g. when they're storing something more complex than a struct with a []byte field, like a map of structs or slices.
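A sketch of what such a Sizeable-based, size-bounded store might look like (the interface, types, and overflow behavior here are hypothetical illustrations, not go-cache's API):

```go
package main

import "fmt"

// Sizeable is a hypothetical interface: the caller reports an item's
// approximate size, as Vitess' LRU cache requires.
type Sizeable interface {
	Size() int
}

// blob is an example item whose size is the length of its payload.
type blob struct {
	data []byte
}

func (b blob) Size() int { return len(b.data) }

// sizeBoundedSet accepts items only while the total reported size
// stays within capacity.
type sizeBoundedSet struct {
	items    map[string]Sizeable
	used     int
	capacity int
}

func (s *sizeBoundedSet) Set(key string, v Sizeable) bool {
	if s.used+v.Size() > s.capacity {
		return false // caller decides what to do on overflow (evict, drop, etc.)
	}
	if old, ok := s.items[key]; ok {
		s.used -= old.Size()
	}
	s.items[key] = v
	s.used += v.Size()
	return true
}

func main() {
	s := &sizeBoundedSet{items: map[string]Sizeable{}, capacity: 10}
	fmt.Println(s.Set("a", blob{make([]byte, 6)})) // true: fits
	fmt.Println(s.Set("b", blob{make([]byte, 6)})) // false: 6+6 > 10
	fmt.Println(s.used)                            // 6
}
```

The hard part, as noted, is that Size() is only as accurate as the user's estimate.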

Another solution would be to simply limit the number of items in the cache at any given time, but that assumes that every cache item is roughly the same size. The biggest problem with this is choosing the right items to expunge when necessary. Such a cache would probably end up becoming a clone of LRU (i.e. it keeps track of when items were last accessed, and uses something other than/in addition to a map for the storage) where every item has the same size, e.g. 1.
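A count-bounded LRU along those lines could be sketched with a map plus container/list (hypothetical code, not go-cache's implementation):

```go
package main

import (
	"container/list"
	"fmt"
)

// entry pairs a key with its value so eviction can also delete from the map.
type entry struct {
	key string
	val interface{}
}

// lruCache is a minimal count-bounded LRU: a map for O(1) lookup plus a
// doubly linked list ordered by recency of use, where every item
// effectively has size 1.
type lruCache struct {
	max   int
	ll    *list.List // front = most recently used
	index map[string]*list.Element
}

func newLRU(max int) *lruCache {
	return &lruCache{max: max, ll: list.New(), index: map[string]*list.Element{}}
}

func (c *lruCache) Get(key string) (interface{}, bool) {
	if el, ok := c.index[key]; ok {
		c.ll.MoveToFront(el) // refresh on access: the extra write LRU imposes on reads
		return el.Value.(*entry).val, true
	}
	return nil, false
}

func (c *lruCache) Set(key string, val interface{}) {
	if el, ok := c.index[key]; ok {
		el.Value.(*entry).val = val
		c.ll.MoveToFront(el)
		return
	}
	c.index[key] = c.ll.PushFront(&entry{key, val})
	if c.ll.Len() > c.max {
		oldest := c.ll.Back()
		c.ll.Remove(oldest)
		delete(c.index, oldest.Value.(*entry).key)
	}
}

func main() {
	c := newLRU(2)
	c.Set("a", 1)
	c.Set("b", 2)
	c.Get("a")    // "a" is now most recently used
	c.Set("c", 3) // evicts "b", the least recently used
	_, ok := c.Get("b")
	fmt.Println(ok) // false
}
```

The parallel list and the MoveToFront on every Get are exactly the bookkeeping overhead discussed in this thread.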

I will keep thinking about the best way to approach this in go-cache, but my gut feeling right now:

  • If you are exhausting memory, perhaps the expiration duration should be shorter?
  • If the expiration duration already is short, and memory usage still explodes, is it safe to assume that there are several frontend/application frontends for this system? In that case, perhaps using dedicated memcached boxes would be preferable? A great benefit here is that you can cache items for a long time, but still reliably expire items from a large cache cluster manually, even if many frontends are using the cache.

patrickmn avatar Nov 26 '12 03:11 patrickmn

Thanks for the detailed reply and the link to the Vitess cache.

I think you are right in that to do the sizing of objects would require the user to help. I was naively thinking that it could be introspected but you are right, the objects might be pointers to arbitrarily complicated structures which may or may not need to be accounted for in the cache. The user could have a go at estimating the size with a Sizeable interface as you suggested.

I still think a total objects limit would be useful. Say you are storing user sessions for 24 hours, but you end up on the wrong end of a botnet that creates a lot of sessions in a very short period of time. That is the kind of scenario in which the memory usage explodes. Flushing user sessions wouldn't be ideal in this case, but it would keep the server running rather than running out of memory.

Memcached could certainly do the job in this case, but I like the idea of doing it really efficiently with go-cache.

ncw avatar Nov 26 '12 12:11 ncw

I would like to note that I too believe a total number of objects limit would be useful.

htoothrot avatar Feb 27 '13 20:02 htoothrot

Sorry for the delay. I am still thinking about how to do this nicely, namely how to evict items the best way without implementing a copy of Vitess (and the overhead of maintaining an extra list of items/refreshing items on Get.) I agree that it's troublesome if somebody explodes your number of items.

patrickmn avatar Apr 09 '13 22:04 patrickmn

Doesn't Redis just randomly evict things from the cache? I think I care more about my cache being memory bounded than about evicting "the wrong" element. Maybe we can make the actual eviction algorithm pluggable and start with random eviction, which does not need fancy bookkeeping.
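A pluggable eviction policy could be sketched as a function type, with random eviction as the first implementation (all names here are hypothetical, not go-cache's API):

```go
package main

import (
	"fmt"
	"math/rand"
)

// Evictor is a hypothetical pluggable policy: given the current keys,
// it returns the key to delete when the cache is full.
type Evictor func(keys []string) string

// RandomEvictor needs no bookkeeping beyond the key set itself.
func RandomEvictor(keys []string) string {
	return keys[rand.Intn(len(keys))]
}

// boundedCache drops one item via its Evictor whenever a new key would
// push it past maxItems.
type boundedCache struct {
	items    map[string]interface{}
	maxItems int
	evict    Evictor
}

func (c *boundedCache) Set(key string, val interface{}) {
	if _, ok := c.items[key]; !ok && len(c.items) >= c.maxItems {
		keys := make([]string, 0, len(c.items))
		for k := range c.items {
			keys = append(keys, k)
		}
		delete(c.items, c.evict(keys))
	}
	c.items[key] = val
}

func main() {
	c := &boundedCache{items: map[string]interface{}{}, maxItems: 2, evict: RandomEvictor}
	c.Set("a", 1)
	c.Set("b", 2)
	c.Set("c", 3)             // one of "a"/"b" is randomly evicted
	fmt.Println(len(c.items)) // 2
}
```

Note that collecting the keys on every eviction is O(n); a real implementation would want something cheaper.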

miekg avatar May 23 '16 13:05 miekg

I like the LRU approach best but with that the concern is overhead for read-heavy loads, and the memory usage of the Ctime/Atime for each cached item.

Random eviction could work, but I don't know if there's a reliable way to implement it, given that the underlying storage is a map. Map traversal order is unpredictable but not truly random, so you can't just traverse, pick the first key, and delete it: https://play.golang.org/p/lUw_1oPTLB But perhaps that is less of a concern for larger maps...
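One workaround (a hypothetical sketch, not something go-cache does) is to keep a slice of keys in sync with the map; swap-delete makes removal O(1), and picking a uniformly random key becomes a single rand.Intn:

```go
package main

import (
	"fmt"
	"math/rand"
)

// randomSet supports O(1) insert, delete, and uniform random pick by keeping
// a parallel slice of keys alongside an index map.
type randomSet struct {
	pos  map[string]int // key -> index into keys
	keys []string
}

func (s *randomSet) Add(k string) {
	if _, ok := s.pos[k]; ok {
		return
	}
	s.pos[k] = len(s.keys)
	s.keys = append(s.keys, k)
}

func (s *randomSet) Remove(k string) {
	i, ok := s.pos[k]
	if !ok {
		return
	}
	last := len(s.keys) - 1
	s.keys[i] = s.keys[last] // swap-delete keeps the slice compact
	s.pos[s.keys[i]] = i
	s.keys = s.keys[:last]
	delete(s.pos, k)
}

func (s *randomSet) Random() string {
	return s.keys[rand.Intn(len(s.keys))]
}

func main() {
	s := &randomSet{pos: map[string]int{}}
	s.Add("a")
	s.Add("b")
	s.Add("c")
	s.Remove("b")
	k := s.Random()
	fmt.Println(k == "a" || k == "c") // true
}
```

The cost is the parallel structure itself, which is precisely the overhead being weighed in this thread.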

patrickmn avatar Apr 17 '17 22:04 patrickmn

Random eviction is fine for lazy caches but when your cache is read-intense, the objects are expensive to generate, and you end up evicting some hot items (and thus immediately cache-miss), it hurts.

cognusion avatar Apr 18 '17 12:04 cognusion

Interesting timing; this just hit HN: https://danluu.com/2choices-eviction/

We could conceivably have both options, but LRU still seems best given that we don't have a good way of selecting random items from the cache without maintaining some parallel data structure.

patrickmn avatar Apr 19 '17 13:04 patrickmn

I'm also going to vote for this feature. I'll still use this, but with major caution. A bot-attack would kill my service without the upper limit.

I was going to use this for caching user objects keyed by session keys so I don't have to hit my DB on every request, so the expiration time is great for this. I'd be OK with randomly dropping entries when over the size limit. It would be even better if it just dumped the oldest object, but I imagine there'd be a performance hit for that.

chris-pikul avatar May 12 '17 16:05 chris-pikul

Good idea, but I think a design that limits max memory would be better.

rfyiamcool avatar Nov 27 '17 07:11 rfyiamcool

This seems like a major missing feature. There are use cases where large eviction times totally make sense, but the cache must be memory bound, otherwise the service will eventually run out of memory and die. Any sort of limitation in that regard is much better than nothing at all IMO. I added this cache to our system with great ease, but will probably try to replace it with another one that supports this vital feature.

v-antech avatar Jul 07 '20 07:07 v-antech

I've started a PR #146

I'll write up some more benchmarking support for this, but it should be a pretty good way of limiting growth. Since it's a cache, invisibly failing writes is not that big of a deal.

aaomidi avatar Dec 13 '21 20:12 aaomidi