An unobtrusive LRU for the best time cache I've used for Go
I needed a time AND LRU cache, and this is by far the best time cache I've found for Go. As such, I've injected LRU capabilities that simply:
- ensure that when an item is added, if the number of unexpired items is at or above the specified maximum, the oldest item (based on access time) is expired
- update the access time whenever an item is retrieved ("gotten") or set
The existing janitor system is unchanged and is still used to actually delete the items.
The only change from a consumer's standpoint is one additional parameter at the end of a "New" call, specifying the maximum number of items to allow, or 0 to disable the LRU entirely.
I updated the "test" file so that it would run as-is with LRU disabled. I used my own test harness that isn't horribly portable. I'll work on updating yours with LRU tests in my Copious Free Time (patent pending) if you're interested in accepting this pull.
This is 1/2 of what #5 is asking for.
Thanks for the pull request.
This is a big change, and my main concern is that it breaks API compatibility. If this is to be merged, it needs to be changed so that the LRU functionality does not alter any existing methods.
The best option might be to create a completely new package that offers both. I'll think about it a bit.
Thanks for getting back to me.
If a separate "New" function, à la "NewWithLRU", is used instead, there would be no API change. If that's sufficient, I'll update the pull; if not, I'll maintain the fork.
If you're interested: I significantly changed how the LRU is handled. The janitor routine now deals with everything asynchronously, reducing list walks and significantly improving performance when the cache is very large and under pressure. Even with extremely large caches there is no measurable performance impact versus the pre-LRU code.
Very interesting, thanks. Yeah, a new New func would be sufficient as long as it doesn't meaningfully impact anything else.
I'll do a little testing as well this week, but I think it makes sense to merge this.
.New() has been reverted to its previous 2-argument signature, and .NewWithLRU() has been added with the 3-argument set. I also made some suggested README changes in a separate, subsequent commit; feel free to ignore or edit them as you see fit.
Anything going on here?
I know it's been a long time. I haven't forgotten this, I'm just concerned about the overhead of Atime/Ctime when you are caching many small items and are not using the LRU functionality -- but I can't think of a good solution that doesn't involve copy-pasting almost everything.
It's really hard to gauge what the typical use case is for go-cache, so it's hard to know if this will explode memory usage for no reason or add a useful feature at negligible cost.
Besides this, just a few minor things (Matt, not asking you to do this; I can change this if/when it gets merged):
- Are Ctime/Created actually necessary?
- The NewFrom change technically breaks the API. It should probably be NewFromWithLRU (even if it's a bit silly)
@patrickmn I'll re-review my submissions (while I've used it a ton, I haven't looked at this code in an eon), fix the NewFrom issue, and see what I can come up with for lightening the object size for non-LRU users (who wouldn't want an LRU?! :) )
@patrickmn New commit removes ctime, unbreaks NewFrom(), etc. I do not believe the extra empty "Atime" var adds any extra memory-per-cached-object for non-LRU caches. I cannot measure a difference, at least.
Could you give me write access to your repo? I'd like to push a few changes to bring this up to current master. There's been some significant, but internal, changes recently.
(I could create another repo instead, but I think I'd have to open another PR then.)
Sent an invite to the repo. I can elevate you thereafter. @patrickmn
Thanks @cognusion, accepted.
Ok, I pushed a few commits with some changes, chiefly:
- Changed time checks to use int64 (unixnano) since that's what the cache uses now
- Inlined most expensive operations, only performed time.Now() once, etc. (This makes the code look pretty silly, but is unfortunately required to keep op overhead in the tens-of-nanoseconds range.)
- Changed DeleteLRU to DeleteLRUAmount, and made DeleteLRU not require an argument (just calls DeleteLRUAmount with the overage)
- Added a note that LRU functionality makes Get a slower, write-locked operation
- Added a test
- Some minor style changes
Benchmarks seem OK:

Before LRU changes:

- BenchmarkCacheGetExpiring-4: 57.1 ns/op
- BenchmarkCacheGetNotExpiring-4: 21.2 ns/op

After LRU changes:

- BenchmarkCacheGetExpiring-4: 58.8 ns/op
- BenchmarkCacheGetNotExpiring-4: 21.9 ns/op
- BenchmarkCacheWithLRUGetExpiring-4: 136 ns/op
- BenchmarkCacheWithLRUGetNotExpiring-4: 101 ns/op
The non-LRU cache has virtually identical performance, while the LRU-enabled cache has slower read operations, as expected.
Do these changes look good to you?
I am a little concerned that the unit test isn't really testing the LRU properly. Should probably have a real one...
Looks good, @patrickmn . I'll get back on porting my LRU tests to plain go test. I've been made lazy with Convey :)
@patrickmn Actually, it looks like the LRU is broken, so don't merge; I haven't unwound the problem yet. I just saw you made some more commits, but I haven't looked at them yet.
No worries.
Is the issue that the LRU cleanup doesn't evict the oldest items, just some old items (depending on scan/ring buffer fill order)?
@cognusion in which ways is the LRU broken?
So still no LRU support?
This feature would be really helpful.
In short, if you need a true LRU, this isn't for you. For large (quantity) caches, it will purge "some of the least-recently used items" but not necessarily *the* least-recently used items.
For my use with large caches this is perfectly fine, though I refer to it in jest as an "Eventually Least-Recently Used" cache. I don't plan on working further on it, so whether @patrickmn wants to merge it or not is his call. It works very well for large caches where it's OK that the LRU math isn't spot-on. I've looked into a couple of ways of making it more correct, but they all result in lower read performance and longer pauses during purges, neither of which I've been willing to saddle on this feature.
Closing this PR. I/we moved away from this a few years ago.