ristretto
ristretto copied to clipboard
Fixes closing of Cache and Policy to be concurrent safe
Before this change, calling Close on the cache while a Set call is in progress could occasionally result in a panic when the Set method tried to write to the (now closed) setBuf
channel. This commit changes Cache and Policy to keep references to the written-to channels in an internals
struct, which is nil-ed out on close and checked in each call site, ensuring that nothing is attempting to write to a closed channel.
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.
One open question: Should we still try to close out the setBuf
channel post Close
, perhaps in a goroutine, for full cleanup?
I can't see the TeamCity build results (no access), but go test ./...
passes locally, for reference.
Thanks for the PR @josephschorr. CI is failing with a data race.
==================
WARNING: DATA RACE
Read at 0x00c0001df0d0 by goroutine 90:
github.com/dgraph-io/ristretto.(*Cache).setInternal()
/home/agent/work/7f32719a4b7aa00/cache.go:296 +0x5d
github.com/dgraph-io/ristretto.(*Cache).SetWithTTL()
/home/agent/work/7f32719a4b7aa00/cache.go:281 +0x18f
github.com/dgraph-io/ristretto.(*Cache).Set()
/home/agent/work/7f32719a4b7aa00/cache.go:273 +0x13e
github.com/dgraph-io/ristretto.TestConcurrentSetAfterClose.func1()
/home/agent/work/7f32719a4b7aa00/cache_test.go:240 +0x128
Previous write at 0x00c0001df0d0 by goroutine 79:
github.com/dgraph-io/ristretto.(*Cache).Close()
/home/agent/work/7f32719a4b7aa00/cache.go:427 +0xcf
github.com/dgraph-io/ristretto.TestConcurrentSetAfterClose()
/home/agent/work/7f32719a4b7aa00/cache_test.go:260 +0x2d1
testing.tRunner()
/usr/local/go/src/testing/testing.go:1050 +0x1eb
Goroutine 90 (running) created at:
github.com/dgraph-io/ristretto.TestConcurrentSetAfterClose()
/home/agent/work/7f32719a4b7aa00/cache_test.go:233 +0x20a
testing.tRunner()
/usr/local/go/src/testing/testing.go:1050 +0x1eb
Goroutine 79 (running) created at:
testing.(*T).Run()
/usr/local/go/src/testing/testing.go:1095 +0x537
testing.runTests.func1()
/usr/local/go/src/testing/testing.go:1339 +0xa6
testing.tRunner()
/usr/local/go/src/testing/testing.go:1050 +0x1eb
testing.runTests()
/usr/local/go/src/testing/testing.go:1337 +0x594
testing.(*M).Run()
/usr/local/go/src/testing/testing.go:1252 +0x2ff
main.main()
_testmain.go:200 +0x223
==================
--- FAIL: TestConcurrentSetAfterClose (0.20s)
testing.go:965: race detected during execution of test
FAIL
FAIL github.com/dgraph-io/ristretto 64.962s
? github.com/dgraph-io/ristretto/contrib/demo [no test files]
? github.com/dgraph-io/ristretto/contrib/memtest [no test files]
ok github.com/dgraph-io/ristretto/sim 0.032s
ok github.com/dgraph-io/ristretto/z 95.954s
ok github.com/dgraph-io/ristretto/z/simd 0.029s
FAIL
Try running the tests with the -race
flag (that's what ./test.sh
runs).
go test -race ./...
D'oh... I forgot the race detector doesn't run automatically. Will address now.
@danielmai So looking more deeply, the race is more or less the "expected behavior" on that variable, which is why I keep a local copy. I can easily fix this by adding a mutex guard around accessing it, but that does mean that there will be some slight slowdown on all the function calls.
Thoughts?
Edit: Decided to add it for safety
I'll defer to @manishrjain to chime in for this PR.
@manishrjain Thoughts on the use of a lock vs, say, a compare and swap?
In terms of timing, we're trying to decide whether to have SpiceDB make use of my branch or wait for this to merge and release, so guidance in that area would also be appreciated.
Hey @josephschorr , for now, my recommendation would be to use your fork. Meanwhile, I'll try to find some time to review this PR and understand it's performance implications.
Any updates regarding this PR ?
Friendly ping @josephschorr @manishrjain
I'm no longer managing this repo. I have my own fork of ristretto that I manage and use.
https://github.com/outcaste-io/ristretto
@josephschorr would you please accept the CLA and resolve the conflict. @joshua-goldstein, @pandeyshubham25, @matthewmcneely could one of you review the code and help with resolving the conflict. Also, let's make sure that the version is updated appropriately.
@akon-dey I'd be happy to rebase my changes, but be aware we've moved to the github.com/outcaste-io/ristretto fork, which fixed the issue as well. This is partially why we stopped asking for updates on our fork containing this branch.
I'll see if I can get this rebased this evening
@josephschorr I was wondering what's the reasoning for using the fork, is not like this repo was abandoned?
@gaby This PR sat for quite a while, so we used our own fork during that time. Once github.com/outcaste-io/ristretto became available, we tested it for the fix, found it worked, and so moved to it rather than relying on a custom fork that would not get updates otherwise.
@gaby This PR sat for quite a while, so we used our own fork during that time. Once github.com/outcaste-io/ristretto became available, we tested it for the fix, found it worked, and so moved to it rather than relying on a custom fork that would not get updates otherwise.
Makes sense, thanks for the feedback!