Eliminate false sharing possibility
The current implementation leaves non-zero possibility of false sharing between ConcurrentMapShared structs since they're allocated on heap in a loop in the New function. Hence, there is a possibility that some pairs of these structs get on the same cache line. In such situation (it's called false sharing), even a read from different shards would lead to contention (and CPU cache traffic) since RLock() does a write (atomic increment over an internal counter; lock xadd instruction in x86) from the memory perspective.
The simplest way to reproduce false sharing is to run the following benchmark:
func BenchmarkGetSame(b *testing.B) {
SHARD_COUNT = 2 // set num of shards to 2 for the simpler reproduction
m := New()
cdone := make(chan struct{}, 2)
m.Set("key1", "value")
m.Set("key2", "value")
b.ResetTimer()
go func() {
for i := 0; i < b.N; i++ {
// it's safer to use the returned value, but the compiler doesn't eliminate this code for me
m.Get("key1")
}
cdone <- struct{}{}
}()
go func() {
for i := 0; i < b.N; i++ {
m.Get("key2")
}
cdone <- struct{}{}
}()
<-cdone
<-cdone
}
Environment: Ubuntu 20.04, i5-8300h CPU, Go 1.16.2
Baseline:
BenchmarkGetSame-8 11753236 100.1 ns/op 0 B/op 0 allocs/op
Note: there is a small chance that the shards will get allocated with enough distance, so make sure to run the benchmark multiple times in order to see the worst case scenario. On my machine the worst case occurred most of the runs.
This PR:
BenchmarkGetSame-8 52440186 22.13 ns/op 0 B/op 0 allocs/op
Consider this PR to be an experiment aimed to demonstrate the false sharing effect and an approach to get rid of it.