BlackLab
BlackLab copied to clipboard
Race condition in HitsGroupsTokenFrequencies
Intellij (rightfully) warns about non-atomic operation on volatile variable https://github.com/INL/BlackLab/blob/512719637ab986533ac9c9aaf7a575e7d3e0d586/engine/src/main/java/nl/inl/blacklab/search/results/HitGroupsTokenFrequencies.java#L299
Fix this by replacing the int and long with AtomicInteger and AtomicLong
It won't crash, but group sizes may too low because addition from one thread may be overwritten by subsequent addition in another thread.
Doesn't the fact that globalOccurrences is a ConcurrentHashMap mean this is actually safe? Not 100% sure, so using Atomic* is probably a good idea. Hopefully this doesn't slow things down too much.
Hmm, documentation of ConcurrentHashMap states "retrieval operations do not entail locking", so no, I don't think it prevents it sadly
But the documentation for compute() states "The entire method invocation is performed atomically.", and no other operations can occur at this time, right? (docIds.parallellStream() only calls compute() on globalOccurrences).
Again, seeing how hard this is to figure out for sure, going with Atomic* is probably the safest, or at the very least documenting why we think the code is safe as-is (i.e. the bottom line of this discussion, whatever that may turn out to be :-).