rdf4j
rdf4j copied to clipboard
Review thread safety of GroupIterator and AbstractCloseableIteration
The GroupIterator has a significant amount of code around threadsafety but is not thread safe to use.
I think most of it can be removed and the map/iteration code should attempt to be lazy.
For AbstractCloseableIteration we should not expect multiple threads to invoke hasNext/next methods as this is always unsafe. Also close() should be safe to call more than once therefore the whole threadsafe logic should not exist.
I've traced back the introduction of internal locking in GroupIterator to this commit by me, from 2012: https://bitbucket.org/openrdf/sesame/commits/03afc450148f01effee86bb4929a1bd477c4a41c . The actual issue was about lazy evaluation rather than concurrency, and I can't for the life of me figure out why I'd introduce that mutex at that point.
Given that removing this is not in any way affecting the public API or even the javadoc, I'm fine with treating this as a bug.
As for AbstractCloseableIteration, I haven't traced it back but I guess the same logic holds. The current incarnation of the code was actually a performance improvement done by @hmottestad to get rid of a (slow) AtomicBoolean, but I can't remember when/why the original mutex was put in place.
I think at the time the thinking may simply have been "better safe than quick" :)