YangJie
> nit: Btw, from a completeness point of view, DB.get makes sense to include :-)

OK ~
Also cc @tgravescs, do you have time to help review this PR? Thanks ~
Thanks @mridulm @tgravescs @zhouyejoe @dongjoon-hyun !!! I will do the RocksDB-related work next week, because some other work needs to be completed this week ~
The exceptions reported in SPARK-39696 are as follows:

```
2022-06-21 18:17:49.289Z ERROR [executor-heartbeater] org.apache.spark.util.Utils - Uncaught exception in thread executor-heartbeater
java.util.ConcurrentModificationException: mutation occurred during iteration
	at scala.collection.mutable.MutationTracker$.checkMutations(MutationTracker.scala:43) ~[scala-library-2.13.8.jar:?]
	at scala.collection.mutable.CheckedIndexedSeqView$CheckedIterator.hasNext(CheckedIndexedSeqView.scala:47)...
```
> Have not looked in detail, but there are a bunch of other places where `externalAccums` is directly used from - are they also susceptible to these issues ? If...
> > It seems to be a small probability event @smcmullan-ncirl I wonder if there will be such a high frequency of failures when using Scala 2.12?
Yes. When running the new test suite, `ConcurrentModificationException` only occurs with Scala 2.13. `IndexOutOfBoundsException` or `NPE` may occur with Scala 2.12, but I did not encounter it in the...
@mridulm Compared with analyzing each scenario and using read-write locks, I think it may be simpler to change `externalAccums` to use a thread-safe data structure, for example `CopyOnWriteArrayList`. Do you...
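To illustrate the trade-off, here is a minimal, single-threaded sketch of the difference in iterator behavior. It is not Spark code: the class `AccumListSketch`, its helper methods, and the string elements are hypothetical stand-ins, and Java's `ArrayList` is used only as an analogue of a fail-fast mutable buffer. The point is that a `CopyOnWriteArrayList` iterator walks a snapshot, so an append during iteration does not throw.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical illustration only; names here do not exist in Spark.
class AccumListSketch {

    // Iterates over the list and appends one element mid-iteration,
    // imitating a registration that races with a read of the list.
    // Returns the number of elements the loop visited, or -1 if the
    // iterator detected the mutation and threw.
    static int iterateWhileAdding(List<String> accums) {
        int seen = 0;
        try {
            for (String ignored : accums) {
                if (seen == 0) {
                    accums.add("late-accum"); // mutation during iteration
                }
                seen++;
            }
        } catch (ConcurrentModificationException e) {
            return -1;
        }
        return seen;
    }

    // Fills the given list with two placeholder entries and returns it.
    static List<String> seed(List<String> target) {
        target.add("a");
        target.add("b");
        return target;
    }

    public static void main(String[] args) {
        // Fail-fast iterator: the second next() call notices the append.
        System.out.println(iterateWhileAdding(seed(new ArrayList<>())));            // prints -1
        // Snapshot iterator: visits the two original elements only.
        System.out.println(iterateWhileAdding(seed(new CopyOnWriteArrayList<>()))); // prints 2
    }
}
```

The cost of this design is that every write copies the backing array, so it suits lists that are read far more often than they are modified; whether that holds for `externalAccums` would need to be checked against the actual access pattern.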
@JoshRosen Yes, your analysis is very accurate. From the current stack, I can only infer that the following two methods may be racing (but I haven't found any conclusive evidence),...
> Thanks for the detailed analysis @JoshRosen - I agree with your analysis. I saw two cases where this could be happening - > > * test code or user...