scala-parallel-collections
Starvation situations with fixed thread pools
Parallel collections that use a fixed thread pool for task support and contain "too many" elements will deadlock, where "too many" is a value I haven't been able to quantify.
The simplest possible reproduction is the empty list with a thread pool of 1:
import scala.concurrent.ExecutionContext
import java.util.concurrent.Executors
import scala.collection.parallel.CollectionConverters._
import scala.collection.parallel.ExecutionContextTaskSupport
val col = List.empty[Int].par
col.tasksupport = new ExecutionContextTaskSupport(
ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(1))
)
// This will deadlock
col.map(_ + 1)
I have observed the same starvation issue with larger thread pools and lists, but not reliably enough to provide a reproduction case.
@axel22 should https://docs.scala-lang.org/overviews/parallel-collections/configuration.html say something about this?
Maybe we should add a recommendation to use only ForkJoinPools to that documentation.
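For reference, the ForkJoinPool-based setup looks roughly like this (a sketch; ForkJoinTaskSupport is the module's standard alternative task support, and the pool size here is arbitrary):

import java.util.concurrent.ForkJoinPool
import scala.collection.parallel.CollectionConverters._
import scala.collection.parallel.ForkJoinTaskSupport

val col = (1 to 1000).toList.par
// The fork/join framework lets a worker help execute pending subtasks while
// joining, so nested waits don't exhaust a fixed set of threads.
col.tasksupport = new ForkJoinTaskSupport(new ForkJoinPool(4))
col.map(_ + 1)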
Isn't that hiding the symptoms, though? Sure, it's best to use a ForkJoinPool, but we still have perfectly legal code that deadlocks.
This seems to stem from the implementation of scala.collection.parallel.FutureTasks vs. the implementation of scala.collection.parallel.ParIterableLike.ResultMapping.
The issue is that:
- The FutureTasks logic first determines the parallelism based on the number of cores on the machine (in my case this was 6).
- It then creates a computation tree based on the max depth calculated by this:
  private val maxdepth = (math.log(parallelismLevel) / math.log(2) + 1).toInt
- This will create up to 2^maxdepth Futures (based on the number of elements in the collection). Each of these futures calls scala.collection.parallel.Task#tryLeaf.
- The tryLeaf method eventually calls through to scala.collection.parallel.Task#leaf. The implementation of leaf for ResultMapping (scala.collection.parallel.ParIterableLike.ResultMapping#leaf) makes a blocking call to tasksupport.executeAndWaitResult(inner).
- In the case of a single thread, this means that the thread waiting for the task result is blocked because there is no thread available to process the scala.collection.parallel.ParIterableLike.StrictSplitterCheckTask.
In theory, if the number of threads you define for your executor is at least
(2 ^ (log(parallelismLevel) / log(2) + 1).toInt) + 1
then it shouldn't get deadlocked... I think.
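A minimal sketch of that sizing rule (my own illustration, not verified against every collection size; it assumes the parallelism level equals the number of available cores, and maxdepth mirrors the FutureTasks calculation above):

import java.util.concurrent.Executors
import scala.concurrent.ExecutionContext
import scala.collection.parallel.CollectionConverters._
import scala.collection.parallel.ExecutionContextTaskSupport

val parallelismLevel = Runtime.getRuntime.availableProcessors()
// Same formula as FutureTasks.maxdepth
val maxdepth = (math.log(parallelismLevel) / math.log(2) + 1).toInt
// One worker per potential leaf future, plus one for the outer blocking call
val poolSize = math.pow(2, maxdepth).toInt + 1

val col = (1 to 1000).toList.par
col.tasksupport = new ExecutionContextTaskSupport(
  ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(poolSize))
)
col.map(_ + 1) // should not starve with this pool size, in theory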
Stack trace from an experiment.
java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
scala.concurrent.impl.Promise$DefaultPromise.tryAwait(Promise.scala:242)
scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:258)
scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:263)
scala.concurrent.Await$.$anonfun$result$1(package.scala:219)
scala.concurrent.Await$$$Lambda$1274/950805155.apply(Unknown Source)
scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:57)
scala.concurrent.Await$.result(package.scala:146)
scala.collection.parallel.FutureTasks.$anonfun$execute$3(Tasks.scala:513)
scala.collection.parallel.FutureTasks$$Lambda$1256/656809460.apply(Unknown Source)
scala.collection.parallel.FutureTasks.executeAndWaitResult(Tasks.scala:519)
scala.collection.parallel.ExecutionContextTasks.executeAndWaitResult(Tasks.scala:555)
scala.collection.parallel.ExecutionContextTasks.executeAndWaitResult$(Tasks.scala:555)
scala.collection.parallel.ExecutionContextTaskSupport.executeAndWaitResult(TaskSupport.scala:84)
scala.collection.parallel.ParIterableLike$ResultMapping.leaf(ParIterableLike.scala:960)
scala.collection.parallel.Task.$anonfun$tryLeaf$1(Tasks.scala:53)
scala.collection.parallel.Task$$Lambda$1257/1697135480.apply$mcV$sp(Unknown Source)
scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
scala.util.control.Breaks$$anon$1.catchBreak(Breaks.scala:67)
scala.collection.parallel.Task.tryLeaf(Tasks.scala:56)
scala.collection.parallel.Task.tryLeaf$(Tasks.scala:50)
scala.collection.parallel.ParIterableLike$ResultMapping.tryLeaf(ParIterableLike.scala:955)
scala.collection.parallel.FutureTasks.$anonfun$exec$5(Tasks.scala:499)
scala.collection.parallel.FutureTasks$$Lambda$1253/1941666034.apply(Unknown Source)
scala.concurrent.Future$.$anonfun$apply$1(Future.scala:658)
scala.collection.parallel.FutureTasks$$Lambda$1254/1629179184.apply(Unknown Source)
scala.util.Success.$anonfun$map$1(Try.scala:255)
scala.util.Success.map(Try.scala:213)
scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
scala.concurrent.Future$$Lambda$1205/1115425820.apply(Unknown Source)
scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
scala.concurrent.impl.Promise$$Lambda$1206/183285557.apply(Unknown Source)
scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
I've also encountered this problem. Perhaps the following observation will provide a hint to someone: in Scala 2.12 applying a map to some parallel vector worked in the REPL but consistently failed when the same command was inside an object defined in a script. The same map seems to consistently fail in 2.13 regardless of whether I try to run it in the REPL or the script.
@javax-swing, @axel22 for me it renders the whole parallel collections module useless, so I'm quite surprised this thread seems inactive for over a year. Am I missing something? Is there some secret workaround? I'm really looking forward to using parallel collections in my code.
@amitainz - You could try to transition to using Java 8 Streams. Although they don't support every possible collection, they generally perform quite well, and there are adapters for the major Scala collections. See https://www.scala-lang.org/api/current/scala/jdk/StreamConverters$.html
Thanks @Ichoran , I'll take a look.
At least in Scala 2.12.10, I managed to make things work by making my object extend the App trait, so that when I run Snippets.main(Array[String]()) the code runs as expected. Note that if I try to access the ParVector instance inside Snippets and apply map to it in the sbt REPL, it hangs. Still, I thought this might be useful for other users / provide some clue for any developer who wants to fix this bug.
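A minimal sketch of that workaround (Snippets and its ParVector contents are placeholder names based on the comment above, not the poster's actual code):

object Snippets extends App {
  // Scala 2.12.x: .par needs no extra import; on the 2.13 module also
  // import scala.collection.parallel.CollectionConverters._
  val v = Vector.tabulate(1000000)(identity).par
  val squared = v.map(i => i * i) // hangs when evaluated directly in the sbt REPL
  println(squared.sum)
}

// From the REPL or another entry point, running it explicitly works:
// Snippets.main(Array[String]())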
I too, to my chagrin, have discovered that code which worked famously in 2.10 and 2.11 for almost a decade hangs in Scala 2.12.
Should we completely abandon our code that uses parallel collections in our effort to get on 2.12, and re-write it to use simple Futures? Seems like yes.
(You might ask "why still on 2.12 in 2020?" Yes, well - it's issues like this - causing mysterious deadlocks that take down our application - that put our 2.12 upgrade effort on hold for months after we burn through our timeboxed effort to upgrade)
@noahlz I'm a newbie to Scala, so I probably should keep out of such strategic discussions. But for what it's worth, the parallel collections were the number one reason I decided to give Scala a try. I mean, the language is elegant and expressive, sure, but I was looking for something that would actually save me from worrying about the parallelization. The second reason was that it is reasonably fast (I got tired of looking for inventive numpy hacks to do what I want in Python), but there are plenty of other languages in that category.
Do we actually know whether @nrinaudo's original report is 2.12+-only?
I don't know for sure, but I suspect this ticket has become a grab bag of "parallel collections didn't work for me, in my code" reports that may or may not have anything to do with each other.
In particular, I suspect overlap with https://github.com/scala/bug/issues/8119
@SethTisue - it certainly isn't just a problem in 2.12. I don't remember all the scala versions I've tried, but 2.13 definitely has this / a very similar problem, as well as some older versions. You're probably right that a thorough check of all the versions and mapping of this bug is called for, but it does seem that all these issues stem from the same type of deadlocking behavior.
I wouldn't categorize them as "parallel collections didn't work for me in my code". It's a very specific deadlocking issue that's very easy to reproduce, occurs in many versions and seems to be "well known" in the scala community. I don't think it makes sense to close the only (right?) active thread addressing this issue (I see the bug report on 2.13 that you referenced above was closed).
it certainly isn't just a problem in 2.12. I don't remember all the scala versions I've tried, but 2.13 definitely has this / a very similar problem,
I know. That's why I said "2.12+", not just "2.12". There is a crucial dividing line between 2.11 and 2.12, because the lambda encoding changed. (There is also an important dividing line between 2.12 and 2.13, namely the collections overhaul, and the relocation of the parallel collections to an external module.)
It's a very specific deadlocking issue that's very easy to reproduce, occurs in many versions and seems to be "well known" in the scala community. I don't think it makes sense to close the only (right?) active thread addressing this issue (I see the bug report on 2.13 that you referenced above was closed).
The issue you seem to be referring to is scala/bug#8119 and there doesn't need to be more than one ticket on it.
That ticket is closed because it's already fixed for 2.13.2, which may ship as soon as this week. In the meantime, you can try out the fix in a nightly: https://stackoverflow.com/questions/40622878/how-do-i-tell-sbt-to-use-a-nightly-build-of-scala-2-12-or-2-13
This ticket — @nrinaudo's original report, above — is about a different problem.
It is important for all of us to be maximally precise about exactly which problems exist in exactly which Scala versions and how exactly those problems can be reproduced.
👍 This is not a case of "parallel collections didn't work for me in my code". See also scala/bug#8955; this has come up before and seems not fixed.
Example code
$ console
[info] Starting scala interpreter...
[info]
Welcome to Scala 2.12.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_191).
Type in expressions for evaluation. Or try :help.
scala> (1 to 1000).toList.par.foreach(i=> println(Thread.currentThread.getName + s" = $i"))
hangs
In 2.11
$ +console
Welcome to Scala 2.11.12 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_191).
Type in expressions for evaluation. Or try :help.
scala> (1 to 1000).toList.par.foreach(i=> println(Thread.currentThread.getName + s" = $i"))
ForkJoinPool-1-worker-7 = 251
ForkJoinPool-1-worker-13 = 1
ForkJoinPool-1-worker-13 = 2
ForkJoinPool-1-worker-13 = 3
ForkJoinPool-1-worker-13 = 4
... etc
Like I said, I'm usually met with disbelief when I say our team is still on 2.11, but it's things like this that hold us up; a real shame. Fortunately, we just have to rip the parallel collections out of our code and replace them with a thread pool executor execution context + Futures.
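For anyone doing the same migration, a minimal sketch of that shape (my own illustration with arbitrary pool size and timeout, not the poster's actual code):

import java.util.concurrent.Executors
import scala.concurrent.duration._
import scala.concurrent.{Await, ExecutionContext, Future}

// A fixed pool behind a plain ExecutionContext. The blocking Await happens on
// the calling thread, not on a pool thread, so the pool cannot starve the way
// the nested executeAndWaitResult calls in the parallel collections do.
implicit val ec: ExecutionContext =
  ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(8))

val items = (1 to 1000).toList
val work: Future[List[Int]] = Future.traverse(items)(i => Future(i + 1))
val results: List[Int] = Await.result(work, 1.minute)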
@noahlz https://github.com/scala/bug/issues/8955 was fixed a long, long time ago. if you're going to bring that up, you need to provide specific evidence that it isn't fixed
the transcript you've included demonstrates a different bug, namely https://github.com/scala/bug/issues/8119. if you try a 2.13 nightly, as I suggested above, you'll find that it is fixed there, for the upcoming 2.13.2 release
none of this has anything to do with this ticket, Nicolas's bug report about thread pools — at least, I don't think it does and nobody has presented evidence that it does
@SethTisue thank you for clarifying! 👍
Ok, I didn't realize that bug was fixed - and thanks for the nightly tip. (for now I have some setup that works as long as I don't try to work with parallel collections in the REPL, so I'll just wait for the official rollout).
Not sure where to put this @SethTisue, but another "hint" for bug hunters is that the same behavior is occurring when trying to work with java.util.concurrent.ConcurrentHashMap, but only when the parallelism level is actually > 1. I'm using scalaVersion := "2.12.10"
In the REPL (IntelliJ console), the last call, where there's actually supposed to be some parallelism, hangs:
scala> val x = new ConcurrentHashMap[Int,Int](10000000)
x: java.util.concurrent.ConcurrentHashMap[Int,Int] = {}
scala> myTimer(for (i <- 1 to 5000000) x.put(i,i*i), "filling")
starting evaluation of filling...
done in 0.84584608s
scala>
myTimer(x.forEach(100000000, (k,v) => (k*math.sin(v)*math.cos(k)).toInt), "test")
starting evaluation of test...
done in 4.043585024s
scala>
myTimer(x.forEach(100000, (k,v) => (k*math.sin(v)*math.cos(k)).toInt), "test")
starting evaluation of test...
Very similar code wrapped inside a class extending App works (with a noticeable 3x speedup on my 4-core Mac for the parallel execution).
Yeah, https://github.com/scala/bug/issues/8119 isn't actually specific to the parallel collections; the parallel collections just happen to be the most common way that people run into it. Your ConcurrentHashMap example is another, and it doesn't hang the 2.13.2 REPL.
I have hidden many comments which turned out to be unrelated to the original bug report.
Before commenting on this ticket, please be very sure that what you are encountering is exactly the specific issue that Nicolas has identified, rather than just any starvation or deadlock issue.