
Starvation situations with fixed thread pools

Open nrinaudo opened this issue 6 years ago • 21 comments

Parallel collections that use a fixed thread pool for task support and contain "too many" elements will deadlock, where "too many" is a threshold I haven't been able to quantify.

The simplest possible reproduction is the empty list with a thread pool of 1:

import scala.concurrent.ExecutionContext
import java.util.concurrent.Executors
import scala.collection.parallel.CollectionConverters._
import scala.collection.parallel.ExecutionContextTaskSupport

val col = List.empty[Int].par

col.tasksupport = new ExecutionContextTaskSupport(
  ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(1))
)

// This will deadlock
col.map(_ + 1)

I have observed the same starvation issue with larger thread pools and lists, but not reliably enough to provide a reproduction case.
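
For reference, the same code does not deadlock for me when the collection is backed by a ForkJoinPool rather than a fixed pool (a sketch, assuming the scala-parallel-collections module is on the classpath; ForkJoinTaskSupport is also the default task support):

```scala
import java.util.concurrent.ForkJoinPool
import scala.collection.parallel.CollectionConverters._
import scala.collection.parallel.ForkJoinTaskSupport

val col = List.empty[Int].par

// ForkJoinPool provides work-stealing and managed blocking, so the
// nested executeAndWaitResult call cannot starve the pool.
col.tasksupport = new ForkJoinTaskSupport(new ForkJoinPool(1))

// Completes even with parallelism 1
col.map(_ + 1)
```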

nrinaudo avatar Feb 05 '19 16:02 nrinaudo

@axel22 should https://docs.scala-lang.org/overviews/parallel-collections/configuration.html say something about this?

SethTisue avatar Feb 05 '19 16:02 SethTisue

Maybe we should add a recommendation to use only ForkJoinPools to that documentation.

axel22 avatar Feb 05 '19 16:02 axel22

Isn’t that hiding the symptoms, though? Sure, it’s best to use a ForkJoinPool, but we still have perfectly legal code that deadlocks.

nrinaudo avatar Feb 05 '19 17:02 nrinaudo

This seems to stem from the interaction between the implementation of scala.collection.parallel.FutureTasks and that of scala.collection.parallel.ParIterableLike.ResultMapping.

The issue is this:

  1. The FutureTasks logic first determines the parallelism based on the number of cores on the machine (in my case this was 6).

  2. It then creates a computation tree based on the max depth calculated by this: private val maxdepth = (math.log(parallelismLevel) / math.log(2) + 1).toInt

  3. This will create up to 2^maxdepth Futures (based on the number of elements in the collection). Each of these futures calls scala.collection.parallel.Task#tryLeaf.

  4. The tryLeaf method eventually calls through to scala.collection.parallel.Task#leaf. The implementation of leaf for ResultMapping scala.collection.parallel.ParIterableLike.ResultMapping#leaf makes a blocking call to tasksupport.executeAndWaitResult(inner)

  5. In the case of a single thread, this means that the thread that is waiting for the task result is blocked because there is no thread available to process the scala.collection.parallel.ParIterableLike.StrictSplitterCheckTask.

In theory, if the number of threads in your executor is at least 2 ^ ((log(parallelismLevel) / log(2) + 1).toInt) + 1, then it shouldn't deadlock... I think.
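
To make that bound concrete, here's a small sketch (minPoolSize is a hypothetical helper that just mirrors the maxdepth formula quoted in step 2):

```scala
// Mirrors `private val maxdepth` in scala.collection.parallel.FutureTasks.
def minPoolSize(parallelismLevel: Int): Int = {
  val maxdepth = (math.log(parallelismLevel) / math.log(2) + 1).toInt
  // Up to 2^maxdepth leaf futures may block in executeAndWaitResult,
  // so keep one extra thread free to run the inner task.
  (1 << maxdepth) + 1
}

// On the 6-core machine from step 1: maxdepth = 3, so 2^3 + 1 = 9 threads.
minPoolSize(6)
```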

Stack trace from an experiment.

java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
scala.concurrent.impl.Promise$DefaultPromise.tryAwait(Promise.scala:242)
scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:258)
scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:263)
scala.concurrent.Await$.$anonfun$result$1(package.scala:219)
scala.concurrent.Await$$$Lambda$1274/950805155.apply(Unknown Source)
scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:57)
scala.concurrent.Await$.result(package.scala:146)
scala.collection.parallel.FutureTasks.$anonfun$execute$3(Tasks.scala:513)
scala.collection.parallel.FutureTasks$$Lambda$1256/656809460.apply(Unknown Source)
scala.collection.parallel.FutureTasks.executeAndWaitResult(Tasks.scala:519)
scala.collection.parallel.ExecutionContextTasks.executeAndWaitResult(Tasks.scala:555)
scala.collection.parallel.ExecutionContextTasks.executeAndWaitResult$(Tasks.scala:555)
scala.collection.parallel.ExecutionContextTaskSupport.executeAndWaitResult(TaskSupport.scala:84)
scala.collection.parallel.ParIterableLike$ResultMapping.leaf(ParIterableLike.scala:960)
scala.collection.parallel.Task.$anonfun$tryLeaf$1(Tasks.scala:53)
scala.collection.parallel.Task$$Lambda$1257/1697135480.apply$mcV$sp(Unknown Source)
scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
scala.util.control.Breaks$$anon$1.catchBreak(Breaks.scala:67)
scala.collection.parallel.Task.tryLeaf(Tasks.scala:56)
scala.collection.parallel.Task.tryLeaf$(Tasks.scala:50)
scala.collection.parallel.ParIterableLike$ResultMapping.tryLeaf(ParIterableLike.scala:955)
scala.collection.parallel.FutureTasks.$anonfun$exec$5(Tasks.scala:499)
scala.collection.parallel.FutureTasks$$Lambda$1253/1941666034.apply(Unknown Source)
scala.concurrent.Future$.$anonfun$apply$1(Future.scala:658)
scala.collection.parallel.FutureTasks$$Lambda$1254/1629179184.apply(Unknown Source)
scala.util.Success.$anonfun$map$1(Try.scala:255)
scala.util.Success.map(Try.scala:213)
scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
scala.concurrent.Future$$Lambda$1205/1115425820.apply(Unknown Source)
scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
scala.concurrent.impl.Promise$$Lambda$1206/183285557.apply(Unknown Source)
scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)

javax-swing avatar Feb 06 '19 10:02 javax-swing

I've also encountered this problem. Perhaps the following observation will provide a hint to someone: in Scala 2.12, applying a map to some parallel vector worked in the REPL but consistently failed when the same command was inside an object defined in a script. The same map seems to consistently fail in 2.13 regardless of whether I run it in the REPL or the script.

amitainz avatar Mar 11 '20 20:03 amitainz

@javax-swing, @axel22 for me this renders the whole parallel collections module useless, so I'm quite surprised this thread seems to have been inactive for over a year - am I missing something? Is there some secret workaround? I'm really looking forward to using parallel collections in my code.

amitainz avatar Mar 11 '20 20:03 amitainz

@amitainz - You could try to transition to using Java 8 Streams. Although they don't support every possible collection, they generally perform quite well, and there are adapters for the major Scala collections. See https://www.scala-lang.org/api/current/scala/jdk/StreamConverters$.html
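
For example, here's a sketch against the Scala 2.13 standard library (Vector supports the efficient splitting that asJavaParStream requires):

```scala
import scala.jdk.StreamConverters._

// Convert a Scala Vector to a parallel Java stream and back. The work
// runs on the JDK's common ForkJoinPool, so no fixed thread pool is
// involved and the starvation issue above does not apply.
val squares = Vector(1, 2, 3, 4).asJavaParStream
  .map(i => i * i)
  .toScala(Vector)
// squares == Vector(1, 4, 9, 16)
```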

Ichoran avatar Mar 12 '20 04:03 Ichoran

@amitainz - You could try to transition to using Java 8 Streams. Although they don't support every possible collection, they generally perform quite well, and there are adapters for the major Scala collections. See https://www.scala-lang.org/api/current/scala/jdk/StreamConverters$.html

Thanks @Ichoran , I'll take a look.

amitainz avatar Mar 12 '20 07:03 amitainz

At least in Scala 2.12.10, I managed to make things work by making my object extend the App trait, so that when I run Snippets.main(Array[String]()) the code runs as expected. Note that if I try to access the ParVector instance inside Snippets and apply map to it in the sbt REPL, it hangs. Still, I thought this might be useful for other users / provide some clue for any developer who wants to fix this bug.

amitainz avatar Mar 16 '20 15:03 amitainz

To my chagrin, I too have discovered that code that worked fine in 2.10 and 2.11 for almost a decade hangs in Scala 2.12.

Should we completely abandon our code that uses parallel collections in our effort to get onto 2.12, and rewrite it to use plain Futures? Seems like yes.

(You might ask "why still on 2.12 in 2020?" Yes, well - it's issues like this - causing mysterious deadlocks that take down our application - that put our 2.12 upgrade effort on hold for months after we burn through our timeboxed effort to upgrade)

noahlz avatar Apr 19 '20 05:04 noahlz

@noahlz I'm a newbie to Scala, so probably should keep out of such strategic discussions. But for what it's worth, the parallel collections were the number one reason I decided to give scala a try. I mean, the language is elegant and expressive, sure - but I was looking for something that will actually save me worrying about the parallelization. The second reason was that it is reasonably fast (I got tired of looking for inventive numpy hacks to do what I want in python), but there are plenty of other languages in that category.

amitainz avatar Apr 19 '20 06:04 amitainz

Do we actually know whether @nrinaudo's original report is 2.12+-only?

I don't know for sure, but I suspect this ticket has become a grab bag of "parallel collections didn't work for me, in my code" reports that may or may not have anything to do with each other.

In particular, I suspect overlap with https://github.com/scala/bug/issues/8119

SethTisue avatar Apr 19 '20 15:04 SethTisue

@SethTisue - it certainly isn't just a problem in 2.12. I don't remember all the scala versions I've tried, but 2.13 definitely has this / a very similar problem, as well as some older versions. You're probably right that a thorough check of all the versions and mapping of this bug is called for, but it does seem that all these issues stem from the same type of deadlocking behavior.

I wouldn't categorize them as "parallel collections didn't work for me in my code". It's a very specific deadlocking issue that's very easy to reproduce, occurs in many versions and seems to be "well known" in the scala community. I don't think it makes sense to close the only (right?) active thread addressing this issue (I see the bug report on 2.13 that you referenced above was closed).

amitainz avatar Apr 19 '20 16:04 amitainz

it certainly isn't just a problem in 2.12. I don't remember all the scala versions I've tried, but 2.13 definitely has this / a very similar problem,

I know. That's why I said "2.12+", not just "2.12". There is a crucial dividing line between 2.11 and 2.12, because the lambda encoding changed. (There is also an important dividing line between 2.12 and 2.13, namely the collections overhaul, and the relocation of the parallel collections to an external module.)

It's a very specific deadlocking issue that's very easy to reproduce, occurs in many versions and seems to be "well known" in the scala community. I don't think it makes sense to close the only (right?) active thread addressing this issue (I see the bug report on 2.13 that you referenced above was closed).

The issue you seem to be referring to is scala/bug#8119 and there doesn't need to be more than one ticket on it.

That ticket is closed because it's already fixed for 2.13.2, which may ship as soon as this week. In the meantime, you can try out the fix in a nightly: https://stackoverflow.com/questions/40622878/how-do-i-tell-sbt-to-use-a-nightly-build-of-scala-2-12-or-2-13

This ticket — @nrinaudo's original report, above — is about a different problem.

It is important for all of us to be maximally precise about exactly which problems exist in exactly which Scala versions and how exactly those problems can be reproduced.

SethTisue avatar Apr 19 '20 16:04 SethTisue

👍 This is not a case of "parallel collections didn't work for me in my code". See also scala/bug#8955: this has come up before and seems not to be fixed.

Example code

 $ console
[info] Starting scala interpreter...
[info]
Welcome to Scala 2.12.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_191).
Type in expressions for evaluation. Or try :help.

scala> (1 to 1000).toList.par.foreach(i=> println(Thread.currentThread.getName + s" = $i"))

hangs

In 2.11

$ +console
Welcome to Scala 2.11.12 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_191).
Type in expressions for evaluation. Or try :help.

scala> (1 to 1000).toList.par.foreach(i=> println(Thread.currentThread.getName + s" = $i"))
ForkJoinPool-1-worker-7 = 251
ForkJoinPool-1-worker-13 = 1
ForkJoinPool-1-worker-13 = 2
ForkJoinPool-1-worker-13 = 3
ForkJoinPool-1-worker-13 = 4
... etc

Like I said, I'm usually met with disbelief when I say our team is still on 2.11, but it's things like this that hold us up, a real shame. Fortunately, we just have to rip out parallel collections from our code and replace them with a thread pool executor execution context + Futures.

noahlz avatar Apr 19 '20 16:04 noahlz

@noahlz https://github.com/scala/bug/issues/8955 was fixed a long, long time ago. if you're going to bring that up, you need to provide specific evidence that it isn't fixed

the transcript you've included demonstrates a different bug, namely https://github.com/scala/bug/issues/8119. if you try a 2.13 nightly, as I suggested above, you'll find that it is fixed there, for the upcoming 2.13.2 release

none of this has anything to do with this ticket, Nicolas's bug report about thread pools — at least, I don't think it does and nobody has presented evidence that it does

SethTisue avatar Apr 19 '20 16:04 SethTisue

@SethTisue thank you for clarifying! 👍

noahlz avatar Apr 20 '20 03:04 noahlz

Ok, I didn't realize that bug was fixed - and thanks for the nightly tip. (for now I have some setup that works as long as I don't try to work with parallel collections in the REPL, so I'll just wait for the official rollout).


amitainz avatar Apr 20 '20 11:04 amitainz

Not sure where to put this @SethTisue, but another "hint" for bug hunters: the same behavior occurs when working with java.util.concurrent.ConcurrentHashMap, but only when the parallelism level is actually > 1. I'm using scalaVersion := "2.12.10".

in the REPL (IntelliJ console) the last call, where there's actually supposed to be some parallelism, hangs:

scala> val x = new ConcurrentHashMap[Int,Int](10000000)
x: java.util.concurrent.ConcurrentHashMap[Int,Int] = {}

scala> myTimer(for (i <- 1 to 5000000) x.put(i,i*i), "filling") 
starting evaluation of filling...
done in 0.84584608s

scala> myTimer(x.forEach(100000000, (k,v) => (k*math.sin(v)*math.cos(k)).toInt), "test")
starting evaluation of test...
done in 4.043585024s

scala> myTimer(x.forEach(100000, (k,v) => (k*math.sin(v)*math.cos(k)).toInt), "test")
starting evaluation of test...

Very similar code wrapped inside a class extending App works (with a noticeable 3x speedup on my 4-core Mac for the parallel execution).

amitainz avatar Apr 21 '20 12:04 amitainz

yeah, https://github.com/scala/bug/issues/8119 isn't actually specific to the parallel collections, the parallel collections just happen to be the most common way that people run into it. your ConcurrentHashMap example is another, and it doesn't hang the 2.13.2 REPL.

SethTisue avatar Apr 21 '20 13:04 SethTisue

I have hidden many comments which turned out to be unrelated to the original bug report.

Before commenting on this ticket, please be very sure that what you are encountering is exactly the specific issue that Nicolas has identified, rather than just any starvation or deadlock issue.

SethTisue avatar Nov 14 '21 21:11 SethTisue