Add parallelismFactorHint to Spawn[F]

Open bpholt opened this issue 11 months ago • 4 comments

Someone asked in Discord if there is a way to ask the Cats Effect runtime how many CPUs/compute threads it has access to, so that they can set parallelism factors appropriately. It's available in IOApp as computeWorkerThreadCount, but this is specific to IOApp and therefore not easily usable from all the places in cats.effect.std that could benefit from it.

Once it's added to Spawn, new methods or overrides should be added setting default parallelism factors accordingly. For example, in addition to Random.scalaUtilRandomN, there should be a variant that sets N to the computeWorkerThreadCount.

There was some discussion of this in Discord, which I will attempt to summarize:

  1. Daniel originally suggested it as concurrencyFactorHint: Option[Int] (or parallelismFactorHint: Option[Int] in the interest of consistent terminology).
  2. Arman suggested avoiding the Option box by defaulting to 0, to which Daniel "didn't totally object," because "technically anything ≤ 0 is semantically invalid anyway" and "if we go with 0 as the default then the fallback could be to tap the runtime anyway"
  3. There was also some discussion about whether this should be F[Int] to reflect the reality that the number can change in some circumstances, but that opens up quite a rabbit hole, so it may not be worth it? If F[Int] is used, should there be some kind of notification protocol to let data structures optimized for a given value rebalance themselves if the value changes?
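
To make the three options above concrete, here is a minimal sketch of the candidate shapes. None of these traits or members exist in Cats Effect; `HintSketch`, `HintAsOption`, `HintAsInt`, `HintAsEffect`, and `resolve` are hypothetical names used only for illustration. The fallback in `resolve` uses the JVM's `availableProcessors()` purely as a stand-in for "tapping the runtime"; what the real fallback would consult is not settled in the discussion.

```scala
object HintSketch {
  // Shape 1 (hypothetical): boxed; None means "no hint available".
  trait HintAsOption { def parallelismFactorHint: Option[Int] }

  // Shape 2 (hypothetical): unboxed; any value <= 0 means "no hint available".
  trait HintAsInt { def parallelismFactorHint: Int }

  // Shape 3 (hypothetical): effectful; the value may differ between evaluations.
  trait HintAsEffect[F[_]] { def parallelismFactorHint: F[Int] }

  // Assumption: for shape 2, a non-positive hint falls back to the JVM's
  // processor count. This is a stand-in, not the actual runtime lookup.
  def resolve(hint: Int): Int =
    if (hint > 0) hint
    else math.max(1, Runtime.getRuntime.availableProcessors())
}
```

With shape 2, callers never unwrap an Option, but every consumer has to remember to route the value through something like `resolve` before using it as a size.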

bpholt avatar Dec 12 '24 17:12 bpholt

For Spawn[IO], would this return availableProcessors() or the size of the WSTP? (Because the two are not necessarily the same.)

durban avatar Dec 12 '24 21:12 durban

The latter. The idea here is that the hint would help the user build downstream data structures which have striping strategies which are sensitive to the maximum true parallelism. Dispatcher and Random are two decent examples within std.
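
As a rough illustration of what "striping sensitive to the maximum true parallelism" can mean, here is a sketch of a striped counter whose stripe count would come from such a hint. This is plain JVM code, not how Dispatcher or Random are implemented; `StripingSketch` and `StripedCounter` are hypothetical names, and picking a stripe from the current thread's hash code is an assumption made for brevity.

```scala
import java.util.concurrent.atomic.AtomicLongArray

object StripingSketch {
  // `stripes` is where a parallelism hint would be plugged in (hypothetical).
  final class StripedCounter(stripes: Int) {
    require(stripes > 0, "stripes must be positive")

    private val cells = new AtomicLongArray(stripes)

    // Each thread updates the cell selected by its hash code, so threads
    // that map to different cells do not contend with each other.
    def increment(): Unit = {
      val idx = java.lang.Math.floorMod(Thread.currentThread().hashCode(), stripes)
      cells.incrementAndGet(idx)
      ()
    }

    // Reading sums all cells; the result is exact once writers are quiescent.
    def sum: Long = (0 until stripes).map(i => cells.get(i)).sum
  }
}
```

Too few stripes means threads collide on the same cell; too many wastes memory and slows reads, which is why a hint tied to the actual compute pool size is more useful here than a hard-coded constant.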

djspiewak avatar Dec 12 '24 22:12 djspiewak

In that case, I can't really see how to implement it as a : Int... return the size of which WSTP? There might not even be one. (Whereas if we're doing it as a : F[Int], we'd use the one we're running on, although that could still be a non-WSTP Executor.)

durban avatar Dec 12 '24 22:12 durban

This is a good point. I think this probably needs to be F[Int], and we should just blindly use the runtime that we're running on rather than trying to account for evalOn.

djspiewak avatar Dec 16 '24 19:12 djspiewak