Add Scheduler thread pool size configuration to configurable variable

Open HindujaB opened this issue 4 years ago • 11 comments

Description: Proposed by @sanjiva @malithjayasinghe

Currently, the maximum thread pool size for the runtime is configured through the environment variable BALLERINA_MAX_POOL_SIZE.

Since such configurations can now be supported through configurable variables, we can define a configurable variable and supply its value via Config.toml.

Related Issues: https://github.com/ballerina-platform/ballerina-lang/issues/29729

HindujaB avatar May 03 '21 05:05 HindujaB

I propose adding a configurable variable named STRAND_THREAD_POOL_SIZE to the lang.runtime module. https://github.com/ballerina-platform/ballerina-lang/pull/30153

HindujaB avatar May 03 '21 05:05 HindujaB

@HindujaB What's in lang.runtime is determined by the spec. Please don't unilaterally add things. In my view, implementation-dependent things like this do not belong in langlib.

jclark avatar May 03 '21 06:05 jclark

I think the best place to move this is ballerina/jballerina.java, or a new module, ballerina/jballerina.runtime.

hasithaa avatar May 03 '21 06:05 hasithaa

Won't we need something similar on the nBallerina side as well? The spec already talks about threads:

Each thread is divided into one or more strands.

In that case, I think it makes sense for the spec to define an upper bound on thread count. WDYT?

manuranga avatar May 03 '21 07:05 manuranga

@manuranga, even though the spec talks about threads, language semantics are explained using strands. Right now we have the built-in strand annotation [1], which is JBallerina-specific and relates threads to strands in the implementation. This discussion raises the question of whether we should move those to a JBallerina-specific module.

[1] https://github.com/ballerina-platform/ballerina-lang/blob/master/langlib/lang.annotations/src/main/ballerina/annotations.bal#L80

hasithaa avatar May 03 '21 08:05 hasithaa

The language does not have a concept of thread pools.

jclark avatar May 03 '21 09:05 jclark

OK, I can see how POOL can be misleading; for some reason, the phrase "thread pool" is strongly associated with Java.

Yet if we use wording like MAX_THREAD_COUNT, I don't think it's out of place for the spec to talk about this, given that it already specifies that strands are mapped to threads, and it can be generalized to both implementations.

This way we also relax the requirement on the implementation: it does not have to match the exact count, or retain and reuse threads. We are just setting an upper bound.

manuranga avatar May 03 '21 09:05 manuranga

The language spec tries to avoid unnecessarily constraining implementation choices. It's a perfectly valid implementation choice not to allow the user to explicitly constrain the maximum number of threads or to allow the user to constrain thread usage other than by specifying a fixed maximum number. The spec currently has nothing like this.

jclark avatar May 03 '21 10:05 jclark

What we want to achieve here is something similar to the GOMAXPROCS env variable in Go.

https://golang.org/pkg/runtime/#GOMAXPROCS
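For readers unfamiliar with the Go mechanism referenced here, the following is a minimal sketch of how GOMAXPROCS can be queried and set from code. The helper names `effectiveParallelism` and `setParallelism` are hypothetical and exist only for illustration; `runtime.GOMAXPROCS` and `runtime.NumCPU` are the standard Go calls. It says nothing about how Ballerina would expose an equivalent setting.

```go
package main

import (
	"fmt"
	"runtime"
)

// effectiveParallelism reports the current GOMAXPROCS value.
// Passing a value below 1 to runtime.GOMAXPROCS only queries the setting.
func effectiveParallelism() int {
	return runtime.GOMAXPROCS(0)
}

// setParallelism changes GOMAXPROCS to n and returns the previous value.
func setParallelism(n int) int {
	return runtime.GOMAXPROCS(n)
}

func main() {
	fmt.Println("logical CPUs:", runtime.NumCPU())
	fmt.Println("GOMAXPROCS:", effectiveParallelism())

	// Illustrative value only: cap simultaneously executing threads at 2.
	prev := setParallelism(2)
	fmt.Println("previous:", prev, "now:", effectiveParallelism())
}
```

The same value can also be supplied from outside the program through the GOMAXPROCS environment variable, which is the closest analogue to the existing BALLERINA_MAX_POOL_SIZE approach.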

We should find a solution that works for both jBallerina and nBallerina here.

sameerajayasoma avatar May 04 '21 00:05 sameerajayasoma

I think by default you want the runtime not to have more OS threads running than there are hardware threads, so as to avoid unnecessary context switching, which is roughly what Go does. I think we need to examine the use cases where this default isn't giving satisfactory behaviour (is it too high? too low?), and figure out what it makes sense for the user to specify in order to get behaviour that meets their needs.

jclark avatar May 04 '21 01:05 jclark

We discussed this today, and one thing that emerged is that we should distinguish between:

  1. the number of threads running, and
  2. the number of threads blocked in system calls

Typically (1) should be constrained so that it is no greater than the number of hardware threads (so as to reduce OS context switches), whereas (2) is affected more by memory (used by the stacks of blocked threads) and OS-imposed limits on the number of OS threads (which differ between OSs). @sameerajayasoma mentioned that the configurable parameter is needed at the moment to address (2).
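As a point of comparison, Go exposes these two limits through separate knobs, which may help frame what a Ballerina setting would need to cover. The sketch below is illustrative only: the `threadLimits` type and `applyLimits` function are hypothetical names, and the value 2000 is arbitrary. `runtime.GOMAXPROCS` bounds the threads executing Go code simultaneously (threads blocked in system calls do not count against it), while `debug.SetMaxThreads` bounds the total number of OS threads the program may use.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// threadLimits groups the two separate bounds distinguished above.
type threadLimits struct {
	running int // (1) threads executing user code at the same time
	total   int // (2) cap on all OS threads, including those blocked in system calls
}

// applyLimits installs both limits and returns the previous values.
func applyLimits(l threadLimits) threadLimits {
	return threadLimits{
		running: runtime.GOMAXPROCS(l.running),
		total:   debug.SetMaxThreads(l.total),
	}
}

func main() {
	// Assumption for illustration: one running thread per logical CPU,
	// and an arbitrary total cap of 2000 OS threads.
	prev := applyLimits(threadLimits{running: runtime.NumCPU(), total: 2000})
	fmt.Printf("previous limits: running=%d total=%d\n", prev.running, prev.total)
}
```

Note that Go treats the total cap as a safety limit: a program that tries to exceed it crashes rather than waiting, so it is not a drop-in model for a pool-size setting that queues work.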

jclark avatar May 04 '21 05:05 jclark