Avoid if-def branching for threading code
Description
Currently we use if-defs to switch between single-core and multi-core implementations of functions that implement parallelism. The if-def checks whether STAN_THREADS is defined at compile time. Since the TBB, if configured to do so, simply runs with 1 thread on a single core, we should drop the if-def and instead instruct C++ users how to set up the TBB task_arena that things run in so that it uses just 1 core.
Example
Code should be easier to maintain due to fewer if-defs.
Expected Output
No change in outputs.
Current Version:
v4.1.0
Does single thread code using TBB exhibit the same performance as sequential code without TBB?
There are two cases to discriminate:
- The user defines STAN_THREADS ... then the AD tape is thread local, for which you pay a bit of performance (quite platform dependent). In this case the user wants to use parallelism anyway.
- The user does not define STAN_THREADS ... in that case you gain a bit of performance because the AD tape is not thread local. When a user now calls a TBB-based function, that function may incur a minor performance penalty, which I doubt you could measure. However, if a user intends to go sequential, then they should not call the TBB-based functions anyway.
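The two cases above can be illustrated with a toy sketch of how a compile-time flag toggles thread-local storage. This is not Stan Math's actual AD tape; `ToyTape`, `TAPE_STORAGE`, `toy_tape`, and `record` are hypothetical names used only for illustration. The only assumption taken from the discussion is that defining STAN_THREADS makes the tape thread local.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical macro: thread_local storage only when STAN_THREADS is set.
#ifdef STAN_THREADS
#define TAPE_STORAGE thread_local
#else
#define TAPE_STORAGE
#endif

// Hypothetical stand-in for an AD tape.
struct ToyTape {
  std::vector<double> values;
};

// With STAN_THREADS each thread gets its own tape (small access cost);
// without it, a single process-wide tape is shared (slightly cheaper).
inline ToyTape& toy_tape() {
  static TAPE_STORAGE ToyTape instance;
  return instance;
}

// Appends a value to the calling thread's tape and returns the new size.
inline std::size_t record(double v) {
  toy_tape().values.push_back(v);
  return toy_tape().values.size();
}
```

Note that this one small if-def concerns only the storage of the tape; the functions that use the tape, like `record` here, contain no branching on STAN_THREADS.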
So I think we are safe to simplify our code.