imagej-ops

Use the Context's recommended threading mechanism

Open ctrueden opened this issue 5 years ago • 2 comments

@bnorthan writes:

I just noticed that some ops set the number of threads using Runtime.getRuntime(). Would it be better for every Op to use the threading service and get the number of available threads from that? For example, dilate.

All ops should use the ThreadService, and in particular should not use Runtime.getRuntime().availableProcessors(), when deciding how many threads to spawn. One challenge to this approach is that ThreadService does not currently have any sort of recommendedThreadCount() method. So for situations like dilate above, it is unclear how many threads to specify when delegating to the underlying algorithm implementation. Perhaps we should add such a method to the ThreadService. But ideally, every underlying algorithm implementation would accept an ExecutorService parameter for use in spawning threads, so that the ThreadService's corresponding ExecutorService could be passed directly.

ctrueden, Mar 20 '19 15:03
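
To make the last suggestion in the comment above concrete, here is a minimal sketch of an algorithm that takes an ExecutorService from its caller instead of sizing its own thread pool from Runtime.getRuntime().availableProcessors(). The class ChunkedSum and its sum method are hypothetical illustrations and are not part of imagej-ops or imglib2; a plain JDK fixed thread pool stands in for whatever executor the ThreadService would supply.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class, not part of imagej-ops or imglib2.
final class ChunkedSum {

    // Sums the array in up to numChunks pieces, running each piece on the
    // executor supplied by the caller. The algorithm never creates threads
    // itself and never asks the runtime how many processors exist.
    static long sum(final int[] data, final int numChunks, final ExecutorService es)
        throws InterruptedException, ExecutionException
    {
        final int chunkSize = (data.length + numChunks - 1) / numChunks;
        final List<Future<Long>> futures = new ArrayList<>();
        for (int start = 0; start < data.length; start += chunkSize) {
            final int from = start;
            final int to = Math.min(data.length, start + chunkSize);
            final Callable<Long> task = () -> {
                long partial = 0;
                for (int i = from; i < to; i++) partial += data[i];
                return partial;
            };
            futures.add(es.submit(task));
        }
        // Collect the partial results in submission order.
        long total = 0;
        for (final Future<Long> f : futures) total += f.get();
        return total;
    }

    public static void main(final String[] args) throws Exception {
        // Stand-in for an executor obtained from the application's threading service.
        final ExecutorService es = Executors.newFixedThreadPool(4);
        try {
            final int[] data = new int[1000];
            Arrays.fill(data, 1);
            System.out.println(sum(data, 8, es));
        } finally {
            es.shutdown();
        }
    }
}
```

With this shape, the caller decides how many threads exist and the algorithm only decides how to split the work, so no thread count has to be passed down at all.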

Hi @ctrueden

The common scenario I've run into is optimizing programs that have both high-level parallelism (i.e., frames in an image can be processed in parallel) and low-level parallelism (individual algorithms are implemented in parallel). MKL provides a few functions to control the number of threads used.

In particular, it has mkl_set_num_threads() and mkl_set_num_threads_local(), which allow setting the number of OMP threads at the application level or at the thread level.

So it would be useful if threads themselves had their own ThreadService and ExecutorService, so you could control how many sub-threads each thread can start.

bnorthan, Mar 21 '19 10:03

Thanks @bnorthan. See also imglib/imglib2-algorithm#81. @tpietzsch suggests:

ForkJoinPool supports work-stealing, which would be important if submitted tasks spawn new subtasks and wait for their completion. This allows handing down the pool through algorithms that parallelize in chunks and, for each chunk, call another algorithm that parallelizes internally. (Handing down an ExecutorService wouldn't work for that.)

ctrueden, Mar 21 '19 13:03
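
The nested case described in the quote above can be sketched with the JDK's fork/join classes. This is an illustration under assumptions, not code from imglib2-algorithm: the class and method names (NestedForkJoin, FrameSums, RangeSum, sumAll) are made up for the example. An outer task forks one task per frame, and each per-frame task splits its own range and waits for its subtasks, all on a single shared pool. Because a worker that calls join() can run other pending tasks instead of just blocking, this completes even with a small pool. With a fixed-size thread pool, tasks that block on subtasks submitted to the same pool can occupy every thread and deadlock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical example class, not part of imglib2-algorithm.
final class NestedForkJoin {

    // Inner algorithm: sums data[from, to) by splitting the range in half
    // until the pieces are small, then waits for the forked half.
    static final class RangeSum extends RecursiveTask<Long> {
        private final int[] data;
        private final int from;
        private final int to;

        RangeSum(final int[] data, final int from, final int to) {
            this.data = data;
            this.from = from;
            this.to = to;
        }

        @Override
        protected Long compute() {
            if (to - from <= 64) {
                long s = 0;
                for (int i = from; i < to; i++) s += data[i];
                return s;
            }
            final int mid = (from + to) >>> 1;
            final RangeSum left = new RangeSum(data, from, mid);
            left.fork(); // schedule the left half on the same pool
            final long right = new RangeSum(data, mid, to).compute();
            return right + left.join(); // wait for the forked subtask
        }
    }

    // Outer algorithm: one task per frame, each of which parallelizes internally.
    static final class FrameSums extends RecursiveTask<Long> {
        private final int[][] frames;

        FrameSums(final int[][] frames) {
            this.frames = frames;
        }

        @Override
        protected Long compute() {
            final List<RangeSum> tasks = new ArrayList<>();
            for (final int[] frame : frames) {
                final RangeSum t = new RangeSum(frame, 0, frame.length);
                t.fork();
                tasks.add(t);
            }
            long total = 0;
            for (final RangeSum t : tasks) total += t.join();
            return total;
        }
    }

    // The caller hands down one pool; both levels of parallelism share it.
    static long sumAll(final int[][] frames, final ForkJoinPool pool) {
        return pool.invoke(new FrameSums(frames));
    }
}
```

A caller would create the pool once, for example new ForkJoinPool(2), and pass it to sumAll; the pool's parallelism then bounds the worker threads used by both the per-frame and the per-range level.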