opengrok
fail fast in indexParallel() on ExecutionException
If an index task fails in indexParallel() with InterruptedException or ExecutionException, the failure is detected only when the loop traversing the list of futures reaches the respective future and calls future.get():
https://github.com/oracle/opengrok/blob/9164e39b147e2cb3fed3d4200e0e4cf3d51910d1/opengrok-indexer/src/main/java/org/opengrok/indexer/index/IndexDatabase.java#L2003-L2006
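To make the delay concrete, here is a minimal stand-alone sketch of the pattern. It is not the OpenGrok code: the class and method names (LateFailureDemo, millisUntilFailureSeen) and the two toy tasks are made up for illustration. The second task fails immediately, yet the failure surfaces only after the slow first task has finished, because the loop is still blocked in get() on the first future.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class, not part of OpenGrok.
class LateFailureDemo {
    /**
     * Submits one slow task and one task that fails right away, then waits
     * for the futures in submission order.
     * @return milliseconds elapsed before the failure was observed, or -1 if none was
     */
    static long millisUntilFailureSeen(long slowTaskMillis) throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        Callable<Boolean> slow = () -> {
            Thread.sleep(slowTaskMillis);
            return true;
        };
        Callable<Boolean> failing = () -> {
            throw new IllegalStateException("boom");
        };

        long start = System.nanoTime();
        List<Future<Boolean>> futures = new ArrayList<>();
        futures.add(executor.submit(slow));
        futures.add(executor.submit(failing));
        try {
            // Sequential traversal: get() on the slow future blocks even though
            // the second future has already completed exceptionally.
            for (Future<Boolean> future : futures) {
                future.get();
            }
            return -1;
        } catch (ExecutionException e) {
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Calling LateFailureDemo.millisUntilFailureSeen(300) reports a delay of roughly the slow task's duration, even though the failing task was done almost immediately.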
Ideally, this should fail fast, i.e. the first future from the list that exhibits the exception should terminate the processing for the given index database.
Came across this when working on PR #4706.
Calling parallelizer.getIndexWorkExecutor().shutdownNow() in the catch block below the loop is probably not feasible, as it might impact the indexing of other projects. On the other hand, if the indexer fails with an OutOfMemoryError (surfaced as ExecutionException), then all bets are off.
Another idea is to use ExecutorCompletionService so that the already completed Future objects can be retrieved as they finish. Then the rest of the futures submitted in indexParallel() can be cancel()ed in case of ExecutionException.
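A rough sketch of that idea, under stated assumptions: FailFastSketch and runAll are hypothetical names, and the sketch builds a private ExecutorCompletionService over whatever executor it is handed, so it does not deal with a completion service shared between index databases. It only shows the take()/cancel() mechanics from java.util.concurrent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Hypothetical helper, not part of OpenGrok.
class FailFastSketch {
    /**
     * Runs all tasks on the given executor. On the first failure the
     * remaining futures are cancelled and the exception is rethrown.
     */
    static void runAll(ExecutorService executor, List<Callable<Boolean>> tasks)
            throws InterruptedException, ExecutionException {
        CompletionService<Boolean> completionService =
                new ExecutorCompletionService<>(executor);
        List<Future<Boolean>> futures = new ArrayList<>();
        for (Callable<Boolean> task : tasks) {
            futures.add(completionService.submit(task));
        }
        try {
            for (int i = 0; i < futures.size(); i++) {
                // take() returns futures in completion order, so a failed
                // task is seen as soon as it finishes, not when its turn
                // in the submission order comes up.
                completionService.take().get();
            }
        } catch (InterruptedException | ExecutionException e) {
            for (Future<Boolean> future : futures) {
                // Has no effect on futures that are already done.
                future.cancel(true);
            }
            throw e;
        }
    }
}
```

Note that cancel(true) only interrupts the worker thread; a task that ignores interruption keeps running until it finishes on its own.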
The only caveat is that multiple IndexDatabase#update() calls can run in parallel (from Indexer#doIndexerExecution()), while there is a single ExecutorCompletionService (instantiated in and retrieved via IndexParallelizer so that all indexing can be capped by the thread pool size) with a single queue of futures. It can therefore happen that take() returns futures which belong to a different IndexDatabase. Hence, the BlockingQueue and ExecutorCompletionService would need to be extended/reimplemented (plus related interfaces) to provide filtering based on a tag, e.g. take(tag).
Also, ExecutorCompletionService does not seem to have any of the shutdown methods that ExecutorService has, so that would need changes in IndexParallelizer#bounceIndexWorkExecutor().
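One possible shape for the take(tag) idea, as a sketch only: TaggedCompletionService is a hypothetical class that exists neither in the JDK nor in OpenGrok. Instead of filtering a single queue, it keeps one completion queue per tag on top of a shared Executor, using the FutureTask#done() hook that fires when a task completes, fails, or is cancelled. The tag could be, for instance, the IndexDatabase instance; that choice is an assumption.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.Executor;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical class: T is the tag type, V the task result type.
class TaggedCompletionService<T, V> {
    private final Executor executor;
    // One completion queue per tag. Queues are never removed in this sketch;
    // real code would need to drop a tag's queue once its work is finished.
    private final ConcurrentMap<T, BlockingQueue<Future<V>>> queues =
            new ConcurrentHashMap<>();

    TaggedCompletionService(Executor executor) {
        this.executor = executor;
    }

    private BlockingQueue<Future<V>> queueFor(T tag) {
        return queues.computeIfAbsent(tag, t -> new LinkedBlockingQueue<>());
    }

    /** Submits the task; once done, its future lands in the queue of the given tag. */
    Future<V> submit(T tag, Callable<V> task) {
        BlockingQueue<Future<V>> queue = queueFor(tag);
        FutureTask<V> futureTask = new FutureTask<V>(task) {
            @Override
            protected void done() {
                // Called on normal completion, failure, and cancellation.
                queue.add(this);
            }
        };
        executor.execute(futureTask);
        return futureTask;
    }

    /** Blocks until some task submitted under the given tag is done. */
    Future<V> take(T tag) throws InterruptedException {
        return queueFor(tag).take();
    }
}
```

Because the wrapper only needs an Executor, all tags still share one thread pool, and shutdown stays with the underlying ExecutorService rather than with the completion service.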