Bulk-annotation action via search sidebar breaks when more than 100 documents are affected

Open reckart opened this issue 1 year ago • 2 comments

Describe the bug

Bulk-annotation action via search sidebar breaks when more than 100 documents are affected. The reason is that for every affected document, an IndexAnnotationDocumentTask is scheduled and the scheduler queue by default has a maximum length of 100.

To Reproduce

Steps to reproduce the behavior:

  1. Import 200 documents
  2. Perform a bulk-annotation action that affects all documents
  3. See error in log.

Expected behavior

No error. Maybe we need to increase the queue length by default... or implement a BulkIndexAnnotationDocumentTask to alleviate the problem...

Screenshots

2024-04-16 10:18:49 ERROR [SYSTEM] TransactionSynchronizationUtils - TransactionSynchronization.afterCompletion threw exception
java.util.concurrent.RejectedExecutionException: Task IndexAnnotationDocumentTask [project=..., sourceDocument=null, annotationDocument=....xmi, trigger=afterAnnotationUpdate] rejected from de.tudarmstadt.ukp.inception.scheduling.InspectableThreadPoolExecutor@33ab6975[Running, pool size = 4, active threads = 4, queued tasks = 100, completed tasks = 10947]
	at java.base/java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2065) ~[?:?]
	at java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:833) ~[?:?]
	at java.base/java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1365) ~[?:?]
	at de.tudarmstadt.ukp.inception.scheduling.SchedulingServiceImpl.schedule(SchedulingServiceImpl.java:303) ~[inception-scheduling-32.0-SNAPSHOT.jar!/:?]
	at de.tudarmstadt.ukp.inception.scheduling.SchedulingServiceImpl.enqueue(SchedulingServiceImpl.java:260) ~[inception-scheduling-32.0-SNAPSHOT.jar!/:?]
	at de.tudarmstadt.ukp.inception.search.SearchServiceImpl.enqueue(SearchServiceImpl.java:887) ~[inception-search-core-32.0-SNAPSHOT.jar!/:?]
	at de.tudarmstadt.ukp.inception.search.SearchServiceImpl.enqueueIndexDocument(SearchServiceImpl.java:868) ~[inception-search-core-32.0-SNAPSHOT.jar!/:?]
	at de.tudarmstadt.ukp.inception.search.SearchServiceImpl.afterAnnotationUpdate(SearchServiceImpl.java:423) ~[inception-search-core-32.0-SNAPSHOT.jar!/:?]
	at de.tudarmstadt.ukp.inception.search.SearchServiceImpl$$FastClassBySpringCGLIB$$d6146f50.invoke(<generated>) ~[inception-search-core-32.0-SNAPSHOT.jar!/:?]
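
The rejection in the trace above can be reproduced in isolation with a plain java.util.concurrent.ThreadPoolExecutor. This is only a minimal sketch, not INCEpTION's actual scheduler code: the pool size of 4 and the queue length of 100 are taken from the log message, while the class name QueueOverflowDemo, the use of ArrayBlockingQueue and the blocking dummy tasks are assumptions made for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the scheduler; not part of INCEpTION.
class QueueOverflowDemo {

    // Submits `tasks` blocking tasks to a 4-thread pool backed by a bounded
    // queue and returns how many submissions were rejected.
    static int countRejected(int queueSize, int tasks) {
        CountDownLatch release = new CountDownLatch(1);
        // No handler is passed, so the default AbortPolicy applies, which is
        // the policy that appears in the stack trace.
        ThreadPoolExecutor executor = new ThreadPoolExecutor(4, 4, 0L,
                TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(queueSize));
        int rejected = 0;
        try {
            for (int i = 0; i < tasks; i++) {
                try {
                    // Each task blocks until released, so workers stay busy
                    // and the queue cannot drain while we are submitting.
                    executor.execute(() -> {
                        try {
                            release.await();
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++;
                }
            }
        } finally {
            release.countDown();
            executor.shutdown();
        }
        return rejected;
    }

    public static void main(String[] args) {
        // 200 submitted, 4 running, 100 queued, so 96 are rejected.
        System.out.println(countRejected(100, 200));
        // With a queue of 2000, all 200 tasks fit and none are rejected.
        System.out.println(countRejected(2000, 200));
    }
}
```

With one task per affected document, any bulk action touching more documents than there are free queue slots and idle workers runs into the same exception.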

Please complete the following information:

  • Version and build ID: 32.0-beta-1

reckart avatar Apr 16 '24 09:04 reckart

This should probably be fixed by allowing scheduled indexing tasks to aggregate. E.g. if an IndexAnnotationDocumentTask is enqueued and a new one comes in for the same project and user, then the additional document should be added to that queued task.
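
A rough sketch of what such aggregation could look like, independent of the real scheduling service. Everything here is hypothetical: the class AggregatingIndexQueue, its methods and the use of plain strings for project, user and document are invented for illustration and do not correspond to existing INCEpTION APIs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: keeps at most one pending indexing task per
// project/user pair and merges additional documents into it.
class AggregatingIndexQueue {

    // Key is the (project, user) pair; value is the set of documents the
    // pending task for that pair still has to index.
    private final Map<List<String>, Set<String>> pending = new LinkedHashMap<>();

    // Returns true if a new task had to be queued, false if the document
    // was merged into a task that was already waiting.
    synchronized boolean enqueue(String project, String user, String document) {
        List<String> key = List.of(project, user);
        Set<String> documents = pending.get(key);
        boolean created = documents == null;
        if (created) {
            documents = new LinkedHashSet<>();
            pending.put(key, documents);
        }
        documents.add(document);
        return created;
    }

    // Number of queued tasks, which is what counts against the queue limit.
    synchronized int queuedTasks() {
        return pending.size();
    }

    // Removes the pending task for the pair and returns its documents in
    // insertion order; returns an empty list if nothing is pending.
    synchronized List<String> drain(String project, String user) {
        Set<String> documents = pending.remove(List.of(project, user));
        return documents == null ? new ArrayList<>() : new ArrayList<>(documents);
    }
}
```

Under this scheme, a bulk action touching 200 documents of one project by one user would occupy a single queue slot instead of 200.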

This aggregation mechanism is a bit more involved and requires some structural changes to the scheduling service, so we need to push this to a feature release.

reckart avatar Apr 30 '24 05:04 reckart

The workaround for people hitting this problem is to increase the scheduler queue size by adding e.g. the following line to the settings.properties file:

inception.scheduling.queue-size=2000

reckart avatar Apr 30 '24 05:04 reckart