
MongoDB and Solr Service sinks may retain some records under some conditions

Open mathieu-rossignol opened this issue 7 years ago • 1 comments

In the MongoDB and Solr services, we call the take() method on the BlockingQueue to wait for incoming records. This can leave some records unflushed. Consider the following scenario: within a period shorter than the flush interval (at service startup or just after a batch has been flushed), a bulkput processor sends a few records, fewer than the batch size, to the service. If no further records arrive, the blocking take() call does not return, so those few records stay pending (in the prepared query, waiting for processing) until new records arrive.

Proposed fix: use the poll() method, which supports a timeout (as is already done, for instance, in the Chronix service).
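To illustrate the idea, here is a minimal sketch of a drain loop built on `BlockingQueue.poll(timeout, unit)`. It is not the actual MongoDBUpdater or SolrUpdater code: the class name `PollingUpdaterSketch`, the `drain` method, its parameters, and the use of `String` records are all hypothetical, and the "flush" simply collects batches in a list instead of writing to a sink.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not the logisland implementation.
class PollingUpdaterSketch {

    /**
     * Drains the queue in batches and returns every batch that was "flushed".
     * A batch is flushed when it is full, when the flush interval has elapsed,
     * or when poll() times out while records are still pending.
     */
    static List<List<String>> drain(BlockingQueue<String> queue,
                                    int batchSize,
                                    long flushIntervalMs) throws InterruptedException {
        List<List<String>> flushed = new ArrayList<>();
        List<String> batch = new ArrayList<>();
        long lastFlush = System.nanoTime();

        while (true) {
            // Unlike take(), poll() returns null once the timeout expires,
            // which gives the loop a chance to flush a partial batch.
            String record = queue.poll(flushIntervalMs, TimeUnit.MILLISECONDS);
            if (record != null) {
                batch.add(record);
            }

            long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - lastFlush);
            boolean batchFull = batch.size() >= batchSize;
            boolean intervalReached = record == null || elapsedMs >= flushIntervalMs;

            if (batchFull || (!batch.isEmpty() && intervalReached)) {
                flushed.add(new ArrayList<>(batch)); // stand-in for the real sink write
                batch.clear();
                lastFlush = System.nanoTime();
            }

            // Demo only: stop after one idle interval. A real service thread
            // would keep looping until it is asked to stop.
            if (record == null) {
                break;
            }
        }
        return flushed;
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.add("r1");
        queue.add("r2");
        queue.add("r3");
        // Three records, batch size of five: with take() these would stay pending.
        System.out.println(drain(queue, 5, 50));
    }
}
```

With take() in place of poll(), the loop would block on the fourth call and the three pending records would never reach the flush branch.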

MongoDB: https://github.com/Hurence/logisland/blob/develop/logisland-services/logisland-mongodb/logisland-mongodb-client-service/src/main/java/com/hurence/logisland/service/mongodb/MongoDBUpdater.java#L81

Solr: https://github.com/Hurence/logisland/blob/develop/logisland-services/logisland-solr-client-service/logisland-solr-client-service-api/src/main/java/com/hurence/logisland/service/solr/api/SolrUpdater.java#L52

Chronix: https://github.com/Hurence/logisland/blob/develop/logisland-services/logisland-solr-client-service/logisland-solr_6_4_2-chronix-client-service/src/main/java/com/hurence/logisland/service/solr/ChronixUpdater.java#L82

mathieu-rossignol avatar Aug 29 '18 15:08 mathieu-rossignol

MongoDB: #430 fixes the issue

amarziali avatar Nov 09 '18 15:11 amarziali