elastic2-doc-manager
ES bulk method incomplete execution when errors
I mentioned this issue in https://github.com/mongodb-labs/mongo-connector/issues/446, but since it relates to the https://github.com/mongodb-labs/elastic2-doc-manager repository, I am raising it here as well. When a bulk request contains failing operations, the bulk method stops at the first failing operation and raises an error, even though there are other operations left to process. This leads to inconsistency between MongoDB and ES.
The steps to reproduce this are:
- Start a full dump of MongoDB.
- While the dump is ongoing, delete some documents from MongoDB.
- Because mongo-connector had not yet dumped these documents when they were deleted, they are never added to ES.
- After the dump, mongo-connector goes through the oplog and tries to delete these documents, even though they were never added to ES.
To solve this, I propose the following approach:
When the elasticsearch.helpers.bulk or elasticsearch.helpers.streaming_bulk methods are called, set 'raise_on_error' and 'raise_on_exception' to False.
Example:
kw['raise_on_error'] = False
kw['raise_on_exception'] = False
successes, errors = bulk(self.elastic, action_buffer, **kw)
This way, all errors from ES bulk operations are returned and can be logged, but the method call is not interrupted by an exception.
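To make the idea concrete, here is a minimal sketch of a wrapper around the bulk call. The names bulk_without_raising and fake_bulk are hypothetical and not part of elastic2-doc-manager. fake_bulk is only a rough in-memory stand-in for elasticsearch.helpers.bulk, so the snippet runs without a cluster; its error format merely approximates what ES returns. In real code you would pass elasticsearch.helpers.bulk and an Elasticsearch client instead.

```python
import logging

LOG = logging.getLogger(__name__)


def bulk_without_raising(bulk_fn, client, actions, **kw):
    """Call a bulk helper so that failed operations are logged, not raised.

    bulk_fn is assumed to behave like elasticsearch.helpers.bulk: it is
    called as bulk_fn(client, actions, **kw) and returns (successes, errors).
    """
    # Collect per-operation failures in the returned list instead of raising.
    kw["raise_on_error"] = False
    # Do not re-raise exceptions coming from the bulk call itself.
    kw["raise_on_exception"] = False
    successes, errors = bulk_fn(client, actions, **kw)
    for error in errors:
        LOG.error("Bulk operation failed: %r", error)
    return successes, errors


def fake_bulk(client, actions, **kw):
    """Hypothetical stand-in for elasticsearch.helpers.bulk.

    'client' is a plain set of document ids. Deleting a missing id fails,
    and aborts the run unless raise_on_error is False.
    """
    successes, errors = 0, []
    for action in actions:
        doc_id = action["_id"]
        if action["_op_type"] == "delete":
            if doc_id not in client:
                if kw.get("raise_on_error", True):
                    raise RuntimeError("bulk aborted at %r" % doc_id)
                errors.append({"delete": {"_id": doc_id, "status": 404}})
                continue
            client.remove(doc_id)
        else:
            client.add(doc_id)
        successes += 1
    return successes, errors
```

With the flags set to False, a delete for a document that never reached ES ends up in the errors list, and the operations after it are still applied.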
Thanks a lot @sorhent for your solution, it works for me!