ArcticDB icon indicating copy to clipboard operation
ArcticDB copied to clipboard

Retry on transient Mongo errors

Open IvoDD opened this issue 8 months ago • 1 comments

Is your feature request related to a problem? Please describe. We have different retry mechanisms for different storages. E.g. S3's sdk does retry & backoff on retryable errors. However for other storages we're missing the needed retry logic. (E.g. Mongo storage doesn't retry on connection errors.)

Describe the solution you'd like Ideally we'd have a common exception handling mechanism which is independent of the storage. This way we can have common configuration parameters regarding the retries. One downside with this approach would be that the retry logic might not be optimal for the specific storage. E.g. default S3 exception handling will probably be better than what we can do which can work with all storages.

At the very least we should add some retry logic for the storages which don't have proper retry & backoff logic.

Describe alternatives you've considered A clear and concise description of any alternative solutions or features you've considered.

Additional context Internal slack thread discussing the issue here. In short: a user was getting E_UNEXPECTED_MONGO_ERROR and was asking whether they should retry on them. We decided to let them retry for now but we should properly fix our retry behavior soonish.

IvoDD avatar Jun 03 '24 07:06 IvoDD