elastic icon indicating copy to clipboard operation
elastic copied to clipboard

Client fails to handle context cancellation correctly

Open MusicAdam opened this issue 6 years ago • 3 comments

Please use the following questions as a guideline to help me answer your issue/question without further inquiry. Thank you.

Which version of Elastic are you using?

[ ] elastic.v7 (for Elasticsearch 7.x) [ ] elastic.v6 (for Elasticsearch 6.x) [ x] elastic.v5 (for Elasticsearch 5.x) [ ] elastic.v3 (for Elasticsearch 2.x) [ ] elastic.v2 (for Elasticsearch 1.x)

Please describe the expected behavior

I expect to cancelling a context set on the client should cancel a request when a constant backoff retrier is configured.

Please describe the actual behavior

The client is incorrectly handling net/http error and is therefore missing context cancellation/deadline timeout and attempts to retry.

Any steps to reproduce the behavior?

Configure a constant backoff retrier, create a count client with an invalid url (to trigger retry), cancel the context. You will see the retrier catching the error and thinking it can retry.

I have fixed this error by catching the underlying url error returned by net/http:

		// Get response
		res, err := c.c.Do((*http.Request)(req).WithContext(ctx))
		// net/http wraps context errors in url.Error, so parse context errors out for proper handling
		if urlErr, ok := err.(*url.Error); ok {
			if urlErr.Err == context.Canceled || urlErr.Err == context.DeadlineExceeded {
				// Proceed, but don't mark the node as dead
				return nil, urlErr.Err
			}
		}

I can open a MR, assuming I successfully identified the issue.

MusicAdam avatar Aug 05 '19 14:08 MusicAdam

Thanks. The elastic.v5 branch is no longer maintained, but we can port back the changes to v6 and v7. It looks like this is basically what we have here, right? And we need to change PerformRequest accordingly (find all occurrences of IsContextErr). If you would create a PR for that, I'd be happy to merge.

olivere avatar Aug 05 '19 14:08 olivere

BTW, please use errors.Is instead of comparing with ==.

I have found this issue researching how does this library handle calling Close on a background bulk processor, when it yet have to flush some documents being indexed, if Close is called as part of handling context cancellation, same context which was given to the bulk processor in the first place. So can a bulk processor operate if the context it was given has been cancelled? I think it should, so this bug in this issue is maybe a feature. :-)

mitar avatar Jan 24 '22 13:01 mitar

Hm, no. It seems closing fails with Post "http://172.17.0.2:9200/_bulk": context canceled.

mitar avatar Jan 24 '22 13:01 mitar