opentelemetry-collector
Ensure OTLP receiver handles consume errors correctly
See the requirements for receivers here.
OTLP/HTTP receiver needs to:
- Return 429 or 503 responses on non-permanent errors.
- Return 400 response on permanent errors.
OTLP/gRPC receiver needs to:
- Return Unavailable on non-permanent errors.
- Return InvalidArgument response on permanent errors.
I think this issue might also cover the idea, discussed during the Collector SIG today in the context of backpressure handling, of using gRPC error codes to give the client a more detailed description of the cause of an error. I'm not creating a new issue for it. Let me know if you think otherwise.
@tigrannajaryan may I work on this? For HTTP it looks pretty simple. For gRPC, I will think it through.
@VihasMakwana sure, it would be great if you can work on this.
Please make sure you read the latest OTLP spec to make sure your implementation matches it: https://github.com/open-telemetry/opentelemetry-proto/blob/main/docs/specification.md
You can also use the receiver contract checker to test the behavior.
OK. For testing, does the above contract checker cover the scenarios for network-based receivers? For OTLP/HTTP we just return a 503/400 response. I think we also need a test suite that implements retry from the client side?
Yes, we need a client that implements the retries correctly, and we should use it in the test. With that in place, you can use the contract checker to verify that the OTLP sender + OTLP receiver combination reacts correctly to the errors it receives from the pipeline.
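As a rough illustration of what "implements the retries correctly" could mean for OTLP/HTTP, here is a small sketch of the client-side decision. The function names are hypothetical, and the retryable set (429, 502, 503, 504) is taken from my reading of the OTLP specification linked above, so please check it against the current spec text. Backoff and `Retry-After` handling are left out to keep it short:

```go
package main

import (
	"fmt"
	"net/http"
)

// isRetryable reports whether an OTLP/HTTP response status should be retried.
// Assumption: the retryable set is 429, 502, 503 and 504.
func isRetryable(status int) bool {
	switch status {
	case http.StatusTooManyRequests,
		http.StatusBadGateway,
		http.StatusServiceUnavailable,
		http.StatusGatewayTimeout:
		return true
	}
	return false
}

// sendWithRetry calls send until it returns a non-retryable status or
// maxAttempts is reached. It returns the last status and the attempts made.
// A real client would wait between attempts (backoff, Retry-After).
func sendWithRetry(send func() int, maxAttempts int) (status int, attempts int) {
	for attempts = 1; attempts <= maxAttempts; attempts++ {
		status = send()
		if !isRetryable(status) {
			return status, attempts
		}
	}
	return status, maxAttempts
}

func main() {
	// Simulated receiver: two retryable failures followed by success.
	responses := []int{503, 429, 200}
	i := 0
	send := func() int {
		s := responses[i]
		i++
		return s
	}

	status, attempts := sendWithRetry(send, 5)
	fmt.Println(status, attempts)
}
```

A test built this way can assert two things: data rejected with 400 is sent exactly once, and data rejected with 503 or 429 is eventually delivered.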
At the very least please add coverage for your code changes to help maintain this code moving forward.
Did #8080 fix this?
@mx-psi Yes, I think we can close this.
OTLP-GRPC was addressed via #8080 and OTLP-HTTP was addressed via https://github.com/open-telemetry/opentelemetry-collector/pull/9357
Thanks for confirming!