[exporter/otlphttp] Add non_retryable_status to skip retries for specific HTTP codes

Open farhan-pasha opened this issue 2 weeks ago • 1 comment

Description

Adds a new non_retryable_status configuration field to the OTLP HTTP exporter's retry_on_failure section, allowing users to specify HTTP status codes that should NOT trigger internal retries.

  • Problem: By default, the OTLP HTTP exporter retries on 429, 502, 503, and 504 per the OTLP specification. In gateway deployments, continuing to retry when backends return rate limits or temporary failures causes queue buildup and potential data loss.

  • Solution: Status codes listed in non_retryable_status are treated as permanent errors even if they are normally retryable, so the exporter does not retry them internally.
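
For readers who want to see the intended behavior in code, here is a minimal sketch of the classification described above. It is not the code from this PR: the names shouldRetry and defaultRetryable are hypothetical, and the only facts taken from the description are the default retryable codes (429, 502, 503, 504) and the rule that a configured non-retryable code suppresses retries.

```go
package main

import "fmt"

// defaultRetryable holds the status codes that the description says are
// retried by default. The map itself is a hypothetical illustration.
var defaultRetryable = map[int]bool{429: true, 502: true, 503: true, 504: true}

// shouldRetry is a hypothetical helper, not the exporter's actual API.
// It reports whether a response with the given status should be retried,
// given the user-configured list of non-retryable codes.
func shouldRetry(status int, nonRetryable []int) bool {
	for _, code := range nonRetryable {
		if code == status {
			// Configured as non-retryable: treat as a permanent error.
			return false
		}
	}
	// Otherwise fall back to the default retryable set.
	return defaultRetryable[status]
}

func main() {
	// Assumed gateway-style setting: do not retry on 429 or 503.
	nonRetryable := []int{429, 503}
	for _, status := range []int{429, 502, 503, 504, 400} {
		fmt.Printf("status=%d retry=%v\n", status, shouldRetry(status, nonRetryable))
	}
}
```

With an empty list the helper falls back to the default set, which matches the "default behavior" case mentioned under Testing.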

Fixes #14228

Testing

  • Unit tests: Added test cases covering default behavior and custom non-retryable codes.
  • Config validation: Added TestConfigValidate with test cases for valid codes, an empty list, and invalid codes (too low/high).
  • All existing tests pass.
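
The validation cases listed above could look roughly like the sketch below. This is an illustration only: validateNonRetryableStatus is a hypothetical name, and the 100 to 599 bounds are an assumption, since the description says only that codes can be "too low" or "too high" without giving the limits.

```go
package main

import "fmt"

// validateNonRetryableStatus is a hypothetical validator, not the PR's code.
// It rejects any code outside an assumed valid HTTP range of 100..599.
func validateNonRetryableStatus(codes []int) error {
	for _, code := range codes {
		if code < 100 || code > 599 {
			return fmt.Errorf("invalid HTTP status code %d: must be between 100 and 599", code)
		}
	}
	// An empty or nil list is valid and leaves default behavior unchanged.
	return nil
}

func main() {
	// Table of cases mirroring the ones named in the Testing section.
	cases := []struct {
		name  string
		codes []int
	}{
		{"valid codes", []int{429, 503}},
		{"empty list", []int{}},
		{"too low", []int{99}},
		{"too high", []int{600}},
	}
	for _, tc := range cases {
		err := validateNonRetryableStatus(tc.codes)
		fmt.Printf("%s: valid=%v\n", tc.name, err == nil)
	}
}
```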

Documentation

  • README.md: Added configuration option description and example usage in gateway mode
  • testdata/config.yaml: Updated sample configuration with commented example

farhan-pasha • Dec 05 '25 13:12

CLA Signed
The committers listed above are authorized under a signed CLA.

  • :white_check_mark: login: farhan-pasha / name: Mohammad Farhan Pasha (a89de0fc50f0fb370d1f85c37bcb61dcd0c4dd99)

Hi everyone, just checking in - is there anything needed from my side to help move this PR forward? Happy to make any adjustments. Thanks! cc: @dmitryax

farhan-pasha • Dec 10 '25 18:12