
InvalidRequest error when using S3 sink to Cloudflare R2

Open fahminlb33 opened this issue 7 months ago • 2 comments

A note for the community

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Problem

When I used the AWS S3 sink with Cloudflare R2 (using an access key with full admin privileges), I got an InvalidRequest error in the log, specifically:

2025-05-11T14:02:22.430273Z ERROR sink{component_kind="sink" component_id=ritsu_cf_r2 component_type=aws_s3}:request{request_id=1}: vector_common::internal_event::service: Service call failed. No retries or retries exhausted. error=Some(ServiceError(ServiceError { source: InvalidRequest(InvalidRequest { message: Some("You can only specify one non-default checksum at a time."), meta: ErrorMetadata { code: Some("InvalidRequest"), message: Some("You can only specify one non-default checksum at a time."), extras: None } }), raw: Response { status: StatusCode(400), headers: Headers { headers: {"date": HeaderValue { _private: H0("Sun, 11 May 2025 14:02:22 GMT") }, "content-type": HeaderValue { _private: H0("application/xml") }, "content-length": HeaderValue { _private: H0("155") }, "connection": HeaderValue { _private: H0("keep-alive") }, "server": HeaderValue { _private: H0("cloudflare") }, "cf-ray": HeaderValue { _private: H0("93e236b21d1b4973-SIN") }} }, body: SdkBody { inner: Once(Some(b"<?xml version=\"1.0\" encoding=\"UTF-8\"?><Error><Code>InvalidRequest</Code><Message>You can only specify one non-default checksum at a time.</Message></Error>")), retryable: true }, extensions: Extensions { extensions_02x: Extensions, extensions_1x: Extensions } } })) request_id=1 error_type="request_failed" stage="sending" internal_log_rate_limit=true

Upon inspecting the debug log, I noticed that multiple checksum headers are set on the request:

/REDACTED-BUCKET/date%3D2025-05-111746972142-2f82f43b-edca-469c-87cd-60c8b730cdf1.log.zst
x-id=PutObject
content-encoding:zstd
content-length:117241
content-md5:H+Yn23ztZapuyNjQKtnrAw==
content-type:text/x-log
host:REDACTED-R2-HOST.r2.cloudflarestorage.com
x-amz-acl:private
x-amz-checksum-crc32:btOYUw==
x-amz-content-sha256:d5179ae36858b21c5814f85fc0afdb47f2c6177ae0d663ff0c559f7840075e9b
x-amz-date:20250511T140222Z
x-amz-sdk-checksum-algorithm:CRC32
x-amz-storage-class:STANDARD
x-amz-user-agent:aws-sdk-rust/1.3.6 ua/2.1 api/s3/1.82.0 os/linux lang/rust/1.85.1 m/U,Z,E

content-encoding;content-length;content-md5;content-type;host;x-amz-acl;x-amz-checksum-crc32;x-amz-content-sha256;x-amz-date;x-amz-sdk-checksum-algorithm;x-amz-storage-class;x-amz-user-agent

Cloudflare R2 has several unimplemented PutObject headers, so I suspected the extra headers were the culprit: both content-md5 and x-amz-checksum-crc32 appear in the signed headers above, which matches the "only one non-default checksum" error message. The sketch below shows how this combination can arise.
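For illustration, here is a minimal sketch (not Vector's actual sink code) of how a PutObject request can end up carrying two checksums with aws-sdk-rust: recent behavior versions of the SDK compute a CRC32 checksum by default and send it as x-amz-checksum-crc32, so a caller that also sets Content-MD5 produces exactly the header combination shown above. The endpoint, bucket, and key below are placeholders.

// Sketch only — assumes the aws-config, aws-sdk-s3, md5, base64, and tokio
// crates. REDACTED-R2-HOST and REDACTED-BUCKET are placeholders.
use aws_config::BehaviorVersion;
use aws_sdk_s3::primitives::ByteStream;
use base64::Engine as _;

#[tokio::main]
async fn main() -> Result<(), aws_sdk_s3::Error> {
    // Under recent behavior versions, the SDK computes a CRC32 checksum for
    // PutObject by default and sends it as x-amz-checksum-crc32.
    let shared = aws_config::defaults(BehaviorVersion::latest())
        .endpoint_url("https://REDACTED-R2-HOST.r2.cloudflarestorage.com")
        .load()
        .await;
    let s3_config = aws_sdk_s3::config::Builder::from(&shared)
        .force_path_style(true)
        .build();
    let client = aws_sdk_s3::Client::from_conf(s3_config);

    let body = b"example payload".to_vec();
    // Setting Content-MD5 explicitly, on top of the SDK's default CRC32,
    // puts two checksums on one request — which R2 rejects.
    let content_md5 = base64::engine::general_purpose::STANDARD
        .encode(md5::compute(&body).0);
    client
        .put_object()
        .bucket("REDACTED-BUCKET")
        .key("test.log.zst")
        .content_md5(content_md5)
        .body(ByteStream::from(body))
        .send()
        .await?;
    Ok(())
}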

Configuration

api:
  enabled: true
  address: 0.0.0.0:8686

sources:
  prometheus:
    type: prometheus_remote_write
    address: 0.0.0.0:9090

sinks:
  ritsu_local_file:
    type: file
    inputs:
      - prometheus
    path: "/remote-write/logs/prom-%Y-%m-%d.csv"
    compression: zstd
    encoding:
      codec: json

  ritsu_cf_r2:
    type: aws_s3
    inputs:
      - prometheus
    bucket: "{{ prometheus_remote_write.output_s3.bucket }}"
    force_path_style: true
    acl: private
    auth:
      access_key_id: "{{ prometheus_remote_write.output_s3.access_key_id }}"
      secret_access_key: "{{ prometheus_remote_write.output_s3.secret_access_key }}"
    endpoint: "{{ prometheus_remote_write.output_s3.endpoint }}"
    region: "{{ prometheus_remote_write.output_s3.region }}"
    buffer:
      type: disk
      when_full: "block"
      max_size: 268435488 # 256 MB
    compression: zstd
    encoding:
      codec: json
    framing:
      method: "newline_delimited"

Version

0.46.1

Debug Output

https://gist.github.com/fahminlb33/b876f464b42d5d97ef9f11b7b130e7d1

Example Data

No response

Additional Context

No response

References

No response

— fahminlb33, May 11 '25

I'm interested in picking this up, if it's available.

— su-shivanshmathur, May 18 '25

> I'm interested in picking this up, if it's available.

Hi @su-shivanshmathur, AFAIK no one is actively working on this. You are welcome to pick it up.

— pront, Jun 10 '25

This change seems to be the cause of the issue:

  • https://github.com/awslabs/aws-sdk-rust/issues/1240
  • https://aws.amazon.com/blogs/aws/introducing-default-data-integrity-protections-for-new-objects-in-amazon-s3/

I followed the documentation (https://docs.aws.amazon.com/sdkref/latest/guide/feature-dataintegrity.html) and set request_checksum_calculation = when_required, without any luck. I'm not sure whether the Rust S3 SDK used by Vector actually reads my global ~/.aws/config or the corresponding environment variable.
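For reference, the two opt-out forms that the linked guide documents look like this; whether they take effect here depends on Vector loading the shared AWS config when it builds the S3 client:

# ~/.aws/config
[default]
request_checksum_calculation = when_required

# or equivalently, as an environment variable:
export AWS_REQUEST_CHECKSUM_CALCULATION=when_required

Programmatically, the same setting can be applied when the client is built. A sketch, assuming a recent aws-config crate that exposes this loader option; this would be a change inside Vector's sink code rather than user configuration:

use aws_config::BehaviorVersion;
use aws_smithy_types::checksum_config::RequestChecksumCalculation;

#[tokio::main]
async fn main() {
    // Opt out of the SDK's default request checksums at client build time.
    let shared = aws_config::defaults(BehaviorVersion::latest())
        .request_checksum_calculation(RequestChecksumCalculation::WhenRequired)
        .load()
        .await;
    let _client = aws_sdk_s3::Client::new(&shared);
}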

— rpsirois, Jul 28 '25