InvalidRequest error when using S3 sink to CloudFlare R2
A note for the community
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
Problem
When I used the AWS S3 sink with CloudFlare R2 (using an access key with full admin privileges), I got an InvalidRequest error in the log, specifically:
2025-05-11T14:02:22.430273Z ERROR sink{component_kind="sink" component_id=ritsu_cf_r2 component_type=aws_s3}:request{request_id=1}: vector_common::internal_event::service: Service call failed. No retries or retries exhausted. error=Some(ServiceError(ServiceError { source: InvalidRequest(InvalidRequest { message: Some("You can only specify one non-default checksum at a time."), meta: ErrorMetadata { code: Some("InvalidRequest"), message: Some("You can only specify one non-default checksum at a time."), extras: None } }), raw: Response { status: StatusCode(400), headers: Headers { headers: {"date": HeaderValue { _private: H0("Sun, 11 May 2025 14:02:22 GMT") }, "content-type": HeaderValue { _private: H0("application/xml") }, "content-length": HeaderValue { _private: H0("155") }, "connection": HeaderValue { _private: H0("keep-alive") }, "server": HeaderValue { _private: H0("cloudflare") }, "cf-ray": HeaderValue { _private: H0("93e236b21d1b4973-SIN") }} }, body: SdkBody { inner: Once(Some(b"<?xml version=\"1.0\" encoding=\"UTF-8\"?><Error><Code>InvalidRequest</Code><Message>You can only specify one non-default checksum at a time.</Message></Error>")), retryable: true }, extensions: Extensions { extensions_02x: Extensions, extensions_1x: Extensions } } })) request_id=1 error_type="request_failed" stage="sending" internal_log_rate_limit=true
Upon inspecting the debug log, I noticed that multiple checksum headers are set in the request (both content-md5 and x-amz-checksum-crc32 appear in the signed headers):
/REDACTED-BUCKET/date%3D2025-05-111746972142-2f82f43b-edca-469c-87cd-60c8b730cdf1.log.zst
x-id=PutObject
content-encoding:zstd
content-length:117241
content-md5:H+Yn23ztZapuyNjQKtnrAw==
content-type:text/x-log
host:REDACTED-R2-HOST.r2.cloudflarestorage.com
x-amz-acl:private
x-amz-checksum-crc32:btOYUw==
x-amz-content-sha256:d5179ae36858b21c5814f85fc0afdb47f2c6177ae0d663ff0c559f7840075e9b
x-amz-date:20250511T140222Z
x-amz-sdk-checksum-algorithm:CRC32
x-amz-storage-class:STANDARD
x-amz-user-agent:aws-sdk-rust/1.3.6 ua/2.1 api/s3/1.82.0 os/linux lang/rust/1.85.1 m/U,Z,E
content-encoding;content-length;content-md5;content-type;host;x-amz-acl;x-amz-checksum-crc32;x-amz-content-sha256;x-amz-date;x-amz-sdk-checksum-algorithm;x-amz-storage-class;x-amz-user-agent
CloudFlare R2 has several unimplemented PutObject headers, so I thought the extra checksum headers were the culprit.
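For reference, the same pair of checksum headers can be produced outside Vector with a plain aws-sdk-s3 PutObject call. This is only a sketch (the endpoint, bucket, key, and body are placeholders for the redacted values), assuming the SDK's default request_checksum_calculation of when_supported is in effect, so a CRC32 checksum is added alongside the Content-MD5 set on the request:

```rust
use aws_sdk_s3::config::Region;
use aws_sdk_s3::{primitives::ByteStream, Client};
use aws_config::BehaviorVersion;

#[tokio::main]
async fn main() -> Result<(), aws_sdk_s3::Error> {
    // Credentials come from the environment (AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY);
    // the endpoint and bucket below are placeholders for the redacted R2 values.
    let shared = aws_config::defaults(BehaviorVersion::latest())
        .region(Region::new("auto"))
        .endpoint_url("https://REDACTED-R2-HOST.r2.cloudflarestorage.com")
        .load()
        .await;
    let client = Client::new(&shared);

    let body = b"hello".to_vec();
    // The debug log above shows both content-md5 and x-amz-checksum-crc32 in the
    // signed headers; setting Content-MD5 here while the SDK's default CRC32
    // checksum calculation is active should reproduce that combination.
    client
        .put_object()
        .bucket("REDACTED-BUCKET")
        .key("repro.log")
        .content_md5("XUFAKrxLKna5cZ2REBfFkg==") // base64 MD5 of b"hello"
        .body(ByteStream::from(body))
        .send()
        .await?;
    Ok(())
}
```

If R2 rejects this with the same InvalidRequest message, the problem is independent of Vector and comes down to how the SDK builds the request.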
Configuration
api:
  enabled: true
  address: 0.0.0.0:8686
sources:
  prometheus:
    type: prometheus_remote_write
    address: 0.0.0.0:9090
sinks:
  ritsu_local_file:
    type: file
    inputs:
      - prometheus
    path: "/remote-write/logs/prom-%Y-%m-%d.csv"
    compression: zstd
    encoding:
      codec: json
  ritsu_cf_r2:
    type: aws_s3
    inputs:
      - prometheus
    bucket: "{{ prometheus_remote_write.output_s3.bucket }}"
    force_path_style: true
    acl: private
    auth:
      access_key_id: "{{ prometheus_remote_write.output_s3.access_key_id }}"
      secret_access_key: "{{ prometheus_remote_write.output_s3.secret_access_key }}"
    endpoint: "{{ prometheus_remote_write.output_s3.endpoint }}"
    region: "{{ prometheus_remote_write.output_s3.region }}"
    buffer:
      type: disk
      when_full: "block"
      max_size: 268435488 # 256 MB
    compression: zstd
    encoding:
      codec: json
    framing:
      method: "newline_delimited"
Version
0.46.1
Debug Output
https://gist.github.com/fahminlb33/b876f464b42d5d97ef9f11b7b130e7d1
Example Data
No response
Additional Context
No response
References
No response
Interested — is this available to be picked up?
Hi @su-shivanshmathur, AFAIK no one is actively working on this. You are welcome to pick it up.
This change seems to be the issue?
- https://github.com/awslabs/aws-sdk-rust/issues/1240
- https://aws.amazon.com/blogs/aws/introducing-default-data-integrity-protections-for-new-objects-in-amazon-s3/
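If that change is the cause, one possible fix on the Vector side would be to opt the S3 client out of the new default when the client is built. A minimal sketch, assuming the request_checksum_calculation option introduced by that SDK change is exposed on the S3 config builder under this name (the exact re-export path may differ between SDK versions):

```rust
use aws_config::BehaviorVersion;
use aws_sdk_s3::config::RequestChecksumCalculation; // assumed re-export path
use aws_sdk_s3::Client;

async fn build_r2_client(endpoint: &str) -> Client {
    // Only compute the flexible checksum when an operation requires it, so
    // PutObject is not sent with both Content-MD5 and x-amz-checksum-crc32.
    let shared = aws_config::defaults(BehaviorVersion::latest())
        .endpoint_url(endpoint)
        .load()
        .await;
    let s3_config = aws_sdk_s3::config::Builder::from(&shared)
        .request_checksum_calculation(RequestChecksumCalculation::WhenRequired)
        .build();
    Client::from_conf(s3_config)
}
```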
I followed the documentation (https://docs.aws.amazon.com/sdkref/latest/guide/feature-dataintegrity.html) and set request_checksum_calculation = when_required, without any luck. I'm not sure whether the Rust S3 SDK used by Vector actually reads my global .aws/config, or the corresponding environment variable for the same setting.
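For anyone trying to reproduce this, the setting from that guide looks like the following (a sketch assuming the default profile is the one in use):

```ini
# ~/.aws/config
[default]
request_checksum_calculation = when_required
```

The corresponding environment variable on that page should be AWS_REQUEST_CHECKSUM_CALCULATION=when_required, which may be worth trying separately in case the SDK as built into Vector does not read the shared config files at all.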