azure_blob failed to upload logs even when acknowledgements are enabled
Problem
I'm testing Vector's azure_blob sink to upload logs from Kafka. I noticed ERROR logs like the ones below, and whenever they occur, the corresponding logs are missing from Azure Blob. Acknowledgements are enabled, but Vector simply carries on uploading newer logs. As a result, the blob /container/topics/audit-logs-v2/year=2023/month=12/day=31/vector-1703996260-6d0af193-9f7d-4671-91e6-be18466bd55a.log.gz was never uploaded, and the events in it were lost.
2023-12-31T04:17:58.524831Z ERROR sink{component_kind="sink" component_id=azure_blob_basic component_type=azure_blob}:request{request_id=8679}: vector::sinks::util::retries: Unexpected error type; dropping the request. error=failed to execute `reqwest` request internal_log_rate_limit=true
2023-12-31T04:17:58.524869Z WARN sink{component_kind="sink" component_id=azure_blob_basic component_type=azure_blob}:request{request_id=8679}: vector::sinks::util::adaptive_concurrency::controller: Unhandled error response. error=failed to execute `reqwest` request internal_log_rate_limit=true
2023-12-31T04:17:58.532435Z ERROR sink{component_kind="sink" component_id=azure_blob_basic component_type=azure_blob}:request{request_id=8679}: vector_common::internal_event::service: Service call failed. No retries or retries exhausted. error=Some(Error { context: Full(Custom { kind: Io, error: reqwest::Error { kind: Request, url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("xxx.blob.core.windows.net")), port: None, path: "/container/topics/audit-logs-v2/year=2023/month=12/day=31/vector-1703996260-6d0af193-9f7d-4671-91e6-be18466bd55a.log.gz", query: None, fragment: None }, source: hyper::Error(Io, Os { code: 104, kind: ConnectionReset, message: "Connection reset by peer" }) } }, "failed to execute `reqwest` request") }) request_id=8679 error_type="request_failed" stage="sending" internal_log_rate_limit=true
2023-12-31T04:17:58.532465Z ERROR sink{component_kind="sink" component_id=azure_blob_basic component_type=azure_blob}:request{request_id=8679}: vector_common::internal_event::component_events_dropped: Events dropped intentional=false count=337172 reason="Service call failed. No retries or retries exhausted." internal_log_rate_limit=true
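The first ERROR line ("Unexpected error type; dropping the request") suggests that the connection reset (OS error 104) was not classified as retriable. As a rough illustration of the behavior I expected, here is a minimal Python sketch. It is not Vector's actual retry logic: `is_retriable`, `send_with_retries`, and `RETRIABLE_ERRNOS` are hypothetical names, and backoff is omitted for brevity.

```python
import errno

# Hypothetical set of transient transport errors; an assumption for illustration,
# not taken from Vector's source code.
RETRIABLE_ERRNOS = {errno.ECONNRESET, errno.ECONNABORTED, errno.ETIMEDOUT, errno.EPIPE}


def is_retriable(error: BaseException) -> bool:
    """Return True for transient transport failures that are usually safe to retry."""
    if isinstance(error, (ConnectionError, TimeoutError)):
        return True
    return isinstance(error, OSError) and error.errno in RETRIABLE_ERRNOS


def send_with_retries(send, payload, max_attempts=3):
    """Call send(payload), retrying transient errors up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send(payload)
        except Exception as error:
            # Give up on non-transient errors or once the attempts are used up,
            # so the caller can mark the batch as failed instead of delivered.
            if not is_retriable(error) or attempt == max_attempts:
                raise
```

With a classification like this, a "Connection reset by peer" would lead to another upload attempt, and only a persistent failure would surface to the caller.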
I think this happens because Vector mishandles the failed events: the acknowledgement still appears to be processed even when this error occurs.
The documentation says: "Additionally, Vector ensures that the batch notifier for an event is always updated, whether or not the event made it to a sink. This ensures that if an event is intentionally dropped (for example, by using a filter transform) or even unintentionally dropped (maybe Vector had a bug, uh oh!), we still update the batch notifier to indicate the processing status of the event."
https://vector.dev/docs/about/under-the-hood/architecture/end-to-end-acknowledgements/
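To make the expectation concrete, here is a simplified Python model of how I understand end-to-end acknowledgements should interact with the Kafka source. It is a sketch under my own assumptions, not Vector's implementation: `EventStatus`, `batch_status`, and `next_committed_offset` are hypothetical names chosen for illustration.

```python
from enum import Enum


class EventStatus(Enum):
    """Hypothetical per-event outcomes, ordered from best to worst."""
    DELIVERED = 0
    ERRORED = 1   # transient failure, e.g. a connection reset
    REJECTED = 2  # permanent failure


def batch_status(statuses):
    """The worst status of any event decides the status of the whole batch."""
    return max(statuses, key=lambda s: s.value, default=EventStatus.DELIVERED)


def next_committed_offset(committed, batch_end_offset, statuses):
    """Advance the committed offset only if every event in the batch was delivered."""
    if batch_status(statuses) is EventStatus.DELIVERED:
        return batch_end_offset
    # Keep the old offset so the events can be consumed again later.
    return committed
```

Under this model, the dropped request above (count=337172) would leave the consumer offset unchanged, so the events would be read again rather than lost.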
Configuration
sources: {
  kafka_sink_basic: {
    auto_offset_reset: "latest",
    type: "kafka",
    bootstrap_servers: bootstrap_url,
    group_id: "vector-basic",
    topics: [ "audit-logs" ],
  },
},
sinks: {
  azure_blob_basic: {
    type: "azure_blob",
    connection_string: "${CON_STRING}",
    endpoint: endpoint_url,
    container_name: "container",
    inputs: [ 'kafka_sink_basic' ],
    blob_prefix: "topics/audit-logs-v2/year=%Y/month=%m/day=%d/vector-",
    batch: {
      max_bytes: config.basicMaxBytes,
      timeout_secs: config.basicTimeoutSecs,
    },
    encoding: {
      codec: "raw_message",
    },
    buffer: {
      when_full: "block",
    },
    acknowledgements: {
      enabled: true,
    },
    compression: "gzip",
  },
},
Version
0.34.1
Debug Output
No response
Example Data
No response
Additional Context
No response
References
No response
Related https://github.com/vectordotdev/vector/issues/10870