Final newline character missing when using AWS S3 sink
A note for the community
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
Problem
When using the AWS S3 sink to upload local logs, the last line of each uploaded object is missing a newline character at the end. This is not the case when using the File sink.
It seems to be caused by the S3 sink calling encoder.serialize on the last event of the batch (https://github.com/vectordotdev/vector/blob/master/src/sinks/util/encoding.rs#L51) while the file sink calls encoder.encode on all events (https://github.com/vectordotdev/vector/blob/master/src/sinks/file/mod.rs#L427).
I would expect these to behave in the same way using the provided config.
Configuration
[sources.messages]
type = "file"
include = [
"/var/log/messages",
]
[sinks.s3]
inputs = ["messages"]
type = "aws_s3"
bucket = "<redacted>"
region = "us-east-1"
key_prefix = "<redacted>"
healthcheck.enabled = false
filename_append_uuid = false
filename_extension = ""
compression = "none"
encoding.codec = "text"
auth.access_key_id = "<redacted>"
auth.secret_access_key = "<redacted>"
batch.timeout_secs = 60
[sinks.file]
type = "file"
inputs = [ "messages" ]
path = "/tmp/messages-%Y%m%d-%H%M%S.log"
encoding.codec = "text"
Version
vector 0.40.0 (aarch64-unknown-linux-gnu 1167aa9 2024-07-29 15:08:44.028365803)
Debug Output
No response
Example Data
No response
Additional Context
No response
References
No response
Good spot @miquelruiz
I think https://github.com/vectordotdev/vector/blob/f7e4470f6e1a415c8b93fbc290ad9c6d05d77f14/src/sinks/util/encoding.rs#L48-L59 is attempting to handle proper batch encoding when the batch is being encoded as a JSON array. If encode is used, a , is appended to each element as it is encoded, but when encoding as a JSON array the trailing , isn't wanted which is why serialize is used instead.
I think a possible fix for this would be to update:
https://github.com/vectordotdev/vector/blob/f7e4470f6e1a415c8b93fbc290ad9c6d05d77f14/src/codecs/encoding/encoder.rs#L96-L104
To have it add a \n if the framer is Framer::NewlineDelimited.
Opened a PR: https://github.com/vectordotdev/vector/pull/21097