aws-fluent-plugin-kinesis
Secondary formatting
I'm trying to use a file or S3 output as a <secondary> for when sending events to Kinesis fails. However, the way the plugin formats incoming events makes the secondary file or S3 object hard to read.
Original message (json) -
{"test":"this is a test message", "message_id":"1234"}
{"test":"this is a test message", "message_id":"123456"}
{"test":"this is a test message", "message_id":"456789"}
Message in secondary output (file or S3) -
36cbd167849700a041b3bd691b014937íƒì{"test":"this is a test message", "message_id":"1234"}Ÿ c762173994c445b6927e569fa0821e6fíƒî{"test":"this is a test message", "message_id":"123456"}Ÿ 3cb59c12b2209ec3728795c6c58af6abíƒì{"test":"this is a test message", "message_id":"456789"}Ÿ
This happens because the plugin's format method attaches a hex partition key to each event before it is buffered:
https://github.com/awslabs/aws-fluent-plugin-kinesis/blob/master/lib/fluent/plugin/out_kinesis_streams.rb#L39
Would it make sense to apply only the configured formatter in the format method and calculate the hex partition key in the write method instead? That way secondary outputs would keep the desired formatting.
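To make the proposal concrete, here is a minimal Ruby sketch of the split. It is not the plugin's actual code: `SketchOutput` is a hypothetical stand-in class, the chunk is modeled as a plain newline-delimited string, and the key is generated with `SecureRandom.hex(16)` purely as an assumption (it yields 32 hex characters, the same length as the keys in the output above). It also assumes the formatter never emits embedded newlines.

```ruby
require "json"
require "securerandom"

# Hypothetical stand-in for the output plugin; NOT the real plugin class.
class SketchOutput
  attr_reader :sent

  def initialize
    @sent = [] # records that would be handed to the Kinesis API
  end

  # format: emit only the formatted event, one JSON document per line,
  # so whatever ends up in the buffer chunk (and in a <secondary>) stays readable.
  def format(_tag, _time, record)
    JSON.generate(record) + "\n"
  end

  # write: split the chunk back into events and attach the partition key
  # only at send time. The key generation here is an assumption.
  def write(chunk_data)
    chunk_data.each_line do |line|
      data = line.chomp
      next if data.empty?
      @sent << { data: data, partition_key: SecureRandom.hex(16) }
    end
  end
end

out = SketchOutput.new
records = [
  { "test" => "this is a test message", "message_id" => "1234" },
  { "test" => "this is a test message", "message_id" => "123456" }
]

# The chunk now contains nothing but the formatted events.
chunk = records.map { |r| out.format("carting.test", Time.now.to_i, r) }.join
out.write(chunk)
```

With this split, a secondary output that dumps the chunk as-is would receive plain JSON lines, while the partition key exists only in the request built inside write. The trade-off is that write has to re-split the chunk, which only works if each formatted event is guaranteed to occupy exactly one line.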
Environment -
td-agent3 running in an Ubuntu 14.04 container, with fluent-plugin-kinesis-2.1.1
config -
<match carting.*>
  # plugin type
  @log_level debug
  @type kinesis_streams
  # your kinesis stream name
  stream_name test_stream
  # aws region
  region us-east-1
  <buffer>
    retry_max_times 3
    flush_interval 10s
    flush_thread_interval 0.1
    flush_thread_burst_interval 0.01
    flush_thread_count 4
  </buffer>
  <format>
    @type json
  </format>
  <secondary>
    @type file
    path /fluentd/log/failed_events
    <format>
      @type json
    </format>
  </secondary>
</match>
I'm facing the same issue with secondary output (https://docs.fluentd.org/output#secondary-output).
My logs look like this:
�={"key":"value"}�$3544c5eb-6536-11eb-8db1-0ea64c53eca3
Here, 3544c5eb-6536-11eb-8db1-0ea64c53eca3 is the partition key.