fluent-logger-ruby
encoding issue UnicodeDecodeError
Something in the logging pipeline is breaking the encoding, but only for some characters. I'm having a hard time reproducing the issue and cannot pinpoint the cause, but it seems to happen at the fluentd level: either in the logger or in fluentd itself.
I deployed fluentd in production and send events from a Rails app using this logger. The logger is configured to send events to fluentd, which writes them to S3 as gzip files. A processing pipeline then consumes these files, and that is where I started seeing the issue.
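Since the problem is hard to reproduce, one way to narrow it down is to scan the stored objects for lines that are not valid UTF-8. The following is a minimal sketch, assuming each gzip file holds newline-delimited JSON records; `invalid_utf8_lines` is a hypothetical helper name, not part of any library.

```ruby
require "zlib"
require "stringio"

# Hypothetical helper: returns [line_number, raw_line] pairs for every line
# in a gzip stream whose bytes are not valid UTF-8.
def invalid_utf8_lines(io)
  bad = []
  lineno = 0
  gz = Zlib::GzipReader.new(io)
  gz.each_line do |line|
    lineno += 1
    # Check the raw bytes against UTF-8 regardless of how the line is tagged.
    candidate = line.dup.force_encoding(Encoding::UTF_8)
    bad << [lineno, line] unless candidate.valid_encoding?
  end
  gz.close
  bad
end

# Example with an in-memory gzip payload; the second record contains a lone 0xEA byte.
payload = "{\"ua\":\"ok\"}\n{\"ua\":\"Flu\xEAncia\"}\n".b
invalid_utf8_lines(StringIO.new(Zlib.gzip(payload))).each do |lineno, line|
  puts "line #{lineno}: #{line.inspect}"
end
```

For a downloaded S3 object, the same helper could be called with `File.open(path, "rb")` instead of the `StringIO`.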
client config
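require "fluent-logger"

client = Fluent::Logger::FluentLogger.new(
  nil, # no tag prefix
  host: "localhost",
  port: 24224,
  use_nonblock: true,
  wait_writeable: false
)
# `event` is the hash of request information built by the Rails app
client.post("foo", event)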
fluentd config
<match foo.**>
  @type s3
  @id S3_output
  s3_bucket my-bucket
  s3_region us-east-1
  acl bucket-owner-full-control
  store_as gzip_command
  path preprocessed_logs/year=%Y/month=%-m/day=%-d/hour=%-H
  s3_object_key_format "%{path}/#{Socket.gethostname}_%{hex_random}_%{index}.%{file_extension}"
  <buffer time>
    timekey 300
    timekey_use_utc true
    timekey_wait 30
    @type file
    path /var/log/td-agent/buffer/foo
  </buffer>
  <format>
    @type json
  </format>
</match>
It seems that some characters are badly encoded. Here is a user agent example:
'Mozilla/5.0 (iPhone; CPU iPhone OS 13_4_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Clube da Fluência'
was logged as:
'Mozilla/5.0 (iPhone; CPU iPhone OS 13_4_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Clube da Flu\xeancia'
ê got changed to \xea, which is breaking decoding.
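For reference, the single byte 0xEA is how ISO-8859-1 (Latin-1) encodes ê, whereas UTF-8 uses the two bytes 0xC3 0xAA. The short Ruby sketch below shows the difference at the byte level and why a strict UTF-8 decoder rejects the logged value. It only illustrates the symptom and does not identify where in the pipeline the bytes change; `hex_bytes` is a helper defined here for the illustration.

```ruby
# Helper for this illustration: render a string's bytes as hex.
def hex_bytes(str)
  str.bytes.map { |b| format("%02x", b) }.join(" ")
end

utf8   = "Flu\u00EAncia"                      # "Fluência" as UTF-8
latin1 = utf8.encode(Encoding::ISO_8859_1)    # same text as Latin-1

puts hex_bytes(utf8)    # 46 6c 75 c3 aa 6e 63 69 61
puts hex_bytes(latin1)  # 46 6c 75 ea 6e 63 69 61

# Treating the Latin-1 bytes as UTF-8 gives an invalid string, because 0xEA
# starts a multi-byte sequence that the following "n" does not continue.
relabeled = latin1.dup.force_encoding(Encoding::UTF_8)
puts relabeled.valid_encoding?  # false
```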
Do you think this might have something to do with how the logger sends data to fluentd?
To add more context: I'm using this logger in a Rails app, and what I log is request information. I have checked the Rails side, and the string passed to the logger is UTF-8 encoded.
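One thing worth double-checking on the Rails side: in Ruby, `String#encoding` only reports the label attached to a string, while `String#valid_encoding?` inspects the actual bytes, so a string can be tagged UTF-8 and still contain a stray 0xEA. Below is a minimal sketch of normalizing an event before posting it. `utf8_safe` is a hypothetical helper, not part of fluent-logger-ruby, and treating binary strings as UTF-8 is an assumption that may not hold for every input.

```ruby
# Hypothetical helper (not part of fluent-logger-ruby): returns a copy of
# `value` in which every String value is valid UTF-8. Hash keys are left as-is.
def utf8_safe(value)
  case value
  when String
    str = value.dup
    if str.encoding == Encoding::ASCII_8BIT
      # Binary strings carry no charset; assume the bytes are meant as UTF-8.
      str.force_encoding(Encoding::UTF_8)
    elsif str.encoding != Encoding::UTF_8
      # Transcode strings that are tagged with another charset.
      str = str.encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
    end
    # Replace any remaining invalid byte sequences with "?".
    str.scrub("?")
  when Hash
    value.each_with_object({}) { |(key, val), out| out[key] = utf8_safe(val) }
  when Array
    value.map { |item| utf8_safe(item) }
  else
    value
  end
end
```

With the client from the config above, the call would become `client.post("foo", utf8_safe(event))`. Logging the original value with `inspect` whenever `valid_encoding?` is false may also help to find which requests carry the bad bytes.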
Fluentd treats data as binary by default. If you hit an encoding problem, one option is to convert the encoding with record_modifier or a similar plugin.
https://docs.fluentd.org/quickstart/faq#i-got-encoding-error-inside-plugin-how-to-fix-it
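A sketch of what such a filter could look like, based on the `char_encoding` option of fluent-plugin-record-modifier. The `foo.**` pattern is an assumption chosen to match the output section above, and the plugin has to be installed separately. Note that `char_encoding utf-8` only relabels strings as UTF-8. If the bytes are really in another charset (the 0xEA above looks like Latin-1), the plugin's `from:to` form is the one that converts, and it would apply to every string in the matched records.

```
<filter foo.**>
  @type record_modifier
  # Assumption: mark record strings as UTF-8 before the S3 output formats them
  char_encoding utf-8
</filter>
```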
@repeatedly thanks, let me try this next week.