Emoji gets Unicode-escaped in file output when running fluent-bit with Docker
Bug Report
Describe the bug
We have emojis in our log messages. They are carried through the fluent-bit pipeline correctly. However, in the last step, if the output is s3 or file, the emojis are Unicode-escaped in the produced file.
To Reproduce
Create a simple test input file input.log, which contains only one line:
🌈
Run fluent-bit with Docker using file output:
docker run -it --rm -v $(pwd):/data fluent/fluent-bit:2.2.2 /fluent-bit/bin/fluent-bit -i tail -p read_from_head=true -p exit_on_eof=true -p path=/data/input.log -o file -p path=/data -p file=output.log --verbose
In the output.log file, the emoji is Unicode-escaped:
tail.0: [1708941282.353294128, {"log":"\u1f308"}]
Expected behavior
The output should be:
tail.0: [1708941282.353294128, {"log":"🌈"}]
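For reference, the escaped form in output.log is not just cosmetic. A JSON `\u` escape takes exactly four hex digits, so `\u1f308` does not decode back to the emoji. The sketch below uses only Python's standard library to show the code point, the UTF-8 bytes, the surrogate-pair form a conforming ASCII-only JSON encoder would emit, and what the string from output.log actually decodes to. It illustrates the encoding only and makes no claim about fluent-bit's internals.

```python
import json

rainbow = "\U0001F308"  # the emoji from input.log

# Code point and UTF-8 encoding of the character.
print(hex(ord(rainbow)))                 # 0x1f308
print(rainbow.encode("utf-8").hex(" "))  # f0 9f 8c 88

# An ASCII-only JSON encoder must use a UTF-16 surrogate pair here.
print(json.dumps({"log": rainbow}))      # {"log": "\ud83c\udf08"}

# Keeping the raw UTF-8 matches the expected output in this report.
print(json.dumps({"log": rainbow}, ensure_ascii=False))

# "\u" consumes four hex digits, so "\u1f308" is U+1F30 followed by "8".
decoded = json.loads('{"log":"\\u1f308"}')["log"]
print([hex(ord(ch)) for ch in decoded])  # ['0x1f30', '0x38']
```

So a consumer that parses the escaped line as JSON gets two unrelated characters rather than the rainbow emoji.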
Your Environment
- Version used: 2.2.2
- Operating System and version: macOS Sonoma 14.3.1, but this might not be relevant
- Filters and plugins: None
- Docker version:
Client: Docker Engine - Community
 Version: 25.0.3
 API version: 1.44
 Go version: go1.21.6
 Git commit: 4debf411d1
 Built: Tue Feb 6 20:42:40 2024
 OS/Arch: darwin/arm64
 Context: default
Server: Docker Engine - Community
 Engine:
  Version: 25.0.3
  API version: 1.44 (minimum version 1.24)
  Go version: go1.21.6
  Git commit: f417435
  Built: Tue Feb 6 21:14:58 2024
  OS/Arch: linux/arm64
  Experimental: false
 containerd:
  Version: 1.6.28
  GitCommit: ae07eda36dd25f8a1b98dfbf587313b99c0190bb
 runc:
  Version: 1.1.12
  GitCommit: v1.1.12-0-g51d5e94
 docker-init:
  Version: 0.19.0
  GitCommit: de40ad0
Additional context
Using the same input.log example, other scenarios work as expected:
Case 1: run fluent-bit with Docker using stdout as output. Works as expected.
Command:
docker run -it --rm -v $(pwd):/data fluent/fluent-bit:2.2.2 /fluent-bit/bin/fluent-bit -i tail -p read_from_head=true -p exit_on_eof=true -p path=/data/input.log -o stdout
Output (from the log):
[0] tail.0: [[1708941170.693859307, {}], {"log"=>"🌈"}]
Case 2: run fluent-bit without Docker using file as output. Works as expected.
Command:
fluent-bit -i tail -p read_from_head=true -p exit_on_eof=true -p path=input.log -o file -p file=output.log --verbose
Output (from output.log file):
tail.0: [1708941936.998975000, {"log":"🌈"}]
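To compare the cases without eyeballing the files, a small check along these lines could be run against each output line. This is only a sketch: the regular expression and the `has_unicode_escape` helper are hypothetical names introduced here, and the two sample lines are copied from the outputs in this report.

```python
import re

# Matches a literal backslash-u followed by at least four hex digits.
UNICODE_ESCAPE = re.compile(r"\\u[0-9a-fA-F]{4,}")


def has_unicode_escape(line: str) -> bool:
    """Return True if the line contains a literal backslash-u escape."""
    return UNICODE_ESCAPE.search(line) is not None


# Line produced by the Docker run with file output.
docker_line = 'tail.0: [1708941282.353294128, {"log":"\\u1f308"}]'
# Line produced by the run without Docker (Case 2).
native_line = 'tail.0: [1708941936.998975000, {"log":"🌈"}]'

print(has_unicode_escape(docker_line))  # True
print(has_unicode_escape(native_line))  # False
```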
I also tried building the image with debian:bullseye-slim as the base image. I installed locales:
apt-get -qq install --no-install-recommends locales && \
echo "en_US.UTF-8 UTF-8" > /etc/locale.gen && \
dpkg-reconfigure --frontend=noninteractive locales && \
update-locale LANG=en_US.UTF-8
and set the environment variables accordingly in the Dockerfile:
ENV LANG en_US.UTF-8
ENV LANGUAGE en_US.UTF-8
ENV LC_ALL en_US.UTF-8
It did not change the result.
Confirmed that the issue exists for the following env.
OS: Amazon Linux 2023 arm
fluent bit: [fluent bit] version=3.0.3
BTW, if I change the OS architecture to x86, the issue disappears.
Building with -fsigned-char on ARM resolves this issue.
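Some background on why this flag can matter: in C, whether plain `char` is signed is implementation-defined. GCC on x86 Linux treats it as signed, while on ARM Linux it is unsigned by default, and `-fsigned-char` forces the signed behavior. The same UTF-8 byte can therefore compare differently on the two architectures. The sketch below models this with Python's `ctypes`. The `looks_multibyte` check is hypothetical and is not taken from the fluent-bit source, which this thread does not quote.

```python
import ctypes

# First UTF-8 byte of the rainbow emoji (U+1F308 encodes as f0 9f 8c 88).
lead_byte = "\U0001F308".encode("utf-8")[0]

# The same 8 bits read as C `signed char` and as C `unsigned char`.
as_signed = ctypes.c_byte(lead_byte).value
as_unsigned = ctypes.c_ubyte(lead_byte).value


def looks_multibyte(c: int) -> bool:
    """Hypothetical check for illustration only (not fluent-bit's code)."""
    return c >= 0x80


print(as_signed, looks_multibyte(as_signed))      # -16 False
print(as_unsigned, looks_multibyte(as_unsigned))  # 240 True
```

Any comparison of this kind takes a different branch depending on the signedness of `char`, which is consistent with the issue appearing on arm64 but not on x86.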
@RamaMalladiAWS sorry, that's a CMake flag, right? I think we need to add this to the AWS for Fluent Bit distro. Would you like to submit a GitHub commit with the diff of the change?
Yes, I can do that.
I submitted PR: https://github.com/fluent/fluent-bit/pull/8851.
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.
We are waiting on the merge of PR: https://github.com/fluent/fluent-bit/pull/8851.
This issue was closed because it has been stalled for 5 days with no activity.
@ensean Hey, just wanted to thank you for this
Confirmed that the issue exists for the following env.
OS: Amazon Linux 2023 arm
fluent bit: [fluent bit] version=3.0.3
BTW, if I change the OS architecture to x86, the issue disappears.
I was going crazy with Fluent Bit escaping my emojis, and redeploying my app on x86 fixed the issue.