fluentd
fluentd copied to clipboard
Td-agent fluentd ruby processess goes into hung state when regex pattern is provided in tail plugin and log messages contains garbage/control (e.g. record separator) characters.
Describe the bug
Any log message which contains garbage/control character like “Record Separator” (which only shows in notepad++, copy paste to other editor does not show those characters, please refer screenshot)
While the same expression and sample data parse/filter the data correctly in https://fluentular.herokuapp.com/ validator correctly (please refer https://fluentular.herokuapp.com/parse?regexp=.%3F%28%3F%3Clevel%3E.%3F%29%5B+%5Ct%5D%2B%28%3F%3Ctime%3E%5Cd%7B1%2C2%7D%5C%2F%5Cd%7B1%2C2%7D%5B+%5Ct%5D%2B%5Cd%7B1%2C2%7D%3A%5Cd%7B1%2C2%7D%3A%5Cd%7B1%2C2%7D%5C.%5Cd%7B1%2C3%7D%29%5B+%5Ct%5D%2B%5C%5B%28%3F%3Cthread%3E.%29%5C%5D%5B+%5Ct%5D%2BUserAccount+Adding+user+with+TenantName+as+%28%3F%3Cusername%3E.%3F%29%5C*&input=INFO++08%2F16+19%3A31%3A39.214+%5Bhttps-jsse-nio-443-exec-3%5D+UserAccount+Adding+user+with+TenantName+as+AAAmonitoring%1E*&time_format=%25m%2F%25d++%25H%3A%25M%3A%25S.%25L).
below is sample log message screenshot
Sample regex: .?(?
Sample data :
INFO 08/31 19:31:39.214 [https-jsse-nio-443-exec-19] UserAccount Adding user with TenantName as XYZ<here_Record separatorappears>*
To Reproduce
Install td-agent on windows and configure td-agent.conf file with tail plugin with type regex. tail the file which contains control characters like record separator. the ruby processes goes into hung state and tail plugin stops working.
Expected behavior
tail plugin should be able to parse the log messages containing control characters and continue tailing. fluentular validator able to parse the message just file, so fluentd as well.
Your Environment
- Fluentd version: 1.15.2
- TD Agent version:4.4.1 (msi installer td-agent-4.4.1-x64.msi)
- Operating system:Windows Server 2012 R2 and 2019
- Kernel version:
Your Configuration
####
## Output descriptions:
##
<source>
@type tail
path C:/logs/UsrAccount_New.log
tag DevOpsWinDebug_logingest
read_from_head true
pos_file C:/opt/file-UsrLogs.log.pos
follow_inodes true
refresh_interval 2
enable_stat_watcher false
from_encoding UTF-8
encoding UTF-8
exclude_path [""]
multiline_flush_interval 5s
<parse>
@type regexp
expression /.*?(?<level>.*?)[ \t]+(?<time>\d{1,2}\/\d{1,2}[ \t]+\d{1,2}:\d{1,2}:\d{1,2}\.\d{1,3})[ \t]+\[(?<thread>.*)\][ \t]+UserAccount Adding user with TenantName as (?<username>.*?)\*/
time_format %m/%d %H:%M:%S.%L
time_key time
</parse>
</source>
<match DevOpsWinDebug_logingest>
@type stdout
</match>
## match tag=debug.** and dump to console
<match DevOpsWinDebug_logingest>
@type stdout
@id output_stdout
</match>
####
## Source descriptions:
##
## built-in TCP input
## @see http://docs.fluentd.org/articles/in_forward
<source>
@type forward
@id input_forward
</source>
## built-in UNIX socket input
#<source>
# type unix
#</source>
# HTTP input
# POST http://localhost:8888/<tag>?json=<json>
# POST http://localhost:8888/td.myapp.login?json={"user"%3A"me"}
# @see http://docs.fluentd.org/articles/in_http
<source>
@type http
@id input_http
port 8888
</source>
## live debugging agent
<source>
@type debug_agent
@id input_debug_agent
bind 127.0.0.1
port 24230
</source>
Your Error Log
No errors , ruby process just goes into hung state and forcefully kill the ruby processes to stop.
2022-09-01 01:11:01 -1200 [info]: starting fluentd-1.15.2 pid=3776 ruby="2.7.6"
2022-09-01 01:11:01 -1200 [info]: spawn command to main: cmdline=["C:/opt/td-agent/bin/ruby.exe", "-Eascii-8bit:ascii-8bit", "C:/opt/td-agent/bin/fluentd", "--under-supervisor"]
2022-09-01 01:11:03 -1200 [info]: adding match pattern="DevOpsWinDebug_logingest" type="stdout"
2022-09-01 01:11:03 -1200 [info]: adding match pattern="DevOpsWinDebug_logingest" type="stdout"
2022-09-01 01:11:03 -1200 [info]: adding source type="tail"
2022-09-01 01:11:03 -1200 [warn]: #0 'encoding' and 'from_encoding' are same encoding. No effect
2022-09-01 01:11:03 -1200 [info]: adding source type="forward"
2022-09-01 01:11:03 -1200 [info]: adding source type="http"
2022-09-01 01:11:03 -1200 [info]: adding source type="debug_agent"
2022-09-01 01:11:03 -1200 [info]: #0 starting fluentd worker pid=3100 ppid=3776 worker=0
2022-09-01 01:11:03 -1200 [info]: #0 [input_debug_agent] listening dRuby uri="druby://127.0.0.1:24230" object="Fluent::Engine" worker=0
2022-09-01 01:11:03 -1200 [info]: #0 [input_forward] listening port port=24224 bind="0.0.0.0"
2022-09-01 01:11:03 -1200 [info]: #0 following tail of C:/logs/UsrAccount_New.log
2022-09-01 01:11:03 -1200 [warn]: #0 pattern not matched: "INFO 08/25 06:21:08.252 [https-jsse-nio-443-exec-14] UserAccount Connection Retry Count set for GET 3"
2022-09-01 01:11:03 -1200 [warn]: #0 pattern not matched: "INFO 08/25 06:21:08.534 [https-jsse-nio-443-exec-8] UserAccount Connection Retry Count set for GET 3"
Additional context
The td-agent just works fine with parse @type none but displays ASCII char of record separator -- 2022-09-01 00:25:26.854032200 -1200 DevOpsWinDebug_logingest: {"message":"INFO 08/25 10:12:55.799 [https-jsse-nio-443-exec-11] UserAccount Adding user with TenantName as monitoring**\u001e***"}