fluent-plugin-record-modifier
incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string)
Problem
I'm getting the error below while shipping logs to ES via td-agent 1.11.1:
2020-11-01 17:11:42 +0530 [error]: #0 incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string)
2020-11-01 17:11:42 +0530 [error]: #0 suppressed same stacktrace
2020-11-01 17:11:42 +0530 [error]: #0 incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string)
2020-11-01 17:11:42 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/parser_regexp.rb:50:in `match'
2020-11-01 17:11:42 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/parser_regexp.rb:50:in `parse'
2020-11-01 17:11:42 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluent-plugin-grok-parser-2.6.1/lib/fluent/plugin/parser_multiline_grok.rb:21:in `block in parse'
2020-11-01 17:11:42 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluent-plugin-grok-parser-2.6.1/lib/fluent/plugin/parser_multiline_grok.rb:20:in `each'
2020-11-01 17:11:42 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluent-plugin-grok-parser-2.6.1/lib/fluent/plugin/parser_multiline_grok.rb:20:in `parse'
2020-11-01 17:11:42 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:546:in `block in parse_multilines'
2020-11-01 17:11:42 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:544:in `each'
2020-11-01 17:11:42 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:544:in `parse_multilines'
2020-11-01 17:11:42 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:469:in `call'
2020-11-01 17:11:42 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:469:in `receive_lines'
2020-11-01 17:11:42 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:845:in `block in handle_notify'
2020-11-01 17:11:42 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:877:in `with_io'
2020-11-01 17:11:42 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:825:in `handle_notify'
2020-11-01 17:11:42 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:808:in `block in on_notify'
2020-11-01 17:11:42 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:808:in `synchronize'
2020-11-01 17:11:42 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:808:in `on_notify'
2020-11-01 17:11:42 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:653:in `on_notify'
2020-11-01 17:11:42 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:325:in `block in setup_watcher'
2020-11-01 17:11:42 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:596:in `on_timer'
2020-11-01 17:11:42 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/cool.io-1.6.0/lib/cool.io/loop.rb:88:in `run_once'
2020-11-01 17:11:42 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/cool.io-1.6.0/lib/cool.io/loop.rb:88:in `run'
2020-11-01 17:11:42 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin_helper/event_loop.rb:93:in `block in start'
2020-11-01 17:11:42 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2020-11-01 17:11:43 +0530 [error]: #0 incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string)
2020-11-01 17:11:43 +0530 [error]: #0 suppressed same stacktrace
I've added the char_encoding parameter described at https://github.com/repeatedly/fluent-plugin-record-modifier#char_encoding, as recommended in the FAQ at https://docs.fluentd.org/quickstart/faq, but the issue persists.
...
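For reference, the same kind of message can be reproduced in plain Ruby, outside Fluentd. This is only a sketch of the suspected mechanism, not code from fluentd or the grok parser: `utf8_pattern`, `raw_line`, and `try_match` are hypothetical names, and the pattern is illustrative rather than the grok pattern from my config. It matches a UTF-8 regexp against a binary (ASCII-8BIT) string that contains non-ASCII bytes, then retries after tagging the same bytes as UTF-8.

```ruby
# Illustrative pattern (assumption): the non-ASCII escape pins the regexp to UTF-8.
utf8_pattern = /caf\u00e9|error/

# Bytes as they might look when a file is read in binary mode (ASCII-8BIT).
raw_line = "disk error at caf\xC3\xA9".b

# Returns :matched / :no_match, or the exception message if encodings clash.
def try_match(pattern, line)
  pattern.match(line) ? :matched : :no_match
rescue Encoding::CompatibilityError => e
  e.message
end

# Binary string with non-ASCII bytes against a UTF-8 regexp.
result_binary = try_match(utf8_pattern, raw_line)

# Same bytes, only re-tagged as UTF-8 (no transcoding), then matched again.
result_utf8 = try_match(utf8_pattern, raw_line.dup.force_encoding(Encoding::UTF_8))

puts result_binary
puts result_utf8
```

In the stack trace above, the exception is raised from parser_regexp.rb while in_tail.rb is parsing the line, which suggests the failure happens in the `<source>` stage, before the event reaches any `<filter>`.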
Steps to replicate
Provide example config and message
# encoding: utf-8
<source>
  @type tail
  path /var/log/messages
  pos_file /etc/td-agent/new_var_log_msg_grok.log.pos
  #time_format %Y-%m-%dT%H:%M:%S.%NZ
  time_format %b %dT%H:%M:%SZ
  tag var.msg
  <parse>
    @type multiline_grok
    <grok>
      pattern %{SYSLOGTIMESTAMP:time}%{SPACE}%{HOSTNAME:hostname}%{SPACE}%{GREEDYDATA:service_name}:%{GREEDYDATA:log_message}
    </grok>
  </parse>
</source>
<filter var.msg>
  @type record_modifier
  <record>
    hostname "#{Socket.gethostname}"
    formatted_time ${Time.at(time).iso8601(3)}
    char_encoding utf-8
    char_encoding utf-8:euc-jp
  </record>
</filter>
<match var.msg>
  @type elasticsearch
  # type_name "_doc"
  hosts redacted:9200
  scheme "https"
  ssl_version TLSv1_2
  ssl_verify false
  ca_file "/etc/td-agent/cert.crt"
  user redacted
  password redacted
  reload_connections false
  reconnect_on_error true
  reload_on_failure true
  log_es_400_reason false
  logstash_prefix messages_logs
  logstash_format true
  logstash_dateformat %V
  index_name "messages_logs"
  type_name "fluentd"
  include_timestamp true
  <buffer>
    @type file
    path /etc/td-agent/messages/buffers
    chunk_limit_size 1M
    flush_interval 5s
    retry_forever false
    retry_max_times 3
    retry_wait 10
    retry_max_interval 300
    flush_thread_count 8
  </buffer>
</match>
Expected Behavior or What you need to ask
The same config works fine on most servers, even without the char_encoding parameter. td-agent of the same version should behave the same way across servers with the same configuration, and the error should go away after adding the encoding parameter. ...
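One possible explanation for the per-server difference, sketched in plain Ruby. It rests on an unverified assumption: that only the affected servers have non-ASCII bytes in /var/log/messages. The names `sample_pattern`, `match_outcome`, and both sample lines are hypothetical. The sketch shows that a UTF-8 regexp accepts a binary string as long as every byte is ASCII, and only raises once a non-ASCII byte appears.

```ruby
# Illustrative fixed-encoding UTF-8 regexp (assumption, not the grok pattern).
sample_pattern = /\u00e9|host/

# Reports how a match attempt ends for the given bytes.
def match_outcome(pattern, bytes)
  pattern.match(bytes) ? "matched" : "no match"
rescue Encoding::CompatibilityError
  "encoding error"
end

# Both lines are binary (ASCII-8BIT); only the second has non-ASCII bytes.
ascii_only     = "Nov  1 17:11:42 host sshd: session opened".b
with_high_byte = "Nov  1 17:11:42 host app: caf\xC3\xA9 opened".b

outcome_ascii = match_outcome(sample_pattern, ascii_only)
outcome_high  = match_outcome(sample_pattern, with_high_byte)

puts outcome_ascii
puts outcome_high
```

If that assumption holds, identical td-agent versions and configs could still behave differently depending on what each server writes to its log.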
Using Fluentd and ES plugin versions
- OS version: Red Hat Enterprise Linux Server release 7.9 (Maipo)
- Fluentd v0.12 or v0.14/v1.0
  td-agent --version
  td-agent 1.11.1
- ES plugin 3.x.y/2.x.y or 1.x.y
  - paste result of fluent-gem list, td-agent-gem list or your Gemfile.lock
  - paste result of td-agent-gem list
*** LOCAL GEMS ***
addressable (2.7.0)
async (1.26.2)
async-http (0.52.4)
async-io (1.30.0)
async-pool (0.3.2)
aws-eventstream (1.1.0)
aws-partitions (1.337.0)
aws-sdk-core (3.102.1)
aws-sdk-kms (1.35.0)
aws-sdk-s3 (1.72.0)
aws-sdk-sqs (1.29.0)
aws-sigv4 (1.2.1)
benchmark (default: 0.1.0)
bigdecimal (default: 2.0.0)
bundler (2.1.4)
cgi (default: 0.1.0)
concurrent-ruby (1.1.6)
console (1.8.2)
cool.io (1.6.0)
csv (default: 3.1.2)
date (default: 3.0.0)
delegate (default: 0.1.0)
did_you_mean (default: 1.4.0)
digest-crc (0.6.1)
elasticsearch (7.8.0)
elasticsearch-api (7.8.0)
elasticsearch-transport (7.8.0)
elasticsearch-xpack (7.9.0)
etc (default: 1.1.0)
excon (0.75.0)
faraday (1.0.1)
fcntl (default: 1.0.0)
ffi (1.13.1)
fiddle (default: 1.0.0)
fileutils (default: 1.4.1)
fluent-config-regexp-type (1.0.0)
fluent-logger (0.8.2)
fluent-plugin-concat (2.4.0)
fluent-plugin-elasticsearch (4.1.1, 4.0.9)
fluent-plugin-grok-parser (2.6.1)
fluent-plugin-kafka (0.13.0)
fluent-plugin-prometheus (1.8.0)
fluent-plugin-prometheus_pushgateway (0.0.2)
fluent-plugin-record-modifier (2.1.0)
fluent-plugin-rewrite-tag-filter (2.3.0)
fluent-plugin-s3 (1.3.3)
fluent-plugin-systemd (1.0.2)
fluent-plugin-td (1.1.0)
fluent-plugin-td-monitoring (1.0.0)
fluent-plugin-webhdfs (1.2.5)
fluentd (1.11.1)
forwardable (default: 1.3.1)
getoptlong (default: 0.1.0)
hirb (0.7.3)
http_parser.rb (0.6.0)
httpclient (2.8.2.4)
io-console (default: 0.5.6)
ipaddr (default: 1.2.2)
ipaddress (0.8.3)
irb (default: 1.2.3)
jmespath (1.4.0)
json (default: 2.3.0)
logger (default: 1.4.2)
ltsv (0.1.2)
matrix (default: 0.2.0)
mini_portile2 (2.5.0)
minitest (5.13.0)
mixlib-cli (1.7.0)
mixlib-config (2.2.3)
mixlib-log (1.7.1)
mixlib-shellout (2.2.7)
msgpack (1.3.3)
multi_json (1.14.1)
multipart-post (2.1.1)
mutex_m (default: 0.1.0)
net-pop (default: 0.1.0)
net-smtp (default: 0.1.0)
net-telnet (0.2.0)
nio4r (2.5.2)
nokogiri (1.11.0.rc2)
observer (default: 0.1.0)
ohai (6.20.0)
oj (3.10.6)
open3 (default: 0.1.0)
openssl (default: 2.1.2)
ostruct (default: 0.2.0)
parallel (1.19.2)
power_assert (1.1.7)
prime (default: 0.1.1)
prometheus-client (0.9.0)
protocol-hpack (1.4.2)
protocol-http (0.20.0)
protocol-http1 (0.13.0)
protocol-http2 (0.14.0)
pstore (default: 0.1.0)
psych (default: 3.1.0)
public_suffix (4.0.5)
quantile (0.2.1)
racc (default: 1.4.16)
rake (13.0.1)
rdkafka (0.8.0)
rdoc (default: 6.2.1)
readline (default: 0.0.2)
readline-ext (default: 0.1.0)
reline (default: 0.1.3)
rexml (default: 3.2.3)
rss (default: 0.2.8)
ruby-kafka (1.1.0)
ruby-progressbar (1.10.1)
rubyzip (1.3.0)
sdbm (default: 1.0.0)
serverengine (2.2.1)
sigdump (0.2.4)
singleton (default: 0.1.0)
stringio (default: 0.1.0)
strptime (0.2.4)
strscan (default: 1.0.3)
systemd-journal (1.3.3)
systemu (2.5.2)
td (0.16.9)
td-client (1.0.7)
td-logger (0.3.27)
test-unit (3.3.4)
timeout (default: 0.1.0)
timers (4.3.0)
tracer (default: 0.1.0)
tzinfo (2.0.2)
tzinfo-data (1.2020.1)
uri (default: 0.10.0)
webhdfs (0.9.0)
webrick (default: 1.6.0)
xmlrpc (0.3.0)
yajl-ruby (1.4.1)
yaml (default: 0.1.0)
zip-zip (0.3)
zlib (default: 1.1.0)
- ES version (optional): 7.5.1