logstash-filter-grok
Strange behavior of grok pattern
(This issue was originally filed by @fervid at https://github.com/elastic/logstash/issues/2368)
Hello.
I wrote a pattern to parse the following ISO8601 timestamp: 2015-01-15 06:33:09 +0000. I am going to use (?x) mode, which is why I use an explicit SPACE.
pattern file:
SPACE \s
#FIRST CASE
#ISO8601_TIMESTAMP %{ISO8601_DATE} (?: [tT] | %{SPACE}) %{ISO8601_TIME} %{SPACE} (?: %{ISO8601_TZD_CODE:start_tzd_code} | %{ISO8601_TZD_SIGN:sign} %{HOUR:start_tzd_hour} (?: :? %{MINUTE:start_tzd_minute})?)
#SECOND CASE
#ISO8601_TIMESTAMP %{ISO8601_DATE} (?: [tT] | %{SPACE}) %{ISO8601_TIME} %{SPACE} %{ISO8601_TIMEZONE}
ISO8601_DATE %{YEAR:start_year} \- %{MONTHNUM:start_month} \- %{MONTHDAY:start_day}
ISO8601_TIME %{HOUR:start_hour} :? %{MINUTE:start_minute} (?: :? %{SECOND:start_second})?
ISO8601_TZD_SIGN [+-]
ISO8601_TZD_CODE [zZ]
ISO8601_TIMEZONE (?: %{ISO8601_TZD_CODE:start_tzd_code} | %{ISO8601_TZD_SIGN:sign} %{HOUR:start_tzd_hour} (?: :? %{MINUTE:start_tzd_minute})?)
I have two cases, each marked by a comment in the pattern file. These two cases behave differently! FIRST CASE:
/opt/logstash-1.4.2/bin/logstash -e 'input {stdin {}} filter{ grok { match => [ "message", "(?x)%{ISO8601_TIMESTAMP}" ] }} output { stdout { codec => rubydebug }}'
2015-01-15 06:33:09 +0000
{
"message" => "2015-01-15 06:33:09 +0000",
"@version" => "1",
"@timestamp" => "2015-01-18T12:16:01.108Z",
"host" => "alerts-db",
"start_year" => "2015",
"start_month" => "01",
"start_day" => "15",
"start_hour" => "06",
"start_minute" => "33",
"start_second" => "09",
"sign" => "+",
"start_tzd_hour" => "00",
"start_tzd_minute" => "00"
}
SECOND CASE:
/opt/logstash-1.4.2/bin/logstash -e 'input {stdin {}} filter{ grok { match => [ "message", "(?x)%{ISO8601_TIMESTAMP}" ] }} output { stdout { codec => rubydebug }}'
2015-01-15 06:33:09 +0000
{
"message" => "2015-01-15 06:33:09 +0000",
"@version" => "1",
"@timestamp" => "2015-01-18T12:17:09.488Z",
"host" => "alerts-db",
"start_year" => "2015",
"start_month" => "01",
"start_day" => "15",
"start_hour" => "06",
"start_minute" => "33",
"start_second" => "09"
}
Why, in the second case, do I not get the following fields?
"sign" => "+",
"start_tzd_hour" => "00",
"start_tzd_minute" => "00"
Where is my mistake?
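As a sanity check, the expression can be expanded by hand outside Logstash. The following is only a rough sketch in Python, not the way grok itself works: it assumes that YEAR, MONTHNUM, MONTHDAY, HOUR, MINUTE and SECOND can be approximated by plain digit runs (the stock definitions are stricter), and it relies on Python's re.VERBOSE flag as a stand-in for (?x). The names ISO8601_TIMESTAMP_RE and parse are hypothetical helpers for this illustration. Written out this way, both cases are the same regular expression, and it does capture the timezone fields:

```python
import re

# Hand-expanded form of ISO8601_TIMESTAMP from the pattern file.
# Assumption: \d{4} and \d{2} are simplified stand-ins for the stock
# YEAR, MONTHNUM, MONTHDAY, HOUR, MINUTE and SECOND grok patterns.
ISO8601_TIMESTAMP_RE = re.compile(
    r"""
    (?P<start_year>\d{4}) - (?P<start_month>\d{2}) - (?P<start_day>\d{2})
    (?: [tT] | \s )
    (?P<start_hour>\d{2}) :? (?P<start_minute>\d{2})
    (?: :? (?P<start_second>\d{2}) )?
    \s
    (?: (?P<start_tzd_code>[zZ])
      | (?P<sign>[+-]) (?P<start_tzd_hour>\d{2})
        (?: :? (?P<start_tzd_minute>\d{2}) )?
    )
    """,
    re.VERBOSE,  # whitespace in the pattern is ignored, as with (?x)
)


def parse(line):
    """Return the named captures that matched, or None if nothing matched."""
    m = ISO8601_TIMESTAMP_RE.match(line)
    if m is None:
        return None
    # Drop groups that did not participate in the match.
    return {k: v for k, v in m.groupdict().items() if v is not None}


fields = parse("2015-01-15 06:33:09 +0000")
print(fields)
```

Since the expanded expression is identical for both cases, the difference seen in Logstash presumably comes from how %{ISO8601_TIMEZONE} is resolved, not from the regular expression itself.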