logstash-filter-grok

Strange behavior of grok pattern

Open jordansissel opened this issue 10 years ago • 0 comments

(This issue was originally filed by @fervid at https://github.com/elastic/logstash/issues/2368)


Hello.

I wrote a pattern to parse the following ISO8601 timestamp: 2015-01-15 06:33:09 +0000. I am going to use (?x) mode, which ignores literal whitespace in the pattern; that's why I use an explicit SPACE.
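For readers unfamiliar with (?x), here is a minimal sketch of why the explicit SPACE is needed. It uses Python's `re` module as a stand-in, which is an assumption: grok uses Oniguruma, not Python's engine, and the `\d{...}` pieces are simplified placeholders for the stock grok patterns. Both engines ignore unescaped whitespace in extended mode.

```python
import re

# The sample timestamp from this issue, without the timezone part.
SAMPLE = "2015-01-15 06:33:09"

# In extended mode the literal space between date and time is dropped
# from the pattern, so the space in the input has nothing to match it.
literal_space = re.compile(r"(?x) \d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")

# An explicit \s (what the SPACE pattern expands to) matches the space.
explicit_space = re.compile(r"(?x) \d{4}-\d{2}-\d{2} \s \d{2}:\d{2}:\d{2}")

literal_matches = literal_space.match(SAMPLE) is not None
explicit_matches = explicit_space.match(SAMPLE) is not None
print(literal_matches, explicit_matches)
```

Only the variant with `\s` matches the sample.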

pattern file:

SPACE \s
#FIRST CASE
#ISO8601_TIMESTAMP   %{ISO8601_DATE}  (?: [tT] | %{SPACE})   %{ISO8601_TIME}  %{SPACE}  (?: %{ISO8601_TZD_CODE:start_tzd_code} | %{ISO8601_TZD_SIGN:sign} %{HOUR:start_tzd_hour} (?: :? %{MINUTE:start_tzd_minute})?)
#SECOND CASE
#ISO8601_TIMESTAMP   %{ISO8601_DATE}  (?: [tT] | %{SPACE})   %{ISO8601_TIME}  %{SPACE}  %{ISO8601_TIMEZONE}
ISO8601_DATE        %{YEAR:start_year} \- %{MONTHNUM:start_month} \- %{MONTHDAY:start_day}
ISO8601_TIME        %{HOUR:start_hour}  :?  %{MINUTE:start_minute}  (?: :? %{SECOND:start_second})?
ISO8601_TZD_SIGN    [+-]
ISO8601_TZD_CODE    [zZ]
ISO8601_TIMEZONE    (?: %{ISO8601_TZD_CODE:start_tzd_code} | %{ISO8601_TZD_SIGN:sign} %{HOUR:start_tzd_hour} (?: :? %{MINUTE:start_tzd_minute})?)

I have two cases, each marked by a comment in the pattern file above; I enable one ISO8601_TIMESTAMP definition at a time. The two cases behave differently. FIRST CASE:

/opt/logstash-1.4.2/bin/logstash -e 'input {stdin {}} filter{ grok { match => [ "message", "(?x)%{ISO8601_TIMESTAMP}" ] }} output { stdout { codec => rubydebug }}'
2015-01-15 06:33:09 +0000
{
             "message" => "2015-01-15 06:33:09 +0000",
            "@version" => "1",
          "@timestamp" => "2015-01-18T12:16:01.108Z",
                "host" => "alerts-db",
          "start_year" => "2015",
         "start_month" => "01",
           "start_day" => "15",
          "start_hour" => "06",
        "start_minute" => "33",
        "start_second" => "09",
                "sign" => "+",
      "start_tzd_hour" => "00",
    "start_tzd_minute" => "00"
}

SECOND CASE:

/opt/logstash-1.4.2/bin/logstash -e 'input {stdin {}} filter{ grok { match => [ "message", "(?x)%{ISO8601_TIMESTAMP}" ] }} output { stdout { codec => rubydebug }}'
2015-01-15 06:33:09 +0000
{
         "message" => "2015-01-15 06:33:09 +0000",
        "@version" => "1",
      "@timestamp" => "2015-01-18T12:17:09.488Z",
            "host" => "alerts-db",
      "start_year" => "2015",
     "start_month" => "01",
       "start_day" => "15",
      "start_hour" => "06",
    "start_minute" => "33",
    "start_second" => "09"
}

Why don't I get the following fields in the second case?

                "sign" => "+",
      "start_tzd_hour" => "00",
    "start_tzd_minute" => "00"

Where is my mistake?
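As a cross-check outside grok, the SECOND CASE layout (timezone factored into its own sub-pattern) can be expanded by hand and run through a plain regex engine. This is only a sketch under assumptions: it uses Python's `re` rather than Oniguruma, the `YEAR`, `HOUR`, etc. definitions are simplified stand-ins for the stock grok patterns, and the hand expansion does not reproduce how grok itself expands `%{...}` references. It shows which captures the expression is expected to produce; it does not explain the grok result above.

```python
import re

# Simplified stand-ins for the stock grok patterns (assumptions, not the
# definitions shipped with Logstash).
YEAR, MONTHNUM, MONTHDAY = r"\d{4}", r"\d{2}", r"\d{2}"
HOUR, MINUTE, SECOND = r"\d{2}", r"\d{2}", r"\d{2}"

# Hand expansion of the custom patterns from the issue, keeping their names.
ISO8601_DATE = (
    rf"(?P<start_year>{YEAR}) - (?P<start_month>{MONTHNUM}) - (?P<start_day>{MONTHDAY})"
)
ISO8601_TIME = (
    rf"(?P<start_hour>{HOUR}) :? (?P<start_minute>{MINUTE}) "
    rf"(?: :? (?P<start_second>{SECOND}))?"
)
ISO8601_TIMEZONE = (
    rf"(?: (?P<start_tzd_code>[zZ]) "
    rf"| (?P<sign>[+-]) (?P<start_tzd_hour>{HOUR}) (?: :? (?P<start_tzd_minute>{MINUTE}))?)"
)

# SECOND CASE: the timezone is referenced as a nested sub-pattern.
ISO8601_TIMESTAMP = (
    rf"(?x) {ISO8601_DATE} (?: [tT] | \s) {ISO8601_TIME} \s (?:{ISO8601_TIMEZONE})"
)

m = re.match(ISO8601_TIMESTAMP, "2015-01-15 06:33:09 +0000")
# Keep only the groups that actually participated in the match.
fields = {k: v for k, v in m.groupdict().items() if v is not None}
print(fields)
```

In this plain-regex expansion, `sign`, `start_tzd_hour` and `start_tzd_minute` are all captured, which matches the output of the FIRST CASE.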

jordansissel, May 18 '15 06:05