logstash-filter-grok

FEATURE : Recursive pattern for Grok

Open jordansissel opened this issue 10 years ago • 7 comments

(This issue was originally filed by @M0dM at https://github.com/elastic/logstash/issues/1934)


Hi,

I couldn't get recursion to work inside custom grok patterns. I think this would be an awesome feature.

Benoit

Description :

Grok pattern matching the two following lines :

2014-07-11 18:26:21,335 - INFO  - 1712933>-<>-<text1>-<>-<text2

2014-07-11 18:26:21,335 - INFO  - 1712933>-<>-<text1>-<>-<text2>-<>-<text3

I want to match both of the lines and extract data like this :

%{CUSTOM_DATE}[\s-]*%{LOGLEVEL}[\s-]*%{POSINT}%{AMA_VALUES_LIST_DATA}

CUSTOM_DATE %{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{HOUR}:%{MINUTE}:%{SECOND}
CUSTOM_VALUE ((?!>-<>-<).)*
CUSTOM_LIST_VALUE >-<>-<%{CUSTOM_VALUE}
CUSTOM_VALUES_LIST_COMPLEX %{CUSTOM_LIST_VALUE}%{CUSTOM_VALUES_LIST_COMPLEX} | %{CUSTOM_LIST_VALUE}

What I would like to get :

    {
      "CUSTOM_DATE": [
        [
          "2014-07-11 18:26:21,335"
        ]
      ],
      "YEAR": [
        [
          "2014"
        ]
      ],
      "MONTHNUM": [
        [
          "07"
        ]
      ],
      "MONTHDAY": [
        [
          "11"
        ]
      ],
      "HOUR": [
        [
          "18"
        ]
      ],
      "MINUTE": [
        [
          "26"
        ]
      ],
      "SECOND": [
        [
          "21,335"
        ]
      ],
      "LOGLEVEL": [
        [
          "INFO"
        ]
      ],
      "POSINT": [
        [
          "1712933"
        ]
      ],
      "CUSTOM_LIST_COMPLEX": [
        [
          ">-<>-<text1>-<>-<text2>-<>-<text3"
        ]
      ],
      "CUSTOM_LIST_VALUE": [
        [
          ">-<>-<text1",
          ">-<>-<text2",
          ">-<>-<text3"
        ]
      ],
      "CUSTOM_VALUE": [
        [
          "text1",
          "text2",
          "text3"
        ]
      ]
    }
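
For comparison, the same extraction can be sketched outside grok without any recursion, by matching the whole list with a repeated group and then splitting it. This is a minimal Python sketch, not a Logstash feature: the names `LINE_RE`, `SEP`, and `parse` are hypothetical, and the header regex is a simplified stand-in for %{CUSTOM_DATE}, %{LOGLEVEL}, and %{POSINT}.

```python
import re

# Separator used between list items in the sample log lines.
SEP = ">-<>-<"

# Simplified stand-ins for CUSTOM_DATE, LOGLEVEL and POSINT, followed by
# one or more ">-<>-<value" items (a quantifier instead of recursion).
LINE_RE = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d+)[\s-]*"
    r"(?P<level>[A-Z]+)[\s-]*"
    r"(?P<id>\d+)"
    r"(?P<list>(?:>-<>-<(?:(?!>-<>-<).)*)+)$"
)


def parse(line):
    """Return a dict of fields, or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    # The list starts with a separator, so the first split piece is empty.
    fields["values"] = fields["list"].split(SEP)[1:]
    return fields


if __name__ == "__main__":
    sample = "2014-07-11 18:26:21,335 - INFO  - 1712933>-<>-<text1>-<>-<text2>-<>-<text3"
    print(parse(sample)["values"])
```

The regex alone only captures the list as one string; the array comes from the second step (the split). That two-step shape is the part that grok's single-pass capture does not provide.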

jordansissel, May 18 '15 04:05

I've just spotted a mistake in my original post:

Please replace %{AMA_VALUES_LIST_DATA} with %{CUSTOM_VALUES_LIST_COMPLEX} in the example above.

M0dM

M0dM, May 20 '15 16:05

+1

naisanza, Jul 16 '15 19:07

#50 is a duplicate

jordansissel, Aug 07 '15 21:08

Would love to get this feature. I'm trying to parse a log containing specific ORA IDs, where each message can contain zero or more of them, for example ORA-0001 and, somewhere else in the log, ORA-0203. Being able to use one regex to match them all and add them to a single field would be very valuable.
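
To make the desired behaviour concrete, here is a minimal Python sketch of "match every occurrence and collect them into one field". It runs outside Logstash; `ORA_RE` and `ora_ids` are hypothetical names, and the `ORA-` followed by digits format is assumed from the two examples above.

```python
import re

# Assumed ID format: "ORA-" followed by one or more digits.
ORA_RE = re.compile(r"\bORA-\d+\b")


def ora_ids(message):
    """Return every ORA ID found in the message, in order of appearance."""
    return ORA_RE.findall(message)


if __name__ == "__main__":
    print(ora_ids("insert failed: ORA-0001 ... rollback failed: ORA-0203"))
```

A message with no IDs yields an empty list, which covers the "zero" case.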

elvarb, Feb 12 '17 21:02

+1

santiagovm, Nov 22 '17 04:11

Hi. I was doing something similar. My log looks something like this: fab 20 gds 30 rt 21. I want to create two arrays: one containing the names {fab,gds,rt} and the other containing their respective values {20,30,21}. I followed your approach above; this is how I wrote my rules:

CUSTOM_VALUE (?:%{NUMBER})
CUSTOM_LIST_VALUE (?:(\s*%{WORD}[\s*]%{CUSTOM_VALUE}))
CUSTOM_VALUE_LIST_COMPLEX (?:(%{CUSTOM_LIST_VALUE})+)

I am matching %{CUSTOM_VALUE_LIST_COMPLEX:category}, and I get the whole string fab 20 gds 30 rt 21 under CUSTOM_VALUE_LIST_COMPLEX. My question is: how do I get those values as arrays, as described above, from CUSTOM_VALUE_LIST_COMPLEX?
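
The two arrays described here can be sketched in Python by scanning for every name/value pair rather than capturing the repeated group once. This is only an illustration outside Logstash: `PAIR_RE` and `split_pairs` are hypothetical names, and the pair format (a word, whitespace, then a number) is assumed from the sample line.

```python
import re

# One pair: a word, some whitespace, then an integer or decimal number.
PAIR_RE = re.compile(r"(\w+)\s+(\d+(?:\.\d+)?)")


def split_pairs(text):
    """Return (names, values) as two parallel lists of strings."""
    pairs = PAIR_RE.findall(text)
    names = [name for name, _ in pairs]
    values = [value for _, value in pairs]
    return names, values


if __name__ == "__main__":
    names, values = split_pairs("fab 20 gds 30 rt 21")
    print(names, values)
```

A repeated capture group keeps only one capture per match, which is why the pattern above returns the whole string; scanning pair by pair sidesteps that. Inside Logstash the same idea would probably need a scripting step after grok, which is not shown here.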

shivom-25, May 29 '18 11:05

Any update on this @jordansissel? I have a use case for this where multiple values are separated by something like %{DATE}\t%{UUID}\t and defining a recursive pattern would let me parse multiple of these to get an array

pranaygp, Feb 13 '20 05:02