Regular Expression matching

Open maederm opened this issue 3 years ago • 5 comments

Hi

How does sigma expect regex to be applied to fields? Does the regex need to apply to the whole field? I couldn't find a definition in the spec.

Take for example rules/windows/process_creation/win_regini.yml

        CommandLine|re: ':[^ \\]' # to avoid intersection with ADS rule

If I translate that with sigmac, I get a query string that requires a full match on the field.

$ tools/sigmac  rules/windows/process_creation/win_regini_ads.yml -c winlogbeat-modules-enabled -t es-qs
(process.executable.keyword:*\\regini.exe AND process.command_line.keyword:/:[^ \\]/)
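To make the difference concrete, here is a minimal Python sketch of partial versus full matching using the regex from the rule. Python's `re` module only stands in for the two semantics; it is not the Elasticsearch regex engine, and the command line is a made-up value for illustration.

```python
import re

# Regex from the rule: a colon followed by any character
# other than a space or a backslash.
RULE_REGEX = r":[^ \\]"


def partial_match(pattern: str, value: str) -> bool:
    """True if the pattern matches anywhere inside the value."""
    return re.search(pattern, value) is not None


def full_match(pattern: str, value: str) -> bool:
    """True only if the pattern matches the whole value."""
    return re.fullmatch(pattern, value) is not None


# Hypothetical command line, invented for this example.
command_line = r"regini.exe C:\temp\settings.ini:hidden"

print(partial_match(RULE_REGEX, command_line))             # True
print(full_match(RULE_REGEX, command_line))                # False
print(full_match(".*" + RULE_REGEX + ".*", command_line))  # True
```

Under full-match semantics the rule as written cannot match a realistic command line, because the pattern itself only describes two characters.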

I propose to define that behavior in the sigma specification and thought of these two possibilities:

Solution A: Sigma Spec defines partial match

If only a partial match is required I can try to make a pull request that would translate it to (process.executable.keyword:*\\regini.exe AND process.command_line.keyword:/.*:[^ \\].*/)
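A rough sketch of what such a translation could look like, in Python. The helper names `wrap_for_full_match` and `es_qs_regex_term` are hypothetical and not part of sigmac. The sketch groups the pattern before adding `.*`, which differs slightly from the query above, so that a top-level alternation such as `a|b` is wrapped as a whole. It does not escape `/` inside the pattern.

```python
import re


def wrap_for_full_match(pattern: str) -> str:
    """Hypothetical helper: make a partial-match regex usable on an
    engine that always matches against the whole field value."""
    # Group the pattern so that '.*' applies to the whole expression,
    # not only to the first and last alternative of 'a|b'.
    return ".*(" + pattern + ").*"


def es_qs_regex_term(field: str, pattern: str) -> str:
    """Hypothetical helper: render one regex term of a query string."""
    return field + ":/" + wrap_for_full_match(pattern) + "/"


print(es_qs_regex_term("process.command_line.keyword", r":[^ \\]"))
# process.command_line.keyword:/.*(:[^ \\]).*/

# Why the grouping matters, checked with Python's re as a stand-in:
print(re.fullmatch(".*a|b.*", "xxaxx") is not None)                   # False
print(re.fullmatch(wrap_for_full_match("a|b"), "xxaxx") is not None)  # True
```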

Solution B: Sigma Spec defines full match

If a full field match is required I could make a pull request to rewrite the rule to

        CommandLine|re: '.*:[^ \\].*' # to avoid intersection with ADS rule

Best Regards, maederm

maederm, Jun 17 '21 14:06

Hi, the re modifier only checks that the value is a valid regex and passes it to the backend. Not every backend can handle regex, and some have their own way of dealing with it.

One way would be to have two modifiers:

  • |re (no change)
  • |re_in (the backend adds .* or whatever is needed to make it work)

frack113, Jul 03 '21 07:07

A way can be to have 2 modifiers:

* |re       (no change)

* |re_in   (the backend adds `.*` or whatever is needed to make it work)

@frack113: If I understand you correctly this means the spec should be updated to say that |re must match the whole field, right?

maederm, Jul 05 '21 07:07

Currently it is the backend that manages the regex. The es-qs output is a full match because Elastic regex queries are full-match. Tested in Kibana:

  • Event.Image:/.*\.exe/ OK
  • Event.Image:/.*\.exe$/ NOK
  • Event.Image:/\.exe/ NOK

My proposal is to clarify this point, so that the rule author specifies whether the regex is a full or a partial match, while the backend still has to handle it. In my mind re_in is like contains, so perhaps re_contains would be a better name.
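As a rough illustration of this idea, a backend could dispatch on the modifier and on whether its engine matches the whole field implicitly. Everything in this Python sketch is hypothetical: the `Backend` class, the `anchored` flag, the `render_regex` function and the backend name "unanchored-example" are invented for the example and exist neither in sigmac nor in the specification.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """Hypothetical description of a query backend."""
    name: str
    anchored: bool  # True if the engine always matches the whole field


def render_regex(modifier: str, pattern: str, backend: Backend) -> str:
    """Return the regex a backend would emit for the given modifier."""
    if modifier == "re":
        # No change: the pattern is handed to the backend as written.
        return pattern
    if modifier == "re_contains":
        # Partial match requested: only anchored engines need padding.
        if backend.anchored:
            return ".*(" + pattern + ").*"
        return pattern
    raise ValueError("unknown modifier: " + modifier)


es_qs = Backend(name="es-qs", anchored=True)
other = Backend(name="unanchored-example", anchored=False)

print(render_regex("re", r"\.exe", es_qs))           # \.exe
print(render_regex("re_contains", r"\.exe", es_qs))  # .*(\.exe).*
print(render_regex("re_contains", r"\.exe", other))  # \.exe
```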

frack113, Jul 05 '21 08:07

As field matches are always full matches on the whole value, this should be the same for regular expressions to maintain consistency.

thomaspatzke, Dec 20 '22 22:12

@thomaspatzke I'm not sure it's officially in the specification, but I disagree with your comment. Full-matching regexes can have important performance implications for SIEMs:

This is discussed here: https://www.loggly.com/blog/five-invaluable-techniques-to-improve-regex-performance/

At my org, leading and trailing .*s are only used in use cases when absolutely necessary, as a bad regex that's run on 20k events per second can have a very negative performance impact!

Res260, Sep 16 '23 01:09