[Bug] KQL fails to parse brackets and wildcards correctly

Open saiiman opened this issue 1 year ago • 0 comments

Hello,

Describe the bug

When converting a KQL query to a DSL query with the converter in lib/kql, I noticed two conversion errors.

  1. A wildcard is converted to a query_string query instead of a wildcard query. (I'm not sure whether this is intentional, but the query_string form did not work for me.)
  2. Brackets (parentheses) used to group conditions in the KQL query are converted incorrectly.

To Reproduce

query = """
    host.name: \"foo\" 
    and 
    source.ip: \"10.10.0.10.\" 
    and not 
    user.name : bar* 
    and not (
        destination.name : \"some_name\" 
        and 
        destination.ip : \"20.20.0.20\"
    ) 
    and not 
    another.value : \"true\"
"""

print(KqlParser.to_dsl(query))

The current output is

{
  "bool": {
    "filter": [
      {
        "match": {
          "host.name": "foo"
        }
      },
      {
        "match": {
          "source.ip": "10.10.0.10."
        }
      }
    ],
    "must_not": [
      {
        "query_string": {
          "fields": [
            "user.name"
          ],
          "query": "bar*"
        }
      },
      {
        "match": {
          "destination.name": "some_name"
        }
      },
      {
        "match": {
          "destination.ip": "20.20.0.20"
        }
      },
      {
        "match": {
          "another.value": "true"
        }
      }
    ]
  }
}

Expected behavior

The expected output is

{
  "bool": {
    "filter": [
      {
        "match": {
          "host.name": "foo"
        }
      },
      {
        "match": {
          "source.ip": "10.10.0.10."
        }
      }
    ],
    "must_not": [
      {
        "wildcard": {             # use wildcard keyword
          "user.name": {
            "value": "bar*"
          }
        }
      },
      {
        "bool": {                 # use nested bool-filter keywords
          "filter": [
            {
              "match": {
                "destination.name": "some_name"
              }
            },
            {
              "match": {
                "destination.ip": "20.20.0.20"
              }
            }
          ]
        }
      },
      {
        "match": {
          "another.value": "true"
        }
      }
    ]
  }
}
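To illustrate why the two outputs above are not equivalent, here is a minimal sketch. It assumes a toy evaluator (the `matches` function is hypothetical, not part of lib/kql or Elasticsearch) that treats `match` as plain equality on a flat dict. Flattening the bracketed group turns `not (A and B)` into `not A and not B`, which rejects documents that the original query accepts.

```python
def matches(clause, doc):
    """Evaluate a tiny subset of the query DSL against a flat dict (toy model)."""
    (kind, body), = clause.items()
    if kind == "match":
        (field, value), = body.items()
        return doc.get(field) == value
    if kind == "bool":
        required = body.get("filter", []) + body.get("must", [])
        excluded = body.get("must_not", [])
        return (all(matches(c, doc) for c in required)
                and not any(matches(c, doc) for c in excluded))
    raise ValueError(f"unsupported clause: {kind}")


name = {"match": {"destination.name": "some_name"}}
ip = {"match": {"destination.ip": "20.20.0.20"}}

# Shape of the current output: the group is flattened into must_not.
flattened = {"bool": {"must_not": [name, ip]}}
# Shape of the expected output: the group stays a nested bool.
nested = {"bool": {"must_not": [{"bool": {"filter": [name, ip]}}]}}

# Only one of the two grouped conditions holds for this document.
doc = {"destination.name": "some_name", "destination.ip": "30.30.0.30"}

flat_result = matches(flattened, doc)    # False: "not A" already fails
nested_result = matches(nested, doc)     # True: "A and B" is false, so its negation holds
print(flat_result, nested_result)
```

The document satisfies only one of the two grouped conditions, so `not (A and B)` should keep it, yet the flattened form drops it.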

Suggested solution

  1. For the wildcard keyword, the following line should be changed as shown below. https://github.com/elastic/detection-rules/blob/0cb42983c1b1788361e69fcaeede80c6c160b5a0/lib/kql/kql/dsl.py#L76
return lambda field: {"wildcard": {field: {"value": tree.value}}}
  2. For the missing brackets, I traced the transformation back to the following line. I would suggest additionally checking the number of elements in the filter. https://github.com/elastic/detection-rules/blob/0cb42983c1b1788361e69fcaeede80c6c160b5a0/lib/kql/kql/dsl.py#L51
if list(child) == ["bool"] and list(child["bool"]) in (["filter"], ["must"]) and \
            all(len(clauses) == 1 for clauses in child["bool"].values()):
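The two suggestions above can be sketched as standalone helpers. This is only an illustration under assumptions: `wildcard_clause` and `can_flatten` are hypothetical names, not functions in lib/kql, and `child` is assumed to be a plain dict shaped like the DSL output shown earlier. The check allows flattening a nested bool only when it holds exactly one clause.

```python
def wildcard_clause(field, pattern):
    """Build a term-level wildcard clause for one field."""
    return {"wildcard": {field: {"value": pattern}}}


def can_flatten(child):
    """Return True only for {"bool": {"filter" or "must": [single_clause]}}."""
    if list(child) != ["bool"]:
        return False
    inner = child["bool"]
    if list(inner) not in (["filter"], ["must"]):
        return False
    # Exactly one key is present here, so unpack its clause list.
    (clauses,) = inner.values()
    return len(clauses) == 1


single = {"bool": {"filter": [{"match": {"destination.ip": "20.20.0.20"}}]}}
group = {"bool": {"filter": [
    {"match": {"destination.name": "some_name"}},
    {"match": {"destination.ip": "20.20.0.20"}},
]}}

print(wildcard_clause("user.name", "bar*"))
print(can_flatten(single), can_flatten(group))
```

Looking up the single present key avoids calling `len()` on a missing `filter` or `must` entry, which would otherwise need a default value of the right type.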

Thank you.

saiiman, Apr 07 '24 07:04