liblognorm

CEF parser truncates first character from first extension name

Open klasen opened this issue 1 year ago • 2 comments

If there is no space between the last header delimiter | and the first extension key name, liblognorm's CEF parser swallows the first character of the key.

input (sample from ArcSight Common Event Format (CEF) - Version 26):

CEF:0|Security|threatmanager|1.0|100|worm successfully stopped|10|src=10.0.0.1 dst=2.1.2.2 spt=1232

actual result (first extension key is "rc"):

{
  "f": {
    "DeviceVendor": "Security",
    "DeviceProduct": "threatmanager",
    "DeviceVersion": "1.0",
    "SignatureID": "100",
    "Name": "worm successfully stopped",
    "Severity": "10",
    "Extensions": {
      "rc": "10.0.0.1",
      "dst": "2.1.2.2",
      "spt": "1232"
    }
  }
}

expected result (first extension key is "src"):

{
  "f": {
    "DeviceVendor": "Security",
    "DeviceProduct": "threatmanager",
    "DeviceVersion": "1.0",
    "SignatureID": "100",
    "Name": "worm successfully stopped",
    "Severity": "10",
    "Extensions": {
      "src": "10.0.0.1",
      "dst": "2.1.2.2",
      "spt": "1232"
    }
  }
}
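For reference, the expected splitting can be sketched outside of liblognorm. The following Python snippet is only an illustration and not liblognorm's implementation: `parse_cef`, `HEADER_FIELDS` and `KEY_RE` are hypothetical names, the field names are copied from the JSON above, and escape handling is simplified (a pipe is treated as escaped only when a single backslash precedes it).

```python
import re

# Field names follow the JSON shown above; the CEF version ("0") is dropped.
HEADER_FIELDS = ["DeviceVendor", "DeviceProduct", "DeviceVersion",
                 "SignatureID", "Name", "Severity"]

# A key is a run of word characters followed by '=', located either at the
# very start of the extension string or directly after whitespace.
KEY_RE = re.compile(r"(?:^|\s)(\w+)=")


def parse_cef(line):
    """Split a CEF line into header fields and an 'Extensions' dict."""
    if not line.startswith("CEF:"):
        raise ValueError("not a CEF line")
    # Split on unescaped '|' only; the eighth part is the extension string.
    parts = re.split(r"(?<!\\)\|", line[len("CEF:"):], maxsplit=7)
    if len(parts) < 7:
        raise ValueError("incomplete CEF header")
    result = dict(zip(HEADER_FIELDS, parts[1:7]))
    ext = parts[7] if len(parts) == 8 else ""
    matches = list(KEY_RE.finditer(ext))
    extensions = {}
    for i, m in enumerate(matches):
        # A value runs from just after '=' up to the start of the next key.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(ext)
        extensions[m.group(1)] = ext[m.end():end].strip()
    result["Extensions"] = extensions
    return result


sample = ("CEF:0|Security|threatmanager|1.0|100|worm successfully stopped|10|"
          "src=10.0.0.1 dst=2.1.2.2 spt=1232")
print(parse_cef(sample)["Extensions"])
```

The point of the sketch is that the first key starts immediately after the seventh |, so no character may be skipped before reading it.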

log:

lognormalizer version: 2.0.7.master
liblognorm version: 2.0.4
        advanced stats: available

liblognorm: loading rulebase file 'cef.rb'
liblognorm: rulebase version is 2

liblognorm: read rulebase line[~2]: 'rule=:%{"name":"f", "type":"cef"}%'
liblognorm: rule line to add: ':%{"name":"f", "type":"cef"}%'
liblognorm: addSampToTree 0 of 28
liblognorm: parsed literal: ''
liblognorm: ln_pdagAddParserInternal: { "name": "f", "type": "cef" }
liblognorm: ln_pdagAddParserInstance: { "name": "f", "type": "cef" }, nextnode (nil)
liblognorm: assigned priority is 30000
liblognorm: pdag: 0x562679102330, parser 0x562679103f80
liblognorm: parsed literal: ''
liblognorm: end addSampToTree 28 of 28
liblognorm: optimizing main pdag component
liblognorm: pre sort, parser 0:f[7680004]
liblognorm: post sort, parser 0:f[7680004]
liblognorm: optimizing 0x562679103ff0: field 0 type 'cef', name 'f': 'UNKNOWN':
liblognorm: finished optimizing main pdag component
liblognorm: ---AFTER OPTIMIZATION------------------
liblognorm: MAIN COMPONENT:
liblognorm: subDAG 0x562679102330 (children: 1 parsers, ref 1) [called 0, backtracked 0]
liblognorm: field type 'cef', name 'f': 'UNKNOWN': called 0
liblognorm: field type 'cef', name 'f': 'UNKNOWN':
liblognorm:   subDAG [TERM] 0x562679103ff0 (children: 0 parsers, ref 1) [called 0, backtracked 0]
liblognorm: MAIN COMPONENT (alternative):
liblognorm: 0x562679102330[ref 1]:
liblognorm:   0x562679103ff0[ref 1]: %f:cef%
liblognorm: =======================================
number of tree nodes: 2
liblognorm: MAIN COMPONENT:
liblognorm: subDAG 0x562679102330 (children: 1 parsers, ref 1) [called 0, backtracked 0]
liblognorm: field type 'cef', name 'f': 'UNKNOWN': called 0
liblognorm: field type 'cef', name 'f': 'UNKNOWN':
liblognorm:   subDAG [TERM] 0x562679103ff0 (children: 0 parsers, ref 1) [called 0, backtracked 0]
liblognorm: MAIN COMPONENT (alternative):
liblognorm: 0x562679102330[ref 1]:
liblognorm:   0x562679103ff0[ref 1]: %f:cef%
To normalize: 'CEF:0|Security|threatmanager|1.0|100|worm successfully stopped|10|src=10.0.0.1 dst=2.1.2.2 spt=1232'
liblognorm: 0: enter parser, dag node 0x562679102330, json 0x562679102610
liblognorm: 0/0:trying 'cef' parser for field 'f', data 'UNKNOWN'
liblognorm: parser lookup returns 0, pParsed 99
liblognorm: 0: potential hit, trying subtree 0x562679103ff0
liblognorm: 99: enter parser, dag node 0x562679103ff0, json 0x562679102610
liblognorm: offs 99, strLen 99, isTerm 1
liblognorm: 99 returns 0, pParsedTo 0, parsedTo 0
liblognorm: 0: subtree returns 0, parsedTo 99
liblognorm: 0: parser matches at 0
liblognorm: parsedTo 99, *pParsedTo 99
liblognorm: offs 0, strLen 99, isTerm 0
liblognorm: 0 returns 0, pParsedTo 99, parsedTo 99
liblognorm: final result for normalizer: parsedTo 99, endNode 0x562679103ff0, isTerminal 1, tagbucket (nil)
liblognorm: DONE, final return is 0
normalized: '{ "f": { "DeviceVendor": "Security", "DeviceProduct": "threatmanager", "DeviceVersion": "1.0", "SignatureID": "100", "Name": "worm successfully stopped", "Severity": "10", "Extensions": { "rc": "10.0.0.1", "dst": "2.1.2.2", "spt": "1232" } } }'
{ "f": { "DeviceVendor": "Security", "DeviceProduct": "threatmanager", "DeviceVersion": "1.0", "SignatureID": "100", "Name": "worm successfully stopped", "Severity": "10", "Extensions": { "rc": "10.0.0.1", "dst": "2.1.2.2", "spt": "1232" } } }
liblognorm: exitCtx 0x5626791022a0
liblognorm: delete 0x562679102330[1]:
liblognorm: delete 0x562679103ff0[1]: %f:cef%

klasen, Jul 20 '22 15:07

Hello, this pull request should fix it, but I am not sure it still applies on master, because another change for a different bug in the CEF parser has been merged since: rsyslog/liblognorm#331

julthomas, Aug 24 '22 17:08

Just noticed this issue and can confirm it's still present in v2.0.6. E.g.

echo 'CEF:0|Vendor|Product|Version|Signature ID|some name|Severity|aa=field1 bb=this is a value cc=field 3' | lognormalizer -e json -r cef.rulebase | jq

This produced "a" instead of "aa" for the first field name.

{
  "extra": "",
  "cef": {
    "DeviceVendor": "Vendor",
    "DeviceProduct": "Product",
    "DeviceVersion": "Version",
    "SignatureID": "Signature ID",
    "Name": "some name",
    "Severity": "Severity",
    "Extensions": {
      "a": "field1",
      "bb": "this is a value",
      "cc": "field 3"
    }
  }
}
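A quick way to check for this symptom in a script is to compare the first extension key in the raw line with the first key in the normalized JSON. The helper below is a rough sketch under assumptions: `expected_first_key` and `first_key_truncated` are hypothetical names, the header is assumed to contain no escaped pipes, and `field` is whatever name the rulebase gives the CEF field ("cef" here, "f" in the original report).

```python
import json


def expected_first_key(cef_line):
    """Return the first extension key as written in the raw CEF line."""
    # Everything after the seventh '|' is the extension string.
    ext = cef_line.split("|", 7)[7]
    return ext.split("=", 1)[0].strip()


def first_key_truncated(cef_line, normalized_json, field="cef"):
    """True if the first parsed key is the raw key minus its first character."""
    extensions = json.loads(normalized_json)[field]["Extensions"]
    actual = next(iter(extensions))  # dicts keep the JSON key order
    expected = expected_first_key(cef_line)
    return actual != expected and expected[1:] == actual


line = ("CEF:0|Vendor|Product|Version|Signature ID|some name|Severity|"
        "aa=field1 bb=this is a value cc=field 3")
output = ('{"extra": "", "cef": {"Extensions": '
          '{"a": "field1", "bb": "this is a value", "cc": "field 3"}}}')
print(first_key_truncated(line, output))
```

For the input and output shown in this comment the check reports the truncation; on a fixed build it should return False for the same line.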

JPvRiel, Jan 27 '23 11:01