kestrel-lang
FIND does not recreate the return variable but appends to it
Describe the bug When the FIND command is executed more than once with the same return variable, the variable is not recreated; the new results are appended to the existing ones.
Details of the bug
- What is the hunt flow/script you are executing?
nt = get network-traffic
from file:///tmp/d.json
where [network-traffic:src_port > 0]
p = FIND process CREATED nt
- What is the command that failed? The FIND command: each time it is executed, p grows larger.
To Reproduce Steps to reproduce the behavior:
- data source: https://github.com/opencybersecurityalliance/kestrel-lang/blob/develop/tests/doctored-1k.json
- run hunt flow as above
Expected behavior
p should not change even if the FIND command is executed multiple times.
Environment (please complete the following information):
- OS: Fedora 34
- Python version: Python 3.9.12
This is caused by the combination of prefetch and non-deterministic process ids.
Prefetch queries the original data source, e.g., the stix-bundle in this case, so the prefetch query loads the bundle again. Without a way to generate process ids deterministically, every record is reloaded as a new, different record in the firepit process table and the __reflist table. As a result, the query results double in size on each execution.
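The effect described above can be illustrated with a minimal Python sketch. It is not firepit's or Kestrel's actual code: the in-memory `dict` standing in for the process table, the helper names (`random_id`, `deterministic_id`, `load`), the sample records, and the namespace UUID are all assumptions made for illustration. It only shows why reloading the same records under random ids grows the table, while ids derived from the record content keep it stable.

```python
import json
import uuid

# Hypothetical namespace for this sketch only; not a value used by firepit or Kestrel.
SKETCH_NAMESPACE = uuid.UUID("12345678-1234-5678-1234-567812345678")


def random_id(record):
    # Non-deterministic: a fresh id on every load, even for identical records.
    return "process--" + str(uuid.uuid4())


def deterministic_id(record):
    # Deterministic: the id is derived from the record's canonicalized content.
    canonical = json.dumps(record, sort_keys=True)
    return "process--" + str(uuid.uuid5(SKETCH_NAMESPACE, canonical))


def load(table, records, make_id):
    # Stand-in for inserting records into a table keyed by id.
    for record in records:
        table[make_id(record)] = record


records = [{"pid": 100, "name": "bash"}, {"pid": 200, "name": "sshd"}]

random_table = {}
stable_table = {}
for _ in range(2):  # simulate the same bundle being loaded twice (e.g. by prefetch)
    load(random_table, records, random_id)
    load(stable_table, records, deterministic_id)

print(len(random_table))  # 4: every reload adds new rows
print(len(stable_table))  # 2: reloads overwrite the same rows
```

Hashing the full record content is only one possible choice; which properties would make a process id stable across stix-shifter queries is exactly the open question in this issue.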
We will postpone the fix until we have better process id generation, which is not easy. For a static stix bundle it is feasible, but for results created by stix-shifter the observation id is different for each query, even though the results could point to the same entity. We need better support from stix-shifter to handle this.
Related: https://github.com/opencybersecurityalliance/stix-shifter/issues/922