
Excavate tries to de-duplicate yara matches based on description

Open aconite33 opened this issue 1 year ago • 3 comments

Issue: https://github.com/blacklanternsecurity/bbot/issues/1937

This MR fixes a bug where the excavate module tries to de-duplicate findings when yara FINDINGs are emitted.

Example:

rule find_string {
	strings:
		$str1 = "Example String"

	condition:
		$str1
}

This will generate a FINDING emit with the following:

[FINDING]               {"description": "Custom Yara Rule [find_string] Matched via identifier [str1]", "host": "example.com", "path": "/", "url": "https://example.com/"}  httpx->excavate

However, if another site matches on the same string, no FINDING is emitted for it. The emit is suppressed because the description is not unique to the match.

The MR fixes this by adding the specific URL where the match was found to the description. Example below:

[FINDING]               {"description": "Custom Yara Rule [find_string] Matched via identifier [str1] on https://example.com/", "host": "example.com", "path": "/", "url": "https://example.com/"}  httpx->excavate

This creates enough uniqueness that the FINDING won't be suppressed for a different site, while duplicate matches of the same string on the same site still won't fire again.
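
The behavior described above can be modeled with a small sketch. It assumes that suppression is keyed on the description string alone, which is what this report describes but which the thread does not confirm. The helper names (emit_unique, make_finding) are hypothetical and are not part of bbot.

```python
# Hypothetical model of description-keyed de-duplication; NOT bbot's actual code.

def emit_unique(findings, key_fn):
    """Return the findings whose key has not been seen before, in order."""
    seen = set()
    emitted = []
    for finding in findings:
        key = key_fn(finding)
        if key in seen:
            continue  # suppressed as a duplicate
        seen.add(key)
        emitted.append(finding)
    return emitted


def make_finding(url, host, with_url):
    """Build a FINDING-like dict, optionally appending the URL to the description."""
    description = "Custom Yara Rule [find_string] Matched via identifier [str1]"
    if with_url:
        description += f" on {url}"
    return {"description": description, "host": host, "url": url}


# Two different sites match the same string; the first site matches twice.
matches = [
    ("https://example.com/", "example.com"),
    ("https://example.org/", "example.org"),
    ("https://example.com/", "example.com"),
]


def by_description(finding):
    # Assumed de-duplication key: the description only.
    return finding["description"]


before = emit_unique([make_finding(u, h, False) for u, h in matches], by_description)
after = emit_unique([make_finding(u, h, True) for u, h in matches], by_description)

print(len(before), [f["host"] for f in before])
print(len(after), [f["host"] for f in after])
```

Under that assumption, the description-only form keeps a single FINDING for all three matches, while the form with the URL appended keeps one FINDING per site and still drops the repeat on the same site.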

aconite33 avatar Nov 08 '24 19:11 aconite33

Codecov Report

Attention: Patch coverage is 0% with 1 line in your changes missing coverage. Please review.

Project coverage is 93%. Comparing base (65ac448) to head (6b32b0b). Report is 41 commits behind head on dev.

Files with missing lines            Patch %   Lines
bbot/modules/internal/excavate.py   0%        1 Missing
Additional details and impacted files
@@          Coverage Diff          @@
##             dev   #1938   +/-   ##
=====================================
+ Coverage     93%     93%   +1%     
=====================================
  Files        361     361           
  Lines      27773   27774    +1     
=====================================
+ Hits       25588   25591    +3     
+ Misses      2185    2183    -2     


codecov[bot] avatar Nov 08 '24 22:11 codecov[bot]

It seems like this would cause an error when processing events where .data isn't a dictionary, like RAW_TEXT.

Also before we merge this we need to understand why FINDINGs with different hosts are being deduped. If that is the behavior we're seeing, then that's definitely strange and probably the result of a deeper bug.
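
The first concern above can be illustrated with a minimal sketch. It assumes the URL would be read from a dictionary-shaped payload, and that a RAW_TEXT-style payload is a plain string. The diff itself is not shown in this thread, and describe_match is a hypothetical helper, not a bbot function.

```python
# Hypothetical helper; NOT taken from bbot's excavate module.

def describe_match(data, rule_name, identifier):
    """Build a finding description, appending the URL only when one is available."""
    base = f"Custom Yara Rule [{rule_name}] Matched via identifier [{identifier}]"
    # Guard: only dictionary payloads can be asked for a "url" key.
    url = data.get("url") if isinstance(data, dict) else None
    return f"{base} on {url}" if url else base


# Dictionary payload: the URL is appended.
print(describe_match({"url": "https://example.com/"}, "find_string", "str1"))

# Plain-string payload (assumed shape of RAW_TEXT data): no lookup is attempted.
print(describe_match("some raw response text", "find_string", "str1"))
```

Checking the payload type before the lookup avoids a TypeError from subscripting a string with a key such as "url".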

TheTechromancer avatar Nov 08 '24 23:11 TheTechromancer

@aconite33 I will work on tracking this down. Do you have an exact yara rule + bbot command that can reproduce the bug?

TheTechromancer avatar Nov 09 '24 01:11 TheTechromancer

Superseded by https://github.com/blacklanternsecurity/bbot/pull/1969; thanks @aconite33 for noticing this one

TheTechromancer avatar Nov 16 '24 03:11 TheTechromancer