BBOT 2.0 URL Excavation TODOs
The following are TODOs for our URL excavation:
- [ ] Unify URL excavation into single excavator (no duplicate code between URL excavator + web param excavator)
- [ ] We have multiple yara rules that extract URLs. These should either be collapsed into a single rule, or else deduped as early as possible (before leaving excavate as events)
- [x] Tests to make sure we're excavating query parameters
- [ ] Tests to make sure we're excavating IPv6 URLs (all possible different formats, as suggested by @colin-stubbs)
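As a rough illustration of the IPv6 item, here is a minimal sketch of the URL forms such tests might cover. It uses only the Python standard library, not BBOT's excavate module; the helper name `ipv6_host` and the sample URLs are assumptions made for this example, not taken from the BBOT test suite.

```python
import ipaddress
from urllib.parse import urlsplit


def ipv6_host(url):
    """Return the URL's host as an IPv6Address, or None if it is not IPv6.

    Hypothetical helper for illustration only; not part of BBOT.
    """
    try:
        # urlsplit strips the surrounding brackets and any port
        host = urlsplit(url).hostname
        if host is None:
            return None
        return ipaddress.IPv6Address(host)
    except ValueError:
        # Raised for malformed bracketed hosts and for non-IPv6 hosts
        return None


# A few of the formats a test suite might want to cover (assumed samples)
samples = [
    "http://[2001:db8::1]/path",             # compressed form
    "http://[2001:DB8:0:0:0:0:0:1]:8080/",   # uncompressed, upper case, port
    "https://[::ffff:192.0.2.1]/?q=1",       # IPv4-mapped address
]
hosts = [ipv6_host(u) for u in samples]
```

Comparing parsed `IPv6Address` objects rather than raw strings lets the compressed and uncompressed spellings of one address be treated as the same host.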
> Tests to make sure we're excavating query parameters
This exists: a number of tests with the prefix `TestExcavateParameterExtraction` cover it.
As for the first point, we've discussed this some offline, but I'll summarize a few points for consideration:
- There is very little overlap, really only one YARA rule that crosses over between the two. This is because most parameters are extracted in a way that doesn't touch the actual URL at all, for example in forms, in jQuery calls, etc.
- Parameter extraction has a lot more complexity, and also isn't on by default. This lets us skip that complexity when we aren't doing anything with `WEB_PARAMETER`.
- It is extremely likely we'd actually add overall complexity by trying to merge the functionality. (As simple as possible URL extraction + As simple as possible Parameter extraction) < Very Complex Combined Extraction
- The YARA rules are all compiled. This means the additional overhead of adding one YARA rule is very small, even if it does a very similar thing in one or two cases. The process of compilation minimizes this overhead.
- Clear logical separation. Since URLs go to completely different event types than parameters, and have very different rules, separating their post-processing logic will make everything significantly more maintainable.
- Slowed URL processing. URLs are handled more frequently, and adding parameter logic there means every URL extraction is going to take longer.
- https://github.com/blacklanternsecurity/bbot/issues/1815
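On the remaining dedup TODO (several YARA rules extracting the same URL), here is a minimal sketch of collapsing duplicates before anything leaves excavate as an event. This is not BBOT's implementation: the rule names, the `matches_by_rule` shape, and the normalization choices (lower-case host, drop fragment, empty path becomes `/`) are all assumptions made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url):
    """Reduce a URL to a comparison key (assumed normalization rules)."""
    parts = urlsplit(url)  # urlsplit already lower-cases the scheme
    # Lower-case only the host[:port] part, leaving any userinfo untouched
    userinfo, sep, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + sep + hostport.lower()
    # Treat an empty path as "/" and drop the fragment
    return urlunsplit((parts.scheme, netloc, parts.path or "/", parts.query, ""))


def dedupe_matches(matches_by_rule):
    """Merge per-rule URL matches into one ordered list without duplicates."""
    seen = set()
    unique = []
    for urls in matches_by_rule.values():
        for url in urls:
            key = normalize(url)
            if key not in seen:
                seen.add(key)
                unique.append(key)
    return unique


# Hypothetical rule names and matches
matches = {
    "url_full": ["http://Example.com/a?x=1", "https://example.com/b"],
    "url_attr": ["http://example.com/a?x=1#frag", "https://example.com/b"],
}
unique_urls = dedupe_matches(matches)
```

Deduping on a normalized key at this stage keeps the individual rules simple and independent, which fits the argument above for not merging them.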
closing since only remaining checkbox has a separate issue already