bbot
Ensure web_spider_distance is only incremented in certain places
Incrementing web_spider_distance on every URL_UNVERIFIED event can lead to premature termination of a discovery chain. We should only increment it in cases where the creation of the event is legitimately spider-dangerous.
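Rough illustration of the problem (toy code, not BBOT internals; the limit constant and helper here are made up for the example):

```python
# Toy illustration, not BBOT code: if every new URL_UNVERIFIED increments
# web_spider_distance, even non-spidery hops eat into the budget and the
# discovery chain dies before any real spidering has happened.

WEB_SPIDER_DISTANCE_LIMIT = 1  # hypothetical scan config value

def next_distance(parent_distance):
    # current behavior: unconditional increment for every child URL
    return parent_distance + 1

chain = [
    "https://example.com/",
    "https://example.com/login",        # ordinary link, not spider-dangerous
    "https://example.com/login/reset",  # ordinary link, not spider-dangerous
]

distance = 0
for url in chain[1:]:
    distance = next_distance(distance)
    if distance > WEB_SPIDER_DISTANCE_LIMIT:
        print(f"chain cut off at {url} (web_spider_distance={distance})")
        break
```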
Note: right now there's an exception for redirects in excavate. We'll need to rip that exception out if we implement this change.
Is spider-danger a decision that should be made by individual modules on a case-by-case basis? Or should we try to centralize it? 🤔
Actually, I think the right thing to do here is to swap where we increment web_spider_distance and where we tag spider-danger. Right now we increment web_spider_distance on every URL_UNVERIFIED and manually tag spider-danger only in a few individual cases. We should flip that: in the individual cases where visiting a URL has spider potential, we do nothing except increment the spider distance, and then the event itself has logic that automatically evaluates spider danger based on the scan config and tags the event accordingly.
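A minimal sketch of what that could look like (illustrative only; ScanConfig, URLUnverified, spider_hop, and evaluate_spider_danger are placeholder names, not BBOT's actual classes or methods):

```python
# Illustrative sketch only -- class and attribute names are placeholders,
# not BBOT's real internals.

class ScanConfig:
    def __init__(self, web_spider_distance=0):
        # maximum number of spider-dangerous hops the scan is willing to follow
        self.web_spider_distance = web_spider_distance


class URLUnverified:
    def __init__(self, url, parent=None, spider_hop=False):
        self.url = url
        self.tags = set()
        parent_distance = parent.web_spider_distance if parent else 0
        # Increment only when the module creating the event says this hop is
        # spider-dangerous (e.g. a link scraped out of a response body);
        # everything else inherits its parent's distance unchanged.
        self.web_spider_distance = parent_distance + (1 if spider_hop else 0)

    def evaluate_spider_danger(self, config):
        # Centralized decision: the event tags itself based on the scan
        # config instead of individual modules tagging spider-danger by hand.
        if self.web_spider_distance > config.web_spider_distance:
            self.tags.add("spider-danger")


config = ScanConfig(web_spider_distance=1)
root = URLUnverified("https://example.com/")
child = URLUnverified("https://example.com/page", parent=root, spider_hop=True)
deeper = URLUnverified("https://example.com/page/next", parent=child, spider_hop=True)
for event in (root, child, deeper):
    event.evaluate_spider_danger(config)
print(deeper.tags)  # {'spider-danger'} -- one hop past the configured distance
```

With something like this, modules would only need to signal the spider hop when creating the event; the tagging logic stays in one place.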