ThreatIngestor
Allow scraping-based ingestion of blogs without RSS feeds.
An increasing number of blogs are forgoing RSS feeds. To capture these sources, a general web scraping approach is needed. I propose we allow a URL regex to be defined per source: links on the blog's index page that match the regex are scraped, and state tracking ensures previously scraped posts are skipped. Blogs to test this against include:
cisecurity.org, elastic, flashpoint.io, palo, proofpoint, recordedfuture, redcanary, secureworks, securityintelligence.com, splunk, zscaler
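
As a rough illustration of the proposal, here is a minimal sketch using only the Python standard library. The names `LinkCollector` and `find_new_posts` are hypothetical and not part of ThreatIngestor's existing API, the state is an in-memory set rather than a persistent store, and fetching the page is left out so the example stays self-contained.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_new_posts(html, base_url, url_pattern, seen):
    """Return post URLs that match url_pattern and are not yet in seen.

    Hypothetical helper: `seen` is a set of already scraped URLs and is
    updated in place, so a later call skips anything returned earlier.
    """
    parser = LinkCollector()
    parser.feed(html)
    regex = re.compile(url_pattern)

    new_posts = []
    for href in parser.hrefs:
        url = urljoin(base_url, href)  # resolve relative links
        if regex.search(url) and url not in seen:
            seen.add(url)
            new_posts.append(url)
    return new_posts


if __name__ == "__main__":
    # Stand-in for a fetched blog index page (assumed markup).
    page = (
        '<a href="/blog/post-one">One</a>'
        '<a href="/about">About</a>'
        '<a href="https://example.com/blog/post-two">Two</a>'
    )
    seen = set()
    pattern = r"^https://example\.com/blog/[\w-]+$"
    print(find_new_posts(page, "https://example.com/", pattern, seen))
    print(find_new_posts(page, "https://example.com/", pattern, seen))
```

In a real source plugin the `seen` set would need to be persisted between runs, and the regex would be supplied per source in the configuration. Both are left open here.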