ThreatIngestor

Allow for scraping-based ingestion of blogs without RSS feeds.

Open pedramamini opened this issue 2 years ago • 0 comments

An increasing trend we're seeing is for folks to forgo RSS feeds on their blogs. To capture these sources, a general web scraping approach must be used. I propose we allow for the definition of a URL regex that will be used for scraping, with state detection so that previously scraped posts are skipped. A list of blogs to test this against includes:

cisecurity.org, elastic, flashpoint.io, palo, proofpoint, recordedfuture, redcanary, secureworks, securityintelligence.com, splunk, zscaler
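To make the proposal concrete, here is a minimal sketch of the idea in Python, using only the standard library. It is not ThreatIngestor code: `LinkCollector`, `find_new_posts`, and the `seen` set are hypothetical names chosen for illustration, and the example assumes the blog index HTML has already been fetched.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_new_posts(index_html, base_url, url_regex, seen):
    """Return post URLs that match url_regex and are not yet in `seen`.

    `seen` stands in for the saved state: a set of previously scraped
    URLs. It is updated in place so a second run returns nothing new.
    """
    collector = LinkCollector()
    collector.feed(index_html)
    pattern = re.compile(url_regex)
    fresh = []
    for href in collector.hrefs:
        url = urljoin(base_url, href)  # resolve relative links
        if pattern.search(url) and url not in seen:
            seen.add(url)
            fresh.append(url)
    return fresh
```

In a real source, the index page would be fetched on each run and `seen` would be persisted between runs. How that state is stored, and how the regex is configured per blog, is left open here.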

pedramamini, Sep 13 '22 20:09