Web crawler for reporting
If we can build a web crawler to validate pages and report back, we can provide major value to the Carbon teams. We've already built a web crawler, but the issue I ran into was data. I think having Carbon telemetry (#11) will help us in this area. We also need a way to add new links and manage the queue of links to validate.
Assuming IBM.com has 30 million pages and we'd like to check each URL roughly once a month, we need to process about 12 URLs per second (30,000,000 URLs ÷ ~2,592,000 seconds in a 30-day month ≈ 11.6 URLs/s).
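
A quick back-of-the-envelope sketch of that math, plus what it implies for concurrency. The 30-million page count and 30-day window come from the estimate above; the 800 ms average fetch-and-validate time is a hypothetical assumption:

```ts
// Back-of-the-envelope throughput estimate for the crawler.
const totalPages = 30_000_000;           // assumed size of IBM.com
const windowSeconds = 30 * 24 * 60 * 60; // 30-day month ≈ 2,592,000 s

// Required sustained throughput: ~11.6 URLs/second.
const urlsPerSecond = totalPages / windowSeconds;

// If a single fetch + validation takes ~800 ms (hypothetical),
// we'd need roughly this many concurrent workers to keep up.
const avgFetchSeconds = 0.8;
const workersNeeded = Math.ceil(urlsPerSecond * avgFetchSeconds);

console.log(urlsPerSecond.toFixed(1)); // "11.6"
console.log(workersNeeded);            // 10
```

If that fetch-time assumption is anywhere near right, a small worker pool covers the throughput, so the harder problems are likely the queue and the reporting database rather than raw crawl speed.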
Things to consider...
- [ ] Database needed to upload reports (#11)
- [ ] URL queue (see the sketch after this list)
- [ ] Interface to add to and manage the URL queue
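
For the URL queue item, a minimal in-memory sketch of the shape it could take. Everything here (class name, methods) is hypothetical; a production version would need a persistent store (e.g., Redis or a database table) to survive restarts and support the monthly re-crawl:

```ts
// Minimal FIFO URL queue with de-duplication; hypothetical shape only.
class UrlQueue {
  private queue: string[] = [];
  private seen = new Set<string>();

  // Add a URL if we haven't queued it before.
  add(url: string): boolean {
    const normalized = new URL(url).toString();
    if (this.seen.has(normalized)) return false;
    this.seen.add(normalized);
    this.queue.push(normalized);
    return true;
  }

  // Hand the next URL to a crawler worker, or undefined if empty.
  next(): string | undefined {
    return this.queue.shift();
  }

  get size(): number {
    return this.queue.length;
  }
}

// Usage: feed the crawler from the queue.
const queue = new UrlQueue();
queue.add("https://www.ibm.com/");
queue.add("https://www.ibm.com/"); // duplicate, ignored
const nextUrl = queue.next();      // "https://www.ibm.com/"
```

The management interface would then be a thin layer over `add()` plus whatever listing and removal operations we decide we need.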