crusty-core
A small library for building fast and highly customizable web crawlers
A job cannot be considered successfully completed when its root task failed with an error.
We could simplify and speed things up by spawning several Crawlers in their own threads. We already handle job delegation via channels, and this way we have fewer internal and external Send/Sync restrictions.
For broad web crawling we probably do not need any concurrency within a single job, which means we can save a lot of resources and annoy site owners less.
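The thread-per-crawler idea can be illustrated with a minimal sketch using only the standard library. It is not crusty-core's implementation: `Job`, `JobResult`, and `run_crawlers` are hypothetical names invented for this example, and the "crawl" step is a placeholder. Jobs are delegated to worker threads over one channel and results come back over another.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a crawl job (not a crusty-core type).
struct Job {
    url: String,
}

/// Hypothetical stand-in for a job outcome (not a crusty-core type).
struct JobResult {
    url: String,
    worker: usize,
}

/// Spawns `workers` threads, hands jobs out over a channel, and collects
/// the results from a second channel.
fn run_crawlers(jobs: Vec<Job>, workers: usize) -> Vec<JobResult> {
    let (job_tx, job_rx) = mpsc::channel::<Job>();
    let (res_tx, res_rx) = mpsc::channel::<JobResult>();
    // std's Receiver is single-consumer, so workers share it behind a mutex.
    let job_rx = Arc::new(Mutex::new(job_rx));

    let handles: Vec<_> = (0..workers)
        .map(|id| {
            let job_rx = Arc::clone(&job_rx);
            let res_tx = res_tx.clone();
            thread::spawn(move || loop {
                // Hold the lock only while taking one job off the queue.
                let next = job_rx.lock().unwrap().recv();
                let job = match next {
                    Ok(job) => job,
                    Err(_) => break, // all senders dropped: no more jobs
                };
                // Placeholder: a real crawler would fetch and parse here.
                let result = JobResult { url: job.url, worker: id };
                if res_tx.send(result).is_err() {
                    break; // the collector went away
                }
            })
        })
        .collect();
    // Drop our copy so the result channel closes once all workers finish.
    drop(res_tx);

    for job in jobs {
        job_tx.send(job).unwrap();
    }
    // Closing the job channel tells idle workers to exit.
    drop(job_tx);

    let results: Vec<JobResult> = res_rx.iter().collect();
    for handle in handles {
        handle.join().unwrap();
    }
    results
}
```

Because each worker owns its state and only `Job` and `JobResult` cross thread boundaries, only those two types need to be `Send`.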
Built on top of `StaticAsyncResolver`: tries to resolve DNS by sending a request and awaiting the response on a channel, within a timeout.
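The request/response-over-a-channel pattern with a timeout can be sketched as follows. This is a blocking, standard-library analogue for illustration only; it does not use or reproduce the `StaticAsyncResolver` API, and `resolve_with_timeout`, `ResolveError`, and the `lookup` parameter are hypothetical names.

```rust
use std::net::IpAddr;
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum ResolveError {
    Timeout,
    Failed(String),
}

/// Runs `lookup` on a helper thread and waits for its answer on a channel
/// for at most `timeout`.
fn resolve_with_timeout<F>(
    host: &str,
    timeout: Duration,
    lookup: F,
) -> Result<Vec<IpAddr>, ResolveError>
where
    F: FnOnce(&str) -> Result<Vec<IpAddr>, String> + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    let host = host.to_owned();
    thread::spawn(move || {
        // The receiver may already have timed out, so a failed send is ignored.
        let _ = tx.send(lookup(&host));
    });

    match rx.recv_timeout(timeout) {
        Ok(Ok(addrs)) => Ok(addrs),
        Ok(Err(reason)) => Err(ResolveError::Failed(reason)),
        Err(mpsc::RecvTimeoutError::Timeout) => Err(ResolveError::Timeout),
        Err(mpsc::RecvTimeoutError::Disconnected) => {
            Err(ResolveError::Failed("resolver thread exited".to_string()))
        }
    }
}
```

Passing the lookup as a closure keeps the sketch testable without network access. A real lookup could, for example, wrap `std::net::ToSocketAddrs`. Note that on timeout the helper thread is not cancelled; it simply finishes in the background.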
https://github.com/fitzgen/bumpalo
When the server returns 5xx or 429 (Too Many Requests), we should slow down.
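One possible way to express this is a small delay-adjustment function. This is a sketch under assumptions, not crusty-core's behavior: `next_delay` is a hypothetical helper, and the doubling and halving factors are arbitrary choices for illustration.

```rust
use std::time::Duration;

/// Hypothetical helper: picks the delay before the next request to the same
/// host, based on the last HTTP status code. Assumes `min <= max`.
fn next_delay(current: Duration, status: u16, min: Duration, max: Duration) -> Duration {
    let overloaded = status == 429 || (500..=599).contains(&status);
    if overloaded {
        // Server is struggling or rate limiting: double the delay, up to `max`.
        current.saturating_mul(2).clamp(min, max)
    } else {
        // Healthy response: halve the delay, but never go below `min`.
        (current / 2).clamp(min, max)
    }
}
```

Starting from a 500 ms delay, a 429 or 503 response would raise it to 1 s, while a 200 response would lower it to 250 ms.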