crawl4ai
Detect different failure cases
It would also be useful to detect pages that show:
- Captcha
- Bot wall
- Please update your browser
- Parked domain
- ...
These errors are already handled, if I'm not mistaken:
- HTTP errors (4xx, 5xx)
- TLS issue
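
To make the request more concrete, here is a minimal sketch of how such pages might be detected once the HTML has been fetched. It is plain Python and does not use any crawl4ai API. The function name `classify_soft_failure`, the labels, and the marker phrases are all hypothetical and would need a broader, tested list in practice.

```python
import re
from typing import Optional

# Hypothetical marker phrases for each failure case (assumptions, not an
# exhaustive or validated list).
SOFT_FAILURE_PATTERNS = {
    "captcha": re.compile(r"captcha|verify (that )?you are (a )?human", re.I),
    "bot_wall": re.compile(
        r"access denied|unusual traffic|checking your browser", re.I
    ),
    "outdated_browser": re.compile(
        r"(update|upgrade) your browser|browser is (out of date|not supported)",
        re.I,
    ),
    "parked_domain": re.compile(
        r"domain (is )?(for sale|parked)|buy this domain", re.I
    ),
}


def classify_soft_failure(html: str) -> Optional[str]:
    """Return the label of the first matching failure case, or None."""
    for label, pattern in SOFT_FAILURE_PATTERNS.items():
        if pattern.search(html):
            return label
    return None
```

A check like this would run on pages that returned HTTP 200, since those are the ones that slip past the existing HTTP and TLS error handling. Keyword matching alone can produce false positives (for example, an article that discusses CAPTCHAs), so a real implementation would probably combine it with other signals such as page length.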