halt processing if the first n (or n consecutive) pages fail to return a valid response
ie, just debugged an issue where:
the WP site URL is mydomain.com,
but requests to it were failing with curl "unable to access" errors (which weren't properly surfaced); wget mydomain.com from the host failed the same way.
Adding the WP site URL (mydomain.com) to /etc/hosts resolves the issue, but this needs to be caught earlier to save headaches and lengthy debugging...
maybe do a "test crawl" of the homepage first and fail quickly.
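A minimal sketch of that pre-flight idea, assuming Python and the requests library for illustration; the function name, URL, and error messages are hypothetical, not the tool's actual API:

```python
import requests


def preflight_check(site_url: str, timeout: float = 10.0) -> None:
    """Fetch the homepage once before crawling; raise immediately on
    DNS/connection failures or a 4xx/5xx status so misconfiguration
    (e.g. a hostname missing from /etc/hosts) surfaces up front."""
    try:
        resp = requests.get(site_url, timeout=timeout)
    except requests.exceptions.RequestException as exc:
        # Surface the underlying connection error instead of letting
        # the crawl fail silently page by page.
        raise RuntimeError(
            f"Pre-flight request to {site_url} failed: {exc}. "
            "Check DNS resolution (e.g. /etc/hosts) from this host."
        ) from exc
    if resp.status_code >= 400:
        raise RuntimeError(
            f"Pre-flight request to {site_url} returned HTTP "
            f"{resp.status_code}; aborting before the full crawl."
        )


# preflight_check("https://mydomain.com")
```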
this may be better handled as: "if the first n, or any n consecutive, crawled files return a 4xx/5xx status, throw an error and stop."
Failing on a single error would not be great for a site where only a small % of pages are inaccessible; capturing those errors and reporting on them is more helpful.
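A rough sketch of the thresholded version, again in Python with hypothetical names and a made-up threshold default: halt only after max_consecutive_failures bad responses in a row (which also covers the first n pages), while collecting every failure for a summary report instead of dying on one bad page:

```python
import requests


def crawl(urls: list[str], max_consecutive_failures: int = 5) -> list[tuple[str, str]]:
    failures: list[tuple[str, str]] = []  # (url, reason) for the final report
    consecutive = 0
    for url in urls:
        try:
            resp = requests.get(url, timeout=10.0)
            ok = resp.status_code < 400
            reason = f"HTTP {resp.status_code}"
        except requests.exceptions.RequestException as exc:
            ok, reason = False, str(exc)
        if ok:
            consecutive = 0  # any success resets the streak
            # ... write resp.content to the static output dir ...
        else:
            failures.append((url, reason))
            consecutive += 1
            if consecutive >= max_consecutive_failures:
                # n failures in a row (including the very first n pages)
                # almost certainly means misconfiguration rather than a
                # few broken pages, so stop instead of crawling the rest.
                raise RuntimeError(
                    f"{consecutive} consecutive crawl failures; "
                    f"last: {url} ({reason})"
                )
    return failures  # scattered failures get reported, not treated as fatal
```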