Add field indicating if the request was a success
To get records with both null, 404 and 5xx HTTP status, an internal success flag in the results would allow us to filter by select(.success!=true).
This is still relevant. Here is a list of URLs which have "status": null, yet the link doesn't work. Links are grouped by their exception message:
-
Connection to ... failedhttp://byerly.org/bt.htm http://fashion.dior.com/dior.html http://www.allsands.com -
Crypto negotiation failed: Connection reset by peerhttp://www.libertyellisfoundation.org/immigration-museum http://www.rspca.org.uk -
Crypto negotiation failed: stream_socket_enable_crypto(): SSL operation failed with code 1. OpenSSL Error messages:\nerror:1414D172:SSL routines:tls12_check_peer_sigalg:wrong signature typehttp://www.kristeligt-dagblad.dk/liv-sjael/sociale-medier-goer-sorgen-nemmere http://www.echr.coe.int/echr http://www.portugalvirtual.pt -
Request must specify a valid HTTP URIhttp://www.cwc.lsu.edu. http://www.newscientist.com, -
Resolving the specified domain failedhttp://awalls.org http://curia.eu.int/da/index.htm http://europa.eu.int
I have found this solution for filtering on multiple status values:
cat linkreport.json | jq -c '. | select([.status] | inside([301, 302, 400, 401, 403, 404, 406, 410, 416, 500, 502, 503, 504])) | {page: .referrer, link_text: .referrer_title, dead_link: .url}' | jq '.'