Feature Request: Less Verbose Options (Broken-Only, 404-Only, etc)

Open Ravlen opened this issue 6 years ago • 9 comments

I find BLC to be extremely useful, but the output has too much information (I'm betting the majority of users are looking for the BROKEN links, not the OK links). It would be great to have a CLI option for outputting only broken links, or only certain types of errors (404, 403, etc), and any page with 0 broken links would have nothing output at all.

For example I get output like this now:

Getting links from: https://www.example.com/archives/
├───OK─── https://docs.example.com/install/
├─BROKEN─ https://docs.example.com/archives.html (HTTP_404)
├───OK─── https://example.com/doc
├───OK─── https://example.com/docs
├───OK─── https://example.com/docs/archives
├───OK─── https://example.com/content/archives.html
├───OK─── https://example.com/example-docs/
└───OK─── https://example.com/master/doc
Finished! 88 links found. 80 excluded. 1 broken.

Getting links from: https://docs.example.com/ssh/
├───OK─── https://the.earth.li/%7Esgtatham/putty/0.67/htmldoc/Chapter8.html#pubkey-puttygen
├───OK─── https://wiki.eclipse.org/EGit/User_Guide#Eclipse_SSH_Configuration
├───OK─── https://www.digitalocean.com/community/tutorials/understanding-the-ssh-encryption-and-connection-process
└───OK─── http://www.chiark.greenend.org.uk/%7Esgtatham/putty/download.html
Finished! 120 links found. 115 excluded. 0 broken.

A --less-verbose flag would output only this (the second page scanned would produce no output, since it has no broken links):

Broken link(s) from: https://www.example.com/archives/
└─BROKEN─ https://docs.example.com/archives.html (HTTP_404)

Ravlen, Oct 29 '18 04:10

An interim solution is to use a pipe to grep.

blc -r https://www.example.com/archives/ | grep --color=never -e 'Getting links' -e '404' -e 'Finished!'
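The pattern above only keeps 404s. Since the request also mentions other error types (403, etc), a variant that matches the BROKEN marker from the sample output may be closer to what was asked for. This is a minimal sketch: the function name blc_grep_broken is made up for illustration, and it assumes blc keeps printing BROKEN on every failed link as in the output shown earlier.

```shell
#!/bin/sh
# Hypothetical helper (not part of blc): keep page headers, summary lines,
# and every link that blc marks as BROKEN, whatever the reason code is.
blc_grep_broken() {
  grep --color=never -e 'Getting links from' -e 'BROKEN' -e 'Finished!'
}
```

It would be used as: blc -r https://www.example.com/archives/ | blc_grep_broken. Like the original command, it still prints the header and summary for every page, including pages with nothing broken.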

tasmo, Nov 02 '18 15:11

I threw this together - it adds a -q/--quiet flag to only show broken pages & links: https://github.com/alexlouden/broken-link-checker

alexlouden, Dec 06 '18 06:12

> An interim solution is to use a pipe to grep.
>
> blc -r https://www.example.com/archives/ | grep --color=never -e 'Getting links' -e '404' -e 'Finished!'

Thanks for this, but it's not really that helpful, as it shows every page even if that page has nothing broken. If you've got one broken page among thousands of pages, you have to go through thousands of lines to find the one that has the broken link.

Would you accept a patch for a quiet option that only outputs page names if something is broken?

greggman, Jan 07 '19 04:01

> Would you accept a patch for a quiet option that only outputs page names if something is broken?

Hey @greggman - I've implemented this in my fork, if you'd like to have a look? https://github.com/alexlouden/broken-link-checker

We're using my version at work in our CI, and it makes it a lot clearer what's broken.

alexlouden, Jan 07 '19 04:01

@alexlouden that's great. Have you submitted a PR?

greggman, Jan 07 '19 04:01

Just submitted one @greggman - thanks for the push 😃

alexlouden, Jan 08 '19 04:01

@greggman

If you dump the output of tasmo's suggestion above into a text file, you can run the following against it to remove the redundant "Getting links from" noise.

sed '/Getting links from/{$!N;/\n.*Getting links from/!P;D}' file

This command removes a line containing "Getting links from" if it is immediately followed by another line containing "Getting links from".

jackfoust, Mar 28 '19 20:03

> @greggman
>
> If you dump the output of tasmo's suggestion above into a text file, you can run the following against it to remove the redundant "Getting links from" noise.
>
> sed '/Getting links from/{$!N;/\n.*Getting links from/!P;D}' file
>
> This command removes a line containing "Getting links from" if it is immediately followed by another line containing "Getting links from".

Can you update this command to match the new syntax, which includes:

Finished! # links found. # excluded. # broken.
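One possible direction, as a sketch rather than a tested drop-in for the sed command: since every page block now ends with a "Finished!" line, the header lines are never adjacent, so the filter has to look at whole blocks instead of pairs of lines. The awk script below buffers each block and prints it only if it contained a BROKEN line. The function name blc_broken_only is made up for illustration, and the script assumes the output format shown in the first post ("Getting links from:" header, BROKEN link lines, "Finished!" summary).

```shell
#!/bin/sh
# Hypothetical helper (not part of blc): print only the page blocks that
# contain at least one BROKEN link. Reads blc output on stdin.
blc_broken_only() {
  awk '
    # Start of a page block: remember the header, clear the buffer.
    /^Getting links from:/ { header = $0; broken = ""; next }

    # Collect broken links for the current page (assumes the BROKEN
    # marker only appears on failed links, not inside a URL).
    /BROKEN/ { broken = broken $0 "\n"; next }

    # End of a page block: emit it only if something was broken.
    /^Finished!/ {
      if (broken != "") {
        print header
        printf "%s", broken
        print $0
        print ""
      }
      header = ""; broken = ""
    }
  '
}
```

It can be used directly in a pipe (blc -r https://www.example.com/archives/ | blc_broken_only) or against a saved file (blc_broken_only < file). OK lines and pages with 0 broken links are dropped entirely.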

alexfornuto, Sep 03 '19 23:09

> Hey @greggman - I've implemented this in my fork, if you'd like to have a look? https://github.com/alexlouden/broken-link-checker

@alexlouden Thanks for your fork! It works well to lower the noise level. I installed it globally with:

npm install git+https://github.com/alexlouden/broken-link-checker -g

frederickjh, Mar 26 '21 15:03