TheHitchhikersGuidetoDFIRExperiencesFromBeginnersandExperts

Dead link checker

Open brootware opened this issue 2 years ago • 25 comments

Hey folks, I just took a look at the book repo and noticed that the chapters contain quite a few links. As we contribute more content, more links are going to come in. In one of my own repos I use a markdown link checker run with GitHub Actions to identify dead links. Please let me know if you think this is a good idea.

brootware avatar May 08 '22 09:05 brootware

Yes this is a great idea. Can you tell me how to implement this or point me to a repo that has it implemented as an example? GitHub actions are something I'm wanting to learn and this looks like a good one to learn on.

AndrewRathbun avatar May 08 '22 11:05 AndrewRathbun

Hey @AndrewRathbun, sure. I have implemented this in my fork of this repo. You can take a look at the results here. A couple of dead links were found.

https://github.com/brootware/CrowdsourcedDFIRBook/runs/6344168741?check_suite_focus=true

Some return 503, like the one below, where the endpoint is not available.

ERROR: 1 dead links found!
[✖] https://www.exterro.com/ftk-imager#:~:text=FTK%C2%AE%20Imager%20is%20a,(FTK%C2%AE)%20is%20warranted. → Status: 503

And some return status 0, even though we're still able to access the link.

ERROR: 1 dead links found!
[✖] https://developer.android.com/studio/build/configure-app-module → Status: 0

Links that return 0, 429, or 403 but are still accessible can be listed as exceptions in a config file like this with GitHub Actions.
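A minimal sketch of what such an exception config could look like, written as a small Python helper that emits JSON. It assumes the checker is markdown-link-check, whose config file accepts `ignorePatterns` and `aliveStatusCodes`. The helper name, the output of the script, and the specific pattern and status codes are illustrative assumptions, not the values used in this repo.

```python
import json


def build_link_check_config(ignore_patterns, alive_status_codes):
    """Return a dict shaped like a markdown-link-check config file.

    ignore_patterns: regex strings; links matching any of them are skipped.
    alive_status_codes: HTTP status codes to treat as "link is alive".
    """
    return {
        "ignorePatterns": [{"pattern": pattern} for pattern in ignore_patterns],
        "aliveStatusCodes": list(alive_status_codes),
    }


if __name__ == "__main__":
    # Hypothetical exceptions: one site that answers with status 0 to the
    # checker but loads fine in a browser, plus two tolerated status codes.
    config = build_link_check_config(
        ignore_patterns=["^https://developer.android.com/"],
        alive_status_codes=[200, 403, 429],
    )
    print(json.dumps(config, indent=2))
```

Keeping false positives as explicit patterns means a genuinely dead link on any other site still fails the check.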

brootware avatar May 09 '22 00:05 brootware

Just fixed the dead link, not sure what happened there.

stark4n6 avatar May 09 '22 00:05 stark4n6

@brootware this is really fantastic. Can you simply PR this into the repo and we'll have it working for us once we figure out the config?

AndrewRathbun avatar May 09 '22 01:05 AndrewRathbun

https://github.com/brootware/CrowdsourcedDFIRBook/runs/6344168741?check_suite_focus=true#step:4:55 appears to be a false positive, but https://github.com/brootware/CrowdsourcedDFIRBook/runs/6344168741?check_suite_focus=true#step:4:46 appears to be a true positive.

AndrewRathbun avatar May 09 '22 01:05 AndrewRathbun

@AndrewRathbun I added exceptions for false positives with this PR: https://github.com/Digital-Forensics-Discord-Server/CrowdsourcedDFIRBook/pull/60

brootware avatar May 09 '22 03:05 brootware

You rule, thank you!

AndrewRathbun avatar May 09 '22 03:05 AndrewRathbun

https://github.com/Digital-Forensics-Discord-Server/CrowdsourcedDFIRBook/runs/6350892948?check_suite_focus=true#step:4:123

@brootware this one appears to be a FP, too. That link is working for me. Any ideas?

AndrewRathbun avatar May 09 '22 11:05 AndrewRathbun

@AndrewRathbun Hi Andrew, I have added that particular link as an exception in the ignore pattern with this PR. https://github.com/Digital-Forensics-Discord-Server/CrowdsourcedDFIRBook/pull/61

A 503 status usually means some form of web service is still running on the other end, so I did not add it as an exception in the rule.

brootware avatar May 09 '22 23:05 brootware

Thank you very much for your leadership on this. Really appreciate it!

AndrewRathbun avatar May 10 '22 01:05 AndrewRathbun

My pleasure, @AndrewRathbun. I'm looking forward to reading this book once it's published too! I'll leave this issue open so we can check for more false positives as more content is added, and I'll open PRs as we identify them.

brootware avatar May 10 '22 02:05 brootware

Awesome, really appreciate that support 👍

AndrewRathbun avatar May 10 '22 11:05 AndrewRathbun

Another false positive, it seems: https://github.com/Digital-Forensics-Discord-Server/CrowdsourcedDFIRBook/runs/6633816171?check_suite_focus=true#step:4:75

AndrewRathbun avatar May 28 '22 12:05 AndrewRathbun

Added 502 as an exception, since it usually indicates a temporary server-side error. https://github.com/Digital-Forensics-Discord-Server/CrowdsourcedDFIRBook/pull/82

brootware avatar May 29 '22 10:05 brootware

@brootware FYI I've added you as a contributor to the book for helping us out with this feature. Thank you for your work on this 👍

AndrewRathbun avatar Jul 30 '22 20:07 AndrewRathbun

Thank you so much @AndrewRathbun . It's a great honour!

brootware avatar Jul 31 '22 09:07 brootware

https://github.com/Digital-Forensics-Discord-Server/TheHitchhikersGuidetoDFIRExperiencesFromBeginnersandExperts/runs/7610278348?check_suite_focus=true

I moved the .md files to their own folder, so now obviously we have tons of errors. Do you think we should just ignore .md files with this and only focus on .txt since that's what Leanpub actually parses?

AndrewRathbun avatar Aug 01 '22 11:08 AndrewRathbun

Hi @AndrewRathbun , let me look into this. We can't really check the links in .txt files as this particular action only supports .md files.

I'm also thinking of implementing a cron schedule to check the links at a regular cadence. I'm thinking of something like every 2 days at 11pm. Please let me know what you think.

https://crontab.guru/#0_23_*/2_*_*

0 23 */2 * *
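To make the schedule concrete, here is a rough sketch that expands the fields of the expression above. `expand_cron_field` is a hypothetical helper written for illustration, not part of GitHub Actions or any cron library, and it only handles the basic `*`, `a`, `a-b`, `/n`, and comma forms.

```python
def expand_cron_field(field, low, high):
    """Expand one cron field into the sorted list of values it matches.

    low/high are the field's valid range, e.g. 1 and 31 for day of month.
    """
    values = set()
    for part in field.split(","):
        span, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if span == "*":
            start, end = low, high
        elif "-" in span:
            first, last = span.split("-")
            start, end = int(first), int(last)
        else:
            start = int(span)
            # "a/n" means "from a up to the field maximum, every n".
            end = high if step_text else start
        values.update(range(start, end + 1, step))
    return sorted(values)


if __name__ == "__main__":
    minute, hour, day_of_month, month, day_of_week = "0 23 */2 * *".split()
    print("minute:", expand_cron_field(minute, 0, 59))
    print("hour:", expand_cron_field(hour, 0, 23))
    print("days of month:", expand_cron_field(day_of_month, 1, 31))
```

Note that `*/2` in the day-of-month field matches the odd days 1 through 31, so in a 31-day month the check would run on the 31st and again on the 1st. GitHub Actions also evaluates schedules in UTC, so "11pm" here means 23:00 UTC.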

brootware avatar Aug 03 '22 02:08 brootware

Yeah that seems to be a great idea! Thank you!

AndrewRathbun avatar Aug 03 '22 02:08 AndrewRathbun

It seems like most of the errors are coming from image references that resolve to dead links. I have added an additional ignore pattern with the latest PR https://github.com/Digital-Forensics-Discord-Server/TheHitchhikersGuidetoDFIRExperiencesFromBeginnersandExperts/pull/133.
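As a rough illustration of how an ignore pattern keeps image references out of the results, the sketch below filters a list of links against a regex. The pattern and the example URLs are hypothetical; the actual pattern is whatever the PR above added.

```python
import re

# Hypothetical pattern: skip anything that ends in a common image extension.
IMAGE_LINK_PATTERN = re.compile(r"\.(png|jpe?g|gif|svg)$", re.IGNORECASE)


def filter_links(links, ignore_patterns):
    """Return only the links that match none of the ignore patterns."""
    return [
        link
        for link in links
        if not any(pattern.search(link) for pattern in ignore_patterns)
    ]


if __name__ == "__main__":
    candidates = [
        "https://example.com/images/figure1.PNG",
        "https://example.com/some/article",
    ]
    # Only the article link is left to be checked.
    print(filter_links(candidates, [IMAGE_LINK_PATTERN]))
```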

Interesting how the subsequent runs have stopped recognising these images.

brootware avatar Aug 03 '22 11:08 brootware

Hey @AndrewRathbun, Just got a notification for a dead link from the latest run on my fork https://github.com/brootware/TheHitchhikersGuidetoDFIRExperiencesFromBeginnersandExperts/runs/8197161545?check_suite_focus=true

The source link is here: https://github.com/Digital-Forensics-Discord-Server/TheHitchhikersGuidetoDFIRExperiencesFromBeginnersandExperts/blob/1ae3274f340a4affac53560fe586c9da7cd2618c/manuscript/chapterJ.txt#L94

brootware avatar Sep 06 '22 02:09 brootware

Thank you for the heads up! Thankfully that chapter isn't live yet but I appreciate you flagging that for me! 🙏

AndrewRathbun avatar Sep 06 '22 02:09 AndrewRathbun

I'm reaching out to Josh to see if that link can be fixed.

AndrewRathbun avatar Sep 06 '22 04:09 AndrewRathbun

Link is working now!

AndrewRathbun avatar Sep 06 '22 12:09 AndrewRathbun

We're good now! Dead Link Checker passed with flying colors.

AndrewRathbun avatar Sep 06 '22 14:09 AndrewRathbun

@brootware I just added a Spell Checker action. Any chance you can modify the workflow file to ignore anything in .github?

AndrewRathbun avatar Jun 01 '23 11:06 AndrewRathbun

Sure, @AndrewRathbun let me take a look!

brootware avatar Jun 08 '23 05:06 brootware

OK, I've modified the workflow file to look for dead links only inside the manuscript/ directory, and all the links in the content look good and alive. https://github.com/brootware/TheHitchhikersGuidetoDFIRExperiencesFromBeginnersandExperts/actions/runs/5207913950/jobs/9395934942 Are there any other files or directories you would like checked for dead links, @AndrewRathbun? Otherwise, if all's good, I'd like to go ahead and open a PR.
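A small sketch of the scoping idea, assuming files are selected by their top-level directory and extension. Both helper names are hypothetical, and the real workflow does this through the action's own settings rather than Python.

```python
import re
from pathlib import PurePosixPath

# Simple URL matcher for illustration; stops at whitespace and closing brackets.
URL_PATTERN = re.compile(r"https?://[^\s)>\]]+")


def in_scope(path, root="manuscript", suffixes=(".md",)):
    """True if a repo-relative path sits under `root` and has a checked suffix."""
    parsed = PurePosixPath(path)
    return parsed.parts[:1] == (root,) and parsed.suffix in suffixes


def extract_links(text):
    """Return every http(s) URL found in a chunk of text, in order."""
    return URL_PATTERN.findall(text)


if __name__ == "__main__":
    for candidate in ["manuscript/chapter.md", ".github/workflows/notes.md"]:
        print(candidate, "->", in_scope(candidate))
    print(extract_links("See [the guide](https://example.com/guide) for details."))
```

Anything under .github, including the spell checker's own files, falls outside the `manuscript` root and is never scanned.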

On the spell checker, it seems like we need to add more words to the allow.txt wordlist.

brootware avatar Jun 08 '23 06:06 brootware

Thank you for making that change! And yes it's on my to-do list to get a lot of those 800ish words on the allow list.

AndrewRathbun avatar Jun 08 '23 11:06 AndrewRathbun