
Investigate various 404s

Open briandfoy opened this issue 5 years ago • 21 comments

Some of these might have new locations; for the rest, we should possibly link to archive.org instead.

  • [ ] http://ali.as/CPAN/Squish.html
  • [ ] http://autrijus.org/pugs/
  • [ ] http://autrijus.org/tmp/itype.patch
  • [ ] http://blogs.perl.org/users/leon_timmermans/2013/05/why-you-dont-need-fileslurp.html
  • [ ] http://catless.ncl.ac.uk/php/risks/search.php?query=rounding
  • [ ] http://docs.parrot.org/parrot/devel/html/PCT_Tutorial.html
  • [ ] http://einstein.drexel.edu/pages/students/kohr/pages/Who_is_Mr.Yuck.html
  • [ ] http://google-code-prettify.googlecode.com/svn/trunk/styles/index.html
  • [ ] http://groups.google.com/group/comp.unix.shell/browse_frm/thread/31da%0A970cebb30c6d?hl=en&pli=1
  • [ ] http://jryan.perlmonk.org/images/jirate.tar.gz
  • [ ] http://london.openguides.org/index.cgi?action=index;index_type=category;index_value=Pubs
  • [ ] http://london.openguides.org/index.cgi?action=index;index_type=category;index_value=Pubs;format=rdf
  • [ ] http://london.openguides.org/index.cgi?action=index;index_type=locale;index_value=Islington
  • [ ] http://london.openguides.org/index.cgi?distance_in_metres=500&id=Piccadilly+Circus+Station&action=find_within_distance&Go=Go
  • [ ] http://london.pm.org/pipermail/london.pm/Week-of-Mon-20020701/011937.html
  • [ ] http://london.pm.org/pipermail/london.pm/Week-of-Mon-20020729/012366.html
  • [ ] http://members.home.net/bcwarno/Perl6/percy.jpg
  • [ ] http://moskalyuk.com/software/perl/search/kiss.htm
  • [ ] http://my.labschool.org/
  • [ ] http://perl.apache.org/dist/
  • [ ] http://perl.apache.org/dist/contrib/
  • [ ] http://perl.apache.org/distributions.html.
  • [ ] http://perl.apache.org/netcraft/
  • [ ] http://perl.apache.org/stories/
  • [ ] http://perl.org.il/pipermail/perl/2003-October/003151.html
  • [ ] http://perlcomposer.sourceforge.net/vperl.html
  • [ ] http://perso.ens-lyon.fr/alexandre.buisse/divers/gmc_for_dummies.pod
  • [ ] http://pirate.tangentcode.com
  • [ ] http://ramenlabs.com/pleac-pdf/pleac_perl.pdf
  • [ ] http://rto.dk/images/camel.html
  • [ ] http://rto.dk/images/llama.html
  • [ ] http://savage.net.au/Perl-modules/html/graph.easy.marpa/
  • [ ] http://savage.net.au/Perl-modules/html/graphviz2.marpa/default.stt.html
  • [ ] http://savage.net.au/Perl-modules/html/graphviz2.pathutils/
  • [ ] http://savage.net.au/Ron/html/writing.graph.easy.marpa.html%3E
  • [ ] http://search.cpan.org/doc/MSERGEANT/AxKit-1.51/lib/Apache/AxKit/Language/XSP/SimpleTaglib.pm
  • [ ] http://search.cpan.org/doc/MSERGEANT/AxKit-1.51/lib/Apache/AxKit/Language/XSP/TaglibHelper.pm
  • [ ] http://search.cpan.org/doc/MSERGEANT/AxKit-1.6/lib/Apache/AxKit/Language/AxPoint.pm
  • [ ] http://search.cpan.org/doc/MSERGEANT/AxKit-XSP-PerForm-1.6/PerForm.pm
  • [ ] http://sysbio.harvard.edu/csb/resources/computational/scriptome/
  • [ ] http://the.earth.li/~simon/cgi-bin/repository
  • [ ] http://users.ox.ac.uk/~shug0957/
  • [ ] http://world.std.com/~aep/ptkdb/
  • [ ] http://www.catch22.net/software/winspy.asp
  • [ ] http://www.cpan.org/%20
  • [ ] http://www.cpan.org/authors/id/S/SF/SFINK/parrot-0.0.11.2.tar.gz
  • [ ] http://www.cpan.org/authors/id/S/SI/SIMON/parrot-0.0.3.tar.gz
  • [ ] http://www.cpan.org/modules/by-authors/id/J/JG/JGOFF/parrot-0.0.7.tgz
  • [ ] http://www.cpan.org/src/parrot-0.0.1.tar.gz
  • [ ] http://www.cpan.org/src/perl-5.7.2.tar.gz
  • [ ] http://www.cris.com/~automata/tutorial.shtml
  • [ ] http://www.cs.cmu.edu/~lenzo/SlideShow/
  • [ ] http://www.cs.uu.nl/daan/parsec.html
  • [ ] http://www.dmclaughlin.com/2009/04/19/ugly-perl-a-lesson-in-the-importance-of-api-design/
  • [ ] http://www.example.com/perl/access/access.cgi
  • [ ] http://www.example.com/perl/access/access.cgi?do_sub=query_form
  • [ ] http://www.flirble.org/~nick/P/ex-lib-zip-0.01.tar.gz
  • [ ] http://www.gnu.org/manual/diffutils-2.8.1/html_node/Detailed-Normal.html
  • [ ] http://www.gnu.org/software/emacs/elisp-manual/html_mono/elisp.html
  • [ ] http://www.gnu.org/software/make/manual/html_node/make_toc.html
  • [ ] http://www.graphviz.org/content/attrs
  • [ ] http://www.graphviz.org/webdot/
  • [ ] http://www.gvu.gatech.edu/ccg
  • [ ] http://www.helpconsulting.net/visiperl/
  • [ ] http://www.hitchhiker.org/parrot_coverage
  • [ ] http://www.icalx.com/public/PerlConf/Perl32Conferences.ics
  • [ ] http://www.icogitate.com/~perl/sue
  • [ ] http://www.klaascuvelier.be/2013/06/sublime-command-on-save/
  • [ ] http://www.microsoft.com/com/dcom/dcom95/dcom1_3.asp
  • [ ] http://www.molecularcloning.com/public/tour/index.html
  • [ ] http://www.panix.com/~ziggy/parrot.html
  • [ ] http://www.primate.wisc.edu/software/csh-tcsh-book/
  • [ ] http://www.psdt.com/news/yapc-canada.html
  • [ ] http://www.research.ibm.com/remail/
  • [ ] http://www.rsasecurity.com/rsalabs/pkcs/pkcs-3/
  • [ ] http://www.rscheme.org/
  • [ ] http://www.soaprpc.com
  • [ ] http://www.tuxedo.org/~esr/writings/taoup
  • [ ] http://www.unicode.org/unicode/faq/
  • [ ] http://www.unicode.org/unicode/reports/tr10/
  • [ ] http://www.unicode.org/unicode/reports/tr18/
  • [ ] http://www.vendian.org/parrot/wiki/bin/view.cgi/Main/GettingStartedWithParrotDevelopment
  • [ ] http://www.wall.org/~larry/apo
  • [ ] http://www.wall.org/~larry/syn
  • [ ] http://xml.apache.org/xml-soap
  • [ ] http://xmlrpc-c.sourceforge.net/xmlrpc-howto/xmlrpc-howto.html
  • [ ] https://docs.exoplatform.org/exo-documents/exo-jcr.site/index.html
  • [ ] https://farm9.staticflickr.com/8640/15943126852_07692bfc09.jpg
  • [ ] https://gist.github.com/sillymoose/998b9199007589199dce#file-get_swift_code-pl-L42
  • [ ] https://github.com/briandfoy/ghojo/blob/master/examples/hacktoberfest.pl
  • [ ] https://github.com/dnmfarrell/Pod-Perl5/blob/master/lib/Pod/Perl5/ToHTML.pm
  • [ ] https://github.com/dnmfarrell/zeroclickinfo-goodies/tree/perldoc/share/goodie/perldoc_cheat_sheet
  • [ ] https://github.com/duckduckgo/zeroclickinfo-goodies/tree/master/share/goodie/tmux_cheat_sheet
  • [ ] https://github.com/sillymoose/Module-Minter/blob/master/lib/Module/Minter.pm6
  • [ ] https://gitter.im/duckduckgo/zeroclickinfo-goodies?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge
  • [ ] https://kyzn.org/2015-01-17-cpan-pr-challenge-012015.html
  • [ ] https://lh6.googleusercontent.com/-lnPMdjxpfJo/UIpNj3jviTI/AAAAAAAALNE/nG_sATcqE0E/w750-h600-no/Perl+Script+-+Stroll+in+the+park.jpeg
  • [ ] https://pragprog.com/pragdave/Practices/CodeKata.rdoc
  • [ ] https://www.flickr.com/photos/brightmeadow/3748310435/in/photolist-6He56Z-bDdcmL-5Jp3Z-aZWgk-aaGbZM-aZWfK-5uGDfb-63MA6m-88qSJK-6B33mX-76En59-6N6eHG-5UFiwj-3rXHK-aZWiH-4CmaD2-6vWgnX-3bai1p-c3CSTq-3PChVM-7hdnBS-2iYPPt-8Vx4Eo-4Cmav8-6P8qMy-jfddWn-4RoQjt-5ZrohQ-eQikQL-dGWiLV-4C7epr-dH2HeL-4C7eve-bnpqbW-4CmavB-8Nvnmc-8SfZR6-3ppzd-7PEzCG-FLPq-9gXmeE-dGWi5t-8Sg3sF-7h9qon-8EWHyq-dGWhC6-buGn9s-c1AukG-7VSc8B-dRCTcZ
  • [ ] https://www.linux.com/learn/tutorials/362602-how-to-compile-the-linux-kernel
  • [ ] https://www.reddit.com/r/perl/widget
  • [ ] https://www.xml.com/pub/2002/06/19/perl-xml.html

briandfoy avatar Jul 14 '20 19:07 briandfoy

Just FYI, I've started to work on fixing those links. I should be finished by the end of the week, depending on my free time.

edipretoro avatar Oct 05 '20 20:10 edipretoro

GM,

I am assuming this issue is still open? I do not see a pull request for it. I have attached a CSV file of the 102 entries listed above, including the return status and the latest web.archive.org archive URL for each. I am not sure if this is what you were looking for or not.

Please let me know.
Thanks, Jeff

Attachment: git_tpf_perldotcom_257_20201116064308.zip
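For anyone who wants to reproduce the archive lookup, here is a minimal sketch, assuming the Wayback Machine availability API is the source of the archive URLs (the thread does not say how the CSV was built). The function names are made up for illustration, the sample response is hand-written, and the API returns the closest snapshot rather than necessarily the latest one.

```python
import json
from urllib.parse import urlencode

# Public Wayback Machine availability endpoint.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(url):
    """Build the lookup URL for one possibly dead link."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode({"url": url})


def closest_snapshot(response_text):
    """Return the archived URL from an availability response, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


# Hand-written sample body; a real one comes from fetching availability_query(...).
sample = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20160317035754/ramenlabs.com/pleac-pdf/pleac_perl.pdf",
            "timestamp": "20160317035754",
            "status": "200",
        }
    }
})

print(availability_query("http://ramenlabs.com/pleac-pdf/pleac_perl.pdf"))
print(closest_snapshot(sample))
```

The parsing is kept separate from the network fetch so it can be checked offline against saved responses.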

wpr-curly2000 avatar Nov 16 '20 12:11 wpr-curly2000

Thanks so much for the spreadsheet! That's amazing work!

The trick now is to start to go through them individually to figure out how to best add them into the articles.

I'll look at this when I next have time to do some perl.com things.

briandfoy avatar Nov 17 '20 23:11 briandfoy

Hi Brian - Thank You for the kind words! Thanks, Jeff

wpr-curly2000 avatar Nov 18 '20 00:11 wpr-curly2000

Hi Brian,

Does any more coding or additional work need to be done on this issue? If not, do I need to create a pull request?

Please let me know. Thanks, Jeff

wpr-curly2000 avatar Nov 18 '20 18:11 wpr-curly2000

What do you think we should do with it? I haven't quite figured out what might be good.

Maybe we can mix all those links with the Archive.org links and add an "archive" class to the A tag:

<a class="archive" href="...">

briandfoy avatar Nov 18 '20 20:11 briandfoy

That's no problem. Please peruse the new file and let me know if that is what you wanted or not.

Attachment: git_tpf_perldotcom_257_20201122211448.zip

wpr-curly2000 avatar Nov 23 '20 02:11 wpr-curly2000

Format looks fine, thanks for doing that. I'm probably not going to be able to get to this until after the new year, but it's good that you are making it so it should be easy.

briandfoy avatar Dec 01 '20 02:12 briandfoy

Do you want me to update the links in the pages? I have forked a copy of the tpf branch.

wpr-curly2000 avatar Dec 01 '20 12:12 wpr-curly2000

FYI, I've already started the work of updating the links, but it was taking more time than available, so that task stalled. I've forked and already updated 23 files but I haven't pushed those modifications. I thought it would be easier to have one big commit instead of n small commits. But if @wpr-curly2000 is willing to do the job, I can push my changes, and he can continue to fix broken URLs. In the meantime, I've created a PR with those modifications.

edipretoro avatar Dec 02 '20 08:12 edipretoro

@edipretoro, @wpr-curly2000 if you two want to work together, go for it.

I had an idea on how we can handle the broken links, so let me know what you think about this.

I've made the branch briandfoy/404-links to try it. I've tried it in one file, content/article/netanel-rubins-perljam-circus.md.

I used a shortcode from layouts/shortcodes/stale-link.html:

{{<stale-link "the PLEAC project's Perl recommendations from 1999" "http://web.archive.org/web/20160317035754/ramenlabs.com/pleac-pdf/pleac_perl.pdf" "http://ramenlabs.com/pleac-pdf/pleac_perl.pdf">}}

This produces an A tag with the old URL that looks something like:

<a class="stale-link" href="http://web.archive.org/..." data-original="http://ramenlabs.com/...">the ...</a>

For now, this doesn't do much, but with the shortcode we can easily change it around for a tooltip to explain the new link or whatever we decide we want to do. And, we don't need to know what that is right now.
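To make the expected output concrete, here is a small Python sketch of the A tag described above. It only imitates the result; it is not the Hugo template in layouts/shortcodes/stale-link.html, and the function name `stale_link` is made up for illustration.

```python
from html import escape


def stale_link(text, archive_url, original_url):
    """Return an anchor shaped like the one described for the stale-link shortcode.

    Argument order follows the shortcode call: link text, archive URL, original URL.
    """
    return '<a class="stale-link" href="{}" data-original="{}">{}</a>'.format(
        escape(archive_url, quote=True),   # attribute values: escape quotes too
        escape(original_url, quote=True),
        escape(text, quote=False),         # element text: leave apostrophes alone
    )


print(stale_link(
    "the PLEAC project's Perl recommendations from 1999",
    "http://web.archive.org/web/20160317035754/ramenlabs.com/pleac-pdf/pleac_perl.pdf",
    "http://ramenlabs.com/pleac-pdf/pleac_perl.pdf",
))
```

Keeping the original URL in `data-original` is what leaves room to add a tooltip or other treatment later without touching the articles again.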

And, if you want to play with this on your own, you need Hugo 0.59.0 exactly. Then, the make start target makes a local website and starts the local server.

briandfoy avatar Dec 04 '20 23:12 briandfoy

Hi Brian and Emmanuel,

I think that is an awesome idea! This is my first project working with TPF, so I do not know my way around very easily and am picking things up here and there as I grep through the content. Case in point: I installed Hugo v0.79, so now I know that I need v0.59 to do testing. Thanks, Brian, for pointing that out.

I should point out that I start development on a deployment that goes live in February and another deployment in April, so I might be the one with limited time after 1/1. I think this proposal will help us get this issue across the finish line. As a lesson learned: if either of us is bogged down and does not have time to work on this issue, could we send a quick email every couple of weeks? Just say: "I do not have any spare time currently".

Emmanuel, please let me know what you think about this proposal from Brian. Thanks, Jeff

wpr-curly2000 avatar Dec 05 '20 13:12 wpr-curly2000

I was starting to work on the remaining URLs and noticed that in the file "git_tpf_perldotcom_257_20201122211448.zip" the column names and the third column, for the archive.org URL, were totally munged up. I removed them and cleaned this up.

Another issue I found is that some of the timestamps from archive.org are missing. I will fix this and re-post the file. Once I have all the timestamps corrected, I will merge the appropriate archive.org URLs into the updated pages and upload to Git.
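A quick way to spot the missing timestamps could look like the sketch below. It assumes the archive URLs follow the usual `web.archive.org/web/<14-digit timestamp>/<original>` shape seen elsewhere in this thread; the helper name is hypothetical, and the pattern deliberately ignores variants such as shorter timestamps.

```python
import re

# Matches Wayback URLs that carry a full 14-digit timestamp (YYYYMMDDhhmmss).
WAYBACK_PATTERN = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def wayback_timestamp(url):
    """Return the 14-digit timestamp from a Wayback URL, or None if it is missing."""
    match = WAYBACK_PATTERN.match(url)
    return match.group(1) if match else None


candidates = [
    "http://web.archive.org/web/20160317035754/ramenlabs.com/pleac-pdf/pleac_perl.pdf",
    "http://web.archive.org/web/ramenlabs.com/pleac-pdf/pleac_perl.pdf",  # timestamp missing
]
for candidate in candidates:
    print(wayback_timestamp(candidate), candidate)
```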

wpr-curly2000 avatar Dec 13 '20 16:12 wpr-curly2000

Yeah, that file was a little weird but still useful. I think what would help me the most are the fields:

  • source file
  • broken URL
  • archive URL (if there is one)

If you'd like to put that in a gist as a CSV, it might be more convenient too.
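As a rough sketch of that CSV layout: the column names below are my own spelling of the three fields, not an agreed format, and the second row uses a made-up source file name just to show an entry with no archive URL.

```python
import csv
import io

# Hypothetical header names for the three requested fields.
FIELDS = ["source_file", "broken_url", "archive_url"]


def rows_to_csv(rows):
    """Serialize a list of dicts keyed by FIELDS into CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    {
        "source_file": "content/article/netanel-rubins-perljam-circus.md",
        "broken_url": "http://ramenlabs.com/pleac-pdf/pleac_perl.pdf",
        "archive_url": "http://web.archive.org/web/20160317035754/ramenlabs.com/pleac-pdf/pleac_perl.pdf",
    },
    {
        # Made-up file name; archive_url left empty when no snapshot exists.
        "source_file": "content/article/example.md",
        "broken_url": "http://www.cpan.org/%20",
        "archive_url": "",
    },
]
print(rows_to_csv(rows), end="")
```

Using the `csv` module rather than joining strings by hand keeps URLs that contain commas or quotes from breaking the columns.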

Also, if you'd like to make a tiny commit (like one file) in the briandfoy/404-links branch, you'll show up in the list of contributors. :)

briandfoy avatar Dec 14 '20 14:12 briandfoy

I have updated 55 documents, not counting the 23 pages from @edipretoro's updated list. I also included the 23 documents from that list in my testing, which brings the total to 78. I manually validated a couple of the updated pages and still need to validate all the links in each page by running the test code.

wpr-curly2000 avatar Dec 22 '20 12:12 wpr-curly2000

Okay, noted. I'm basically ignoring things until after the holidays though :)

briandfoy avatar Dec 22 '20 14:12 briandfoy

I am just getting into this too.

wpr-curly2000 avatar Jan 14 '21 12:01 wpr-curly2000

Hi,

I have created pull request #298 with the final 37 updated documents with the updated 'web.archive.org' links. I have also included a list of the links that did not have a corresponding link in archive.org:

(zsh) $> grep -ni 'skip' update_404_url.job | grep -v 'Skipped:'
89: SKIPPING URL: https://github.com/briandfoy/ghojo/blob/master/examples/hacktoberfest.pl with Return Code: 599.
110: SKIPPING URL: https://github.com/sillymoose/Module-Minter/blob/master/lib/Module/Minter.pm6 with Return Code: 599.
131: SKIPPING URL: https://gist.github.com/sillymoose/998b9199007589199dce#file-get_swift_code-pl-L42 with Return Code: 599.
177: SKIPPING URL: https://www.flickr.com/photos/brightmeadow/3748310435/in/photolist-6He56Z-bDdcmL-5Jp3Z-aZWgk-aaGbZM-aZWfK-5uGDfb-63MA6m-88qSJK-6B33mX-76En59-6N6eHG-5UFiwj-3rXHK-aZWiH-4CmaD2-6vWgnX-3bai1p-c3CSTq-3PChVM-7hdnBS-2iYPPt-8Vx4Eo-4Cmav8-6P8qMy-jfddWn-4RoQjt-5ZrohQ-eQikQL-dGWiLV-4C7epr-dH2HeL-4C7eve-bnpqbW-4CmavB-8Nvnmc-8SfZR6-3ppzd-7PEzCG-FLPq-9gXmeE-dGWi5t-8Sg3sF-7h9qon-8EWHyq-dGWhC6-buGn9s-c1AukG-7VSc8B-dRCTcZ with Return Code: 599.
223: SKIPPING URL: https://github.com/dnmfarrell/Pod-Perl5/blob/master/lib/Pod/Perl5/ToHTML.pm with Return Code: 599.
252: SKIPPING URL: https://lh6.googleusercontent.com/-lnPMdjxpfJo/UIpNj3jviTI/AAAAAAAALNE/nG_sATcqE0E/w750-h600-no/Perl+Script+-+Stroll+in+the+park.jpeg with Return Code: 599.
274: SKIPPING URL: http://catless.ncl.ac.uk/php/risks/search.php?query=rounding with Return Code: 599.
370: SKIPPING URL: http://www.cpan.org/%20 with Return Code: 599.
391: SKIPPING URL: https://kyzn.org/2015-01-17-cpan-pr-challenge-012015.html with Return Code: 599.
412: SKIPPING URL: https://github.com/dnmfarrell/zeroclickinfo-goodies/tree/perldoc/share/goodie/perldoc_cheat_sheet with Return Code: 599.
417: SKIPPING URL: https://github.com/duckduckgo/zeroclickinfo-goodies/tree/master/share/goodie/tmux_cheat_sheet with Return Code: 599.
422: SKIPPING URL: https://gitter.im/duckduckgo/zeroclickinfo-goodies?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge with Return Code: 599.
800: SKIPPING URL: http://www.cpan.org/src/perl-5.7.2.tar.gz with Return Code: 599.
871: SKIPPING URL: http://www.cpan.org/src/parrot-0.0.1.tar.gz with Return Code: 599.
1033: SKIPPING URL: http://members.home.net/bcwarno/Perl6/percy.jpg with Return Code: 599.
1152: SKIPPING URL: http://www.example.com/perl/access/access.cgi with Return Code: 599.
1174: SKIPPING URL: http://search.cpan.org/doc/MSERGEANT/AxKit-1.51/lib/Apache/AxKit/Language/XSP/SimpleTaglib.pm with Return Code: 599.
1179: SKIPPING URL: http://search.cpan.org/doc/MSERGEANT/AxKit-1.51/lib/Apache/AxKit/Language/XSP/TaglibHelper.pm with Return Code: 599.
1184: SKIPPING URL: http://search.cpan.org/doc/MSERGEANT/AxKit-1.6/lib/Apache/AxKit/Language/AxPoint.pm with Return Code: 599.
1189: SKIPPING URL: http://search.cpan.org/doc/MSERGEANT/AxKit-XSP-PerForm-1.6/PerForm.pm with Return Code: 599.
1194: SKIPPING URL: https://www.xml.com/pub/2002/06/19/perl-xml.html with Return Code: 599.
1215: SKIPPING URL: http://www.cpan.org/modules/by-authors/id/J/JG/JGOFF/parrot-0.0.7.tgz with Return Code: 599.
1286: SKIPPING URL: http://www.example.com/perl/access/access.cgi with Return Code: 599.
1332: SKIPPING URL: http://xml.apache.org/xml-soap with Return Code: 599.
1469: SKIPPING URL: http://www.cpan.org/authors/id/S/SF/SFINK/parrot-0.0.11.2.tar.gz with Return Code: 599.
1688: SKIPPING URL: http://www.cs.uu.nl/daan/parsec.html with Return Code: 599.
2074: SKIPPING URL: http://savage.net.au/Ron/html/writing.graph.easy.marpa.html%3E with Return Code: 599.
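If it helps with triage, the skipped entries can be pulled out of that log with a short script. This is a sketch that assumes only the line shape visible in the grep output above; the function names are made up. Note that 599 is not a standard HTTP status: some clients (Perl's HTTP::Tiny, for one) report it for internal errors such as a failed connection, though this thread does not say which client produced the log.

```python
import re

# One entry looks like: "<line no>: SKIPPING URL: <url> with Return Code: <code>."
SKIP_PATTERN = re.compile(r"(\d+): SKIPPING URL: (\S+) with Return Code: (\d+)\.")


def parse_skipped(text):
    """Return (log line number, URL, return code) for every skipped entry."""
    return [(int(n), url, int(code)) for n, url, code in SKIP_PATTERN.findall(text)]


def unique_urls(entries):
    """Drop repeated URLs while keeping first-seen order."""
    return list(dict.fromkeys(url for _, url, _ in entries))


sample = (
    "1152: SKIPPING URL: http://www.example.com/perl/access/access.cgi with Return Code: 599. "
    "1286: SKIPPING URL: http://www.example.com/perl/access/access.cgi with Return Code: 599. "
    "1332: SKIPPING URL: http://xml.apache.org/xml-soap with Return Code: 599."
)
entries = parse_skipped(sample)
print(len(entries), "entries,", len(unique_urls(entries)), "distinct URLs")
```

The pattern uses `findall` rather than line splitting, so it works whether the entries arrive one per line or run together on a single line.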

Not sure if anything else can be done with these links?

wpr-curly2000 avatar Jan 20 '21 02:01 wpr-curly2000

Hi Brian,

While going through the thread above after reading your input in #298, I realized I had forgotten the suggestion you made regarding the stale-link shortcode. I will review the list of 37 links above. I will update the links that have relocated and convert the others to the stale-link shortcode. Then I will add what I find to the next pull request and go from there.

wpr-curly2000 avatar Jan 21 '21 02:01 wpr-curly2000

Don't worry about the shortcode right now. I think what you're doing with archive.org is okay for now. The original URL is still there, so let's plug along with that and figure out other stuff later.

briandfoy avatar Jan 21 '21 10:01 briandfoy