Confirm that Internet Archive push is working on every update
Does this still work?
https://github.com/crimethinc/website/issues/451
Let's find out, then either close this with no additional effort needed or fix it so that archive.org always has all of the site's articles.
So, looking into this, I think the API has just been going down. Now, rather than 500s, we are getting timeouts.
I am having a hard time figuring out if this API is even supported anymore.
All that said, I did find out that the Internet Archive has a way to just send an email full of links, and they will archive all of the URLs and email you back the results: https://blog.archive.org/2019/10/23/the-wayback-machines-save-page-now-is-new-and-improved/
I wonder if it would be more stable to write an ActionMailer job to run nightly and batch process articles based on updated_at >= 1.day.ago
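To make the nightly batch idea concrete, here is a rough sketch in plain Ruby of the selection and batching step only. Everything named here is an assumption for illustration: `Article` is a stand-in Struct rather than the real model, `recently_updated_urls` and `email_bodies` are hypothetical helpers, and the batch size of 50 is arbitrary. Actually sending the mail (e.g. via ActionMailer) is left out.

```ruby
# Hypothetical stand-in for the Rails Article model (assumption, not the real class).
Article = Struct.new(:url, :updated_at)

ONE_DAY = 24 * 60 * 60 # seconds; plain-Ruby equivalent of 1.day

# Return the unique URLs of articles updated within the window ending at `now`.
# In Rails this would be roughly Article.where("updated_at >= ?", 1.day.ago).
def recently_updated_urls(articles, now: Time.now, window: ONE_DAY)
  cutoff = now - window
  articles.select { |article| article.updated_at >= cutoff }.map(&:url).uniq
end

# Split the URLs into email bodies, one URL per line.
# batch_size is an arbitrary assumption, not a documented limit.
def email_bodies(urls, batch_size: 50)
  urls.each_slice(batch_size).map { |batch| batch.join("\n") }
end
```

Each string returned by `email_bodies` would become the body of one email, so a large backlog gets spread over several messages instead of one huge one.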
I also found this Internet Archive browser extension that has a "Save Page Now" feature that is working, so maybe we can port that code to Ruby?
https://github.com/internetarchive/wayback-machine-webextension/blob/2b46d356f625e28ef98b376541edbe5f7203bbb4/webextension/scripts/background.js#L59-L116
FYI, these are the docs for the Save Page Now v2 API: https://docs.google.com/document/d/1Nsv52MvSjbLb2PCpHlat0gkzw0EvtSgpKHu4mk0MnrA/edit#heading=h.1gmodju1d6p0
My buddy did an example implementation in Python here: https://github.com/palewire/savepagenow/pull/31
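For the Ruby side, a capture request against Save Page Now v2 might look roughly like the sketch below. This is based on my reading of the docs linked above, so treat the details as assumptions to verify: a form-encoded POST to https://web.archive.org/save with a `url` field, an `Accept: application/json` header, and an `Authorization: LOW accesskey:secret` header. `build_save_request` and `submit_capture` are hypothetical helper names, and the timeout values are arbitrary.

```ruby
require "net/http"
require "uri"

# Assumed SPN2 endpoint; check it against the linked docs.
SPN2_ENDPOINT = URI("https://web.archive.org/save")

# Build (but do not send) a capture request for page_url.
# access_key / secret_key are archive.org S3-style API keys.
def build_save_request(page_url, access_key:, secret_key:)
  request = Net::HTTP::Post.new(SPN2_ENDPOINT)
  request["Accept"] = "application/json"
  request["Authorization"] = "LOW #{access_key}:#{secret_key}"
  request.set_form_data("url" => page_url) # form-encodes the body
  request
end

# Send the request with explicit timeouts, since this thread reports the
# old integration timing out. Raises Net::OpenTimeout / Net::ReadTimeout
# when a limit is hit, which a background job could rescue and retry.
def submit_capture(request)
  Net::HTTP.start(SPN2_ENDPOINT.host, SPN2_ENDPOINT.port,
                  use_ssl: true, open_timeout: 10, read_timeout: 30) do |http|
    http.request(request)
  end
end
```

Keeping request construction separate from sending means the headers and body can be unit-tested without touching the network.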
Thanks for the reference, @bensheldon. We'll give that a look and try to update our current implementation. 😀