wayback-machine-downloader
Failed to dl my old site :/
jay@jnetreloaded:~/Downloads/jnet_site/archive_dl$ sudo wayback_machine_downloader jnet.sytes.net
Downloading jnet.sytes.net to websites/jnet.sytes.net/ from Wayback Machine archives.
Getting snapshot pages../usr/lib/ruby/3.2.0/open-uri.rb:369:in `open_http': 400 BAD REQUEST (OpenURI::HTTPError)
from /usr/lib/ruby/3.2.0/open-uri.rb:760:in `buffer_open'
from /usr/lib/ruby/3.2.0/open-uri.rb:214:in `block in open_loop'
from /usr/lib/ruby/3.2.0/open-uri.rb:212:in `catch'
from /usr/lib/ruby/3.2.0/open-uri.rb:212:in `open_loop'
from /usr/lib/ruby/3.2.0/open-uri.rb:153:in `open_uri'
from /usr/lib/ruby/3.2.0/open-uri.rb:740:in `open'
from /var/lib/gems/3.2.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader/archive_api.rb:13:in `get_raw_list_from_api'
from /var/lib/gems/3.2.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:92:in `block in get_all_snapshots_to_consider'
from /var/lib/gems/3.2.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:91:in `times'
from /var/lib/gems/3.2.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:91:in `get_all_snapshots_to_consider'
from /var/lib/gems/3.2.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:105:in `get_file_list_curated'
from /var/lib/gems/3.2.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:164:in `get_file_list_by_timestamp'
from /var/lib/gems/3.2.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:309:in `file_list_by_timestamp'
from /var/lib/gems/3.2.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:192:in `download_files'
from /var/lib/gems/3.2.0/gems/wayback_machine_downloader-2.3.1/bin/wayback_machine_downloader:72:in `<top (required)>'
from /usr/local/bin/wayback_machine_downloader:25:in `load'
from /usr/local/bin/wayback_machine_downloader:25:in
@afongemie you mean wayback_machine_downloader jnet.sytes.net? Have you actually tried it? It raises the same error.
Same here :-(
It seems that the structure of the wayback machine archive service changed a bit...
In wayback_machine_downloader.rb (at /Users/user/.gem/ruby/2.6.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb if you installed it there), you can replace the function get_all_snapshots_to_consider with this:
def get_all_snapshots_to_consider
  # Note: Passing a page index parameter allows us to get more snapshots,
  # but from a less fresh index
  print "Getting snapshot pages"
  snapshot_list_to_consider = []
  snapshot_list_to_consider += get_raw_list_from_api(@base_url, nil)
  print "."
  unless @exact_url
    # Original paging loop, disabled because it triggers the 400 error:
    # @maximum_pages.times do |page_index|
    #   snapshot_list = get_raw_list_from_api(@base_url + '/*', page_index)
    #   break if snapshot_list.empty?
    #   snapshot_list_to_consider += snapshot_list
    #   print "."
    # end
    # Workaround: request only the first page.
    page_index = 0
    snapshot_list = get_raw_list_from_api(@base_url + '/*', page_index)
    snapshot_list_to_consider += snapshot_list
    print "."
  end
  puts " found #{snapshot_list_to_consider.length} snapshots to consider."
  puts
  snapshot_list_to_consider
end
It downloads everything, BUT THE LINKS ARE NOT PRESERVED!
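The patch above only ever asks for page 0, so a site with more than one page of snapshots would be cut short. A possible middle ground, sketched below, keeps the paging loop but stops when the server rejects a page instead of crashing. This is only a sketch: collect_snapshot_pages and the stub are hypothetical names that are not part of the gem, and it assumes the 400 is returned for a page past the last valid one, which this thread does not confirm.

```ruby
require "open-uri"

# Hypothetical helper, not part of the gem: collects snapshot pages until a
# page comes back empty or the server rejects the request.
# `fetch_page` is any callable that takes a page index and returns an array.
def collect_snapshot_pages(max_pages, fetch_page)
  snapshots = []
  max_pages.times do |page_index|
    begin
      page = fetch_page.call(page_index)
    rescue OpenURI::HTTPError => e
      # Assumption: an HTTP error on this page means there are no more pages,
      # so stop paging and keep whatever was collected so far.
      warn "Stopping at page #{page_index}: #{e.message}"
      break
    end
    break if page.empty?
    snapshots.concat(page)
  end
  snapshots
end

# Stub standing in for get_raw_list_from_api: page 0 works, later pages fail.
stub = lambda do |page_index|
  raise OpenURI::HTTPError.new("400 BAD REQUEST", nil) if page_index > 0
  [["20200101000000", "http://example.com/"]]
end

puts collect_snapshot_pages(100, stub).length
```

In the gem itself, the callable would wrap the existing get_raw_list_from_api(@base_url + '/*', page_index) call.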
https://github.com/StrawberryMaster/wayback-machine-downloader works for me.
How can I get this working in Windows, please?
I have installed Ruby but have no idea where to go from here.
StrawberryMaster did not include much documentation, unfortunately.
Oops, sorry about that @kingmustard. First, make sure Ruby is indeed installed — run ruby -v and see if the version for it displays. If so, Ruby is working.
Assuming you downloaded the default Ruby installation, you probably have Bundler included, so you can type bundle install to download the dependencies, and press enter. (If it doesn't work, run gem install bundler and then follow these steps again.) When that's done, you need to navigate to the folder you extracted WMD's contents to. For example, if you extracted it under a "WMD" folder within your Downloads directory, you'd need to open your terminal (Windows Terminal defaults to Command Prompt, but PowerShell works too) and type
cd Downloads\WMD\bin
OR you could just go to the WMD\bin folder directly, Shift + Right Click anywhere inside it, and click the "Open using Windows Terminal/Powershell" button.
If that worked, you can do ruby wayback_machine_downloader http://example.com (or whatever site you want) and it should work. I've also updated the original documentation so others don't get lost in the future.
I appreciate your help.
'bundle install' did not work but 'gem install bundler' did.
I cannot find anywhere where to download WMD on https://github.com/StrawberryMaster/wayback-machine-downloader. There is nothing in the 'Releases' section.
@kingmustard I should probably fix that - there should be something in the Releases section now. If that doesn't work, you can always click on the Code button and switch to the Local tab. There should be a "Download Zip" button there.
(If something isn't working, feel free to run bundle install now that you got bundler installed, and then run WMD.)
I think we are getting closer 😊 However:
PS C:\Users\Elliot\Desktop\wmd\bin> ruby wayback_machine_downloader http://elliotsworld.co.uk
<internal:C:/Ruby33-x64/lib/ruby/site_ruby/3.3.0/rubygems/core_ext/kernel_require.rb>:136:in `require': cannot load such file -- concurrent-ruby (LoadError)
from <internal:C:/Ruby33-x64/lib/ruby/site_ruby/3.3.0/rubygems/core_ext/kernel_require.rb>:136:in `require'
from C:/Users/Elliot/Desktop/wmd/lib/wayback_machine_downloader.rb:10:in `<top (required)>'
from wayback_machine_downloader:3:in `require_relative'
from wayback_machine_downloader:3:in `<main>'
PS C:\Users\Elliot\Desktop\wmd\bin>
@kingmustard Weird. You can just install concurrent-ruby then, since somehow that's missing, using gem install concurrent-ruby -v 1.3.5 and it should probably work.
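If it helps, here is a small sketch for checking which gems RubyGems can actually find before running WMD. The missing_gems helper is a made-up name for illustration, and the list only contains concurrent-ruby because that is the one named in the LoadError above; other dependencies may exist.

```ruby
require "rubygems"

# Hypothetical helper: returns the names from `names` that RubyGems cannot find.
def missing_gems(names)
  names.reject do |name|
    begin
      Gem::Specification.find_by_name(name)
      true
    rescue Gem::LoadError
      # find_by_name raises a Gem::LoadError subclass when the gem is absent.
      false
    end
  end
end

missing = missing_gems(["concurrent-ruby"])
if missing.empty?
  puts "All listed gems are installed."
else
  missing.each { |name| puts "Missing gem: #{name} (try: gem install #{name})" }
end
```

Save it as a .rb file anywhere and run it with ruby; it only reads the installed gem list and changes nothing.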
Hi there,
PS C:\WINDOWS\system32> ruby wayback_machine_downloader http://elliotsworld.co.uk
wayback_machine_downloader: --> wayback_machine_downloader
expected a newline or semicolon after the statementcannot parse the expression
> 1 PID PPID PGID WINPID TTY UID STIME COMMAND
> 2 957 1 957 87932 cons0 197608 13:40:12 /usr/bin/ps
wayback_machine_downloader:2: syntax error, unexpected integer literal, expecting end-of-input (SyntaxError)
957 1 957 87932 cons0 ...
@kingmustard My guess is that you're in the wrong folder here - you should be running it from the place you extracted the folder to, and not System32 (which would be a pretty dangerous place to have it!)
If you extracted it to a folder named "WMD" in your Downloads folder, for example, you'd need to do this in PowerShell:
cd \
cd C:\Users\YOURPROFILENAMEHERE\Downloads\WMD
or just go to your WMD folder in file explorer, copy the link to the folder, and just do cd linkyoucopiedhere. From there, you'll need to go to the bin folder (just do cd bin) and run the commands as normal.
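For anyone unsure whether they are in the right folder, the sketch below checks the layout visible in the earlier error output: the script sits in bin and loads lib/wayback_machine_downloader.rb from the folder next to it. looks_like_wmd_bin? is a hypothetical helper, and the layout is an assumption taken from those paths rather than from the project's documentation.

```ruby
# Hypothetical helper: true when `dir` looks like WMD's bin folder, meaning it
# holds the script and has a sibling lib folder with the library file.
def looks_like_wmd_bin?(dir)
  script = File.join(dir, "wayback_machine_downloader")
  library = File.expand_path(File.join(dir, "..", "lib", "wayback_machine_downloader.rb"))
  File.file?(script) && File.file?(library)
end

if looks_like_wmd_bin?(Dir.pwd)
  puts "This looks like the WMD bin folder: #{Dir.pwd}"
else
  puts "This does not look like the WMD bin folder: #{Dir.pwd}"
  puts "cd into the bin folder of the extracted download first."
end
```

Checking for the lib file as well as the script matters here, because a stray file with the same name as the script (as seemed to be the case in System32) would otherwise pass the check.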
Unfortunately, this is too confusing to use without an exe / GUI.
I appreciate the help you have given me and I hope someone makes a fork some time in the future 😊
@kingmustard Fair! I guess I can look into that and see if it makes things easier. I'm not aware of a fork with a GUI, but there is a Python alternative which you may find easier to install.
related: https://github.com/hartator/wayback-machine-downloader/issues/307