wayback-machine-downloader
INSTRUCTIONS! Use this working version instead.
This repository is outdated, so use an up-to-date fork instead. This "issue" is just instructions on how to get it working.
- Install the required Ruby version
- Download the zip / clone the fork from ShiftaDeband: https://github.com/ShiftaDeband/wayback-machine-downloader
- Navigate to wayback-machine-downloader\bin. On Windows, you can launch PowerShell by Shift + right-clicking empty space in the folder and selecting "Open PowerShell window here"
- From here you can run the program with `ruby wayback_machine_downloader` instead of the plain `wayback_machine_downloader` you'd use if you installed it as a gem

You can also uninstall the original non-functional gem, if you installed it previously, with `gem uninstall wayback_machine_downloader`.
Don't forget to star ShiftaDeband's repo.
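Put together, the steps above amount to something like this (a sketch, not an official install script; the example URL is an assumption, and on Windows the same `gem`/`ruby` commands work in PowerShell):

```shell
# Optional: remove the broken gem if you installed it earlier
gem uninstall wayback_machine_downloader

# Get the working fork and run it from its bin directory
git clone https://github.com/ShiftaDeband/wayback-machine-downloader
cd wayback-machine-downloader/bin
ruby wayback_machine_downloader http://example.com
```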
worked, thanks
Thanks for this!
Ironic that a tool for recovering dead websites has itself died.
It returns that the download is completed, however there are no files at all in the directory.
I used a custom path so I know I am looking in the right place.
`ruby D:\sites\wayback-machine-downloader-feature-httpGet\bin\wayback_machine_downloader http://metal-gear-survive-map.de/t/ -f 20201109025636 -a -d D:\downloads\survive` is what I used. I got no errors when I ran it; the files just don't exist.
Not working for me either; it just shows "Getting snapshot pages......"
Same bro I just want to download an old site
It only downloads the index.html for me. How can I fix this?
This worked for me! I had to wait a few minutes after seeing "Getting snapshot pages". The site I was interested in was small, so I'd assume if you're trying to download something large, it will take longer.
Worked for me!
Why hasn't this fix been merged into the main branch yet?
Got this working under Ubuntu 22.04 WSL2!
From ~/wayback-machine-downloader/bin I just ran `ruby wayback_machine_downloader -t 20240906185059 -d /home/yourusername/yourdownloadfolderorwhatever http://domain.com/`
Confirmed still working today!
The downloading works, but it often creates files like "%3fKovuLKD ThisIsMyStory2.wmv 100 43816" when it thinks ".wmv 100 43816" is the file type rather than .html. I have a program that can turn these into .html in small batches, but if there's a way to just stop that from happening, that'd be nice. Also, what am I supposed to do once the websites are downloaded? I can't really navigate them like I would an actual website; I have to manually click each HTML file to see the pages.
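A small Ruby sketch of a batch fix for those names, assuming the junk is just the URL-encoded query string glued onto the filename ("%3f" is the encoded "?"). `fix_wayback_name` is a hypothetical helper, not part of the tool:

```ruby
require 'cgi'

# Hypothetical helper: percent-decode a saved filename, replace the "?"
# (which Windows forbids in filenames anyway) with "_", and append .html
# unless the name already ends with .html/.htm.
def fix_wayback_name(name)
  decoded = CGI.unescape(name)   # "%3f..." becomes "?..."
  safe = decoded.tr('?', '_')
  safe.downcase.end_with?('.html', '.htm') ? safe : "#{safe}.html"
end

# Batch-rename everything in the download directory that contains "%3f":
# Dir.glob('websites/**/*%3f*').each do |path|
#   File.rename(path,
#     File.join(File.dirname(path), fix_wayback_name(File.basename(path))))
# end
```

This only patches the symptom; the fork's output would still need a local web server (or rewritten links) to browse the pages like a real site.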
Thanks for the update; this works like a charm.
But! :-) I'd like to download the files as edited by the Wayback Machine, because then I get no permalinks and the sites work smoothly offline.
Not working anymore
Just tried it, and the fork https://github.com/ShiftaDeband/wayback-machine-downloader is still working.
After successfully installing the fork from ShiftaDeband (with gem install) and running it on a site, 30 seconds or so after "Getting snapshot pages" I get this error:
../System/Library/Frameworks/Ruby.framework/Versions/2.6/usr/lib/ruby/2.6.0/open-uri.rb:378:in `open_http': 400 BAD REQUEST (OpenURI::HTTPError)
from /System/Library/Frameworks/Ruby.framework/Versions/2.6/usr/lib/ruby/2.6.0/open-uri.rb:756:in `buffer_open'
from /System/Library/Frameworks/Ruby.framework/Versions/2.6/usr/lib/ruby/2.6.0/open-uri.rb:226:in `block in open_loop'
from /System/Library/Frameworks/Ruby.framework/Versions/2.6/usr/lib/ruby/2.6.0/open-uri.rb:224:in `catch'
from /System/Library/Frameworks/Ruby.framework/Versions/2.6/usr/lib/ruby/2.6.0/open-uri.rb:224:in `open_loop'
from /System/Library/Frameworks/Ruby.framework/Versions/2.6/usr/lib/ruby/2.6.0/open-uri.rb:165:in `open_uri'
from /System/Library/Frameworks/Ruby.framework/Versions/2.6/usr/lib/ruby/2.6.0/open-uri.rb:736:in `open'
from /Library/Ruby/Gems/2.6.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader/archive_api.rb:13:in `get_raw_list_from_api'
from /Library/Ruby/Gems/2.6.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:92:in `block in get_all_snapshots_to_consider'
from /Library/Ruby/Gems/2.6.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:91:in `times'
from /Library/Ruby/Gems/2.6.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:91:in `get_all_snapshots_to_consider'
from /Library/Ruby/Gems/2.6.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:105:in `get_file_list_curated'
from /Library/Ruby/Gems/2.6.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:164:in `get_file_list_by_timestamp'
from /Library/Ruby/Gems/2.6.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:309:in `file_list_by_timestamp'
from /Library/Ruby/Gems/2.6.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:192:in `download_files'
from /Library/Ruby/Gems/2.6.0/gems/wayback_machine_downloader-2.3.1/bin/wayback_machine_downloader:72:in `<top (required)>'
from /usr/local/bin/wayback_machine_downloader:23:in `load'
from /usr/local/bin/wayback_machine_downloader:23:in `<main>'
I also tried the Docker version. Same thing; I get this:
Getting snapshot pages../usr/local/lib/ruby/2.3.0/open-uri.rb:359:in `open_http': 400 BAD REQUEST (OpenURI::HTTPError)
from /usr/local/lib/ruby/2.3.0/open-uri.rb:737:in `buffer_open'
from /usr/local/lib/ruby/2.3.0/open-uri.rb:212:in `block in open_loop'
from /usr/local/lib/ruby/2.3.0/open-uri.rb:210:in `catch'
from /usr/local/lib/ruby/2.3.0/open-uri.rb:210:in `open_loop'
from /usr/local/lib/ruby/2.3.0/open-uri.rb:151:in `open_uri'
from /usr/local/lib/ruby/2.3.0/open-uri.rb:717:in `open'
from /usr/local/lib/ruby/2.3.0/open-uri.rb:35:in `open'
from /build/lib/wayback_machine_downloader/archive_api.rb:8:in `get_raw_list_from_api'
from /build/lib/wayback_machine_downloader.rb:92:in `block in get_all_snapshots_to_consider'
from /build/lib/wayback_machine_downloader.rb:91:in `times'
from /build/lib/wayback_machine_downloader.rb:91:in `get_all_snapshots_to_consider'
from /build/lib/wayback_machine_downloader.rb:105:in `get_file_list_curated'
from /build/lib/wayback_machine_downloader.rb:168:in `get_file_list_by_timestamp'
from /build/lib/wayback_machine_downloader.rb:309:in `file_list_by_timestamp'
from /build/lib/wayback_machine_downloader.rb:192:in `download_files'
from /build/bin/wayback_machine_downloader:72:in `<top (required)>'
from /usr/local/bundle/bin/wayback_machine_downloader:17:in `load'
from /usr/local/bundle/bin/wayback_machine_downloader:17:in `<main>'
Thanks @azcguy. I followed your tutorial using the fork shared by @MickGe (ShiftaDeband/wayback-machine-downloader) — it worked perfectly. Thanks guys!
It works. It downloaded the website I wanted to back up.
But the index.html it created doesn't seem to work. When I open it, the URL shows "file:///D:/wayback-machine-downloader-feature-httpGet/wayback-machine-downloader-feature-httpGet/bin/websites/finalfantasyixarchive.110mb.com/index.html" and it just loads and loads; nothing gets displayed in my browser.
Downloading XXXX to websites/XXXX from Wayback Machine archives.
/usr/lib/ruby/3.4.0/net/protocol.rb:46:in 'OpenSSL::SSL::SSLSocket#connect_nonblock': SSL_connect returned=1 errno=0 peeraddr=207.241.237.3:443 state=error: certificate verify failed (unable to get certificate CRL) (OpenSSL::SSL::SSLError)
from /usr/lib/ruby/3.4.0/net/protocol.rb:46:in 'Net::Protocol#ssl_socket_connect'
from /usr/lib/ruby/3.4.0/net/http.rb:1736:in 'Net::HTTP#connect'
from /usr/lib/ruby/3.4.0/net/http.rb:1636:in 'Net::HTTP#do_start'
from /usr/lib/ruby/3.4.0/net/http.rb:1631:in 'Net::HTTP#start'
from /home/wayback-machine-downloader/lib/wayback_machine_downloader.rb:88:in 'WaybackMachineDownloader#get_all_snapshots_to_consider'
from /home/wayback-machine-downloader/lib/wayback_machine_downloader.rb:135:in 'WaybackMachineDownloader#get_file_list_all_timestamps'
from /home/wayback-machine-downloader/lib/wayback_machine_downloader.rb:162:in 'WaybackMachineDownloader#get_file_list_by_timestamp'
from /home/wayback-machine-downloader/lib/wayback_machine_downloader.rb:317:in 'WaybackMachineDownloader#file_list_by_timestamp'
from /home/wayback-machine-downloader/lib/wayback_machine_downloader.rb:196:in 'WaybackMachineDownloader#download_files'
from wayback-machine-downloader/bin/wayback_machine_downloader:72:in '<main>'
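An OpenSSL::SSL::SSLError like this usually means Ruby can't validate archive.org's certificate chain against the local certificate store (here, apparently, CRL checking is enabled but no CRL is available). A hedged first thing to try on Debian/Ubuntu — package names and the bundle path are assumptions for other distros:

```shell
# Refresh the system CA store that Ruby's OpenSSL reads
sudo apt-get install --reinstall ca-certificates
sudo update-ca-certificates --fresh

# Or point Ruby at a known-good bundle explicitly for one run
SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt \
  ruby wayback_machine_downloader http://example.com
```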
Looks like an issue with your Ruby environment. Try this non-Ruby solution instead: https://github.com/birbwatcher/wayback-machine-downloader