wayback-machine-downloader only downloads an index.html file rather than entire site


hello,

The title says it all. I'm trying to download an entire website, but the tool only downloads the index page. Even the index page has no pictures, and the links don't work. I've tried adding the switch to include all timestamps, but the result is the same: every timestamped snapshot contains only index.html. I've reviewed the README, and as far as I can tell, no switches should be needed to pull the entire site.

Can someone please advise whether this is a bug, or whether I need to do something different?

I'm using the latest version of Ruby: ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x64-mingw32]

thank you

Sep 30 '21 21:09 wingfield65

I have the same problem

Oct 06 '21 07:10 FreeBSoD

been observing the same

Oct 09 '21 12:10 fsacer

Same here, only gets the index, and then one .js file..

Oct 11 '21 08:10 vejnoe

Yep. Same here. Waiting for a resolution.

Oct 19 '21 18:10 DIntriglia

Greetings, I am also having this issue. Only getting a single page downloaded when tested on 2 different sites. Hoping a resolution comes soon.

Cheers!

Oct 29 '21 05:10 ReppiksProductions

How do I download the entire website, and not only the index page?

Nov 12 '21 10:11 mmaaarten

Bumping this issue, hoping I can get a complete version of the whole site, instead of a bare HTML site 😄

Nov 14 '21 10:11 cryptoAlgorithm

Same issue for me. A content creator I enjoyed recently passed away, and his site is no longer online.

I'd love to get an archive.

Nov 19 '21 21:11 adampatterson

I am having the same issue, but I solved it by using Ruby 2.6.9 and passing no optional flags. It works for me with a command like: "wayback_machine_downloader http://www.example.com"

Dec 08 '21 09:12 woxinwuchen

@woxinwuchen Can you confirm how you switched your ruby env for this project? I am on Mac, and my ruby version is 2.6.3p62.

Dec 08 '21 15:12 DIntriglia

I think I am hitting this same bug: if I give the script a URL like http://example.com, it just downloads the homepage over and over, and none of the subpages.

My guess is that it has to do with the Wayback CDX Server URL matching: https://github.com/internetarchive/wayback/tree/master/wayback-cdx-server#url-match-scope

My fix for this issue was to change the URL parameter to the format http://example.com/*
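For anyone who wants to check the match-scope guess independently of the Ruby tool, here is a minimal Python sketch that builds a CDX Server query for a given scope. It assumes the public endpoint at https://web.archive.org/cdx/search/cdx and the `url`, `matchType`, `output`, `fl` and `limit` parameters described in the CDX Server README linked above; the function names `build_cdx_query` and `list_captures` are made up for illustration and are not part of wayback-machine-downloader. Per that README, a trailing `/*` on the URL should behave like `matchType=prefix`.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed public CDX endpoint (not taken from the tool's source code).
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(site, match_type="prefix", limit=10):
    """Build a CDX query URL for `site` with the given match scope.

    match_type is one of "exact", "prefix", "host" or "domain".
    """
    params = {
        "url": site,
        "matchType": match_type,
        "output": "json",
        "fl": "timestamp,original",  # only return these two fields
        "limit": limit,
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def list_captures(site, match_type="prefix", limit=10):
    """Fetch captures from the CDX server (performs a network request)."""
    with urlopen(build_cdx_query(site, match_type, limit), timeout=30) as resp:
        rows = json.load(resp)
    # With output=json the first row is a header naming the fields.
    return rows[1:]


if __name__ == "__main__":
    # Print both query URLs so they can be compared in a browser.
    print(build_cdx_query("example.com/", "exact"))
    print(build_cdx_query("example.com/", "prefix"))
```

If the `exact` query returns only captures of the homepage while the `prefix` query also lists subpages and assets, that would be consistent with the guess above. It would not by itself explain the cases where adding `/*` makes no difference.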

Jan 23 '22 10:01 Krisseck

Same here... just the index.html file and no subfolders, images etc.

I tried upgrading to Ruby 2.6.9 on macOS Monterey (12.4) with no luck. No dice with http://example.com/* or with running the command straight with no options.

On a happier note I learned how to upgrade ruby by installing rbenv via homebrew.

Jul 21 '22 14:07 twobitdigitalpreservation

I have the same issue, and adding a * to the end of the URL didn't help here.

Aug 03 '22 11:08 Delivator

This tool will not work anymore since Wayback has changed its page rendering implementation.

Sep 25 '22 04:09 hussainb

This is a duplicate of https://github.com/hartator/wayback-machine-downloader/issues/106

Oct 31 '23 15:10 PiotrDabrowskey
