Images and css missing, only downloads html

Open lucky1804 opened this issue 4 years ago • 7 comments

Running the command `wayback_machine_downloader https://fairies.disney.com/tinker-bell -t 20140719225715 -a` results in an HTML file without any images, CSS, or other files. All links point to various CDN subdomains where the files no longer exist, instead of the files being downloaded and linked locally. When I open the website on archive.org, all images are visible, but the downloader only pulls HTML files.

Here is the output after running the command:

`C:\Users\private>wayback_machine_downloader https://fairies.disney.com/tinker-bell -t 20140719225715 -a

Downloading https://fairies.disney.com/tinker-bell to websites/fairies.disney.com/ from Wayback Machine archives.

Getting snapshot pages. found 8 snaphots to consider.

1 files to download: http://fairies.disney.com/tinker-bell -> websites/fairies.disney.com/tinker-bell/index.html (1/1)

Download completed in 5.36s, saved in websites/fairies.disney.com/ (1 files)

C:\Users\private>`

lucky1804, Jan 16 '21 03:01

I got the same issue. Any update on this please?

xixido90, Feb 15 '21 15:02

same here. I think we need to use some older version. Current version is broken...

stevemarksd, Feb 19 '21 06:02

> same here. I think we need to use some older version. Current version is broken...

I think it's the other way around. This project is several years old and must have broken due to some kind of recent archive.org update.

From what I can tell at a glance, it seems like archive.org may have done some kind of restructuring to reduce the number of duplicate files on their servers. As a result, a single snapshot only gets you the most recently changed files, but none of the ones that are identical to the last saved copy.

As a workaround, you can download the files from every snapshot and merge them afterwards with a tool of your choice (such as Windows's file explorer):

wayback_machine_downloader http://example.com --all-timestamps --concurrency 5

However, that process is slower and more error-prone, so if anyone knows of a better method, that would be great.
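If merging by hand gets tedious, the merge step could be scripted. Below is a minimal Python sketch, not part of wayback_machine_downloader. It assumes that `--all-timestamps` leaves one sub-folder per snapshot inside the download directory, each named with the 14-digit Wayback timestamp; check your own output folder before relying on that. The function name `merge_snapshots` and its `cutoff` parameter are made up for this example.

```python
import shutil
from pathlib import Path


def merge_snapshots(root, dest, cutoff=None):
    """Copy the newest version of every file found under root into dest.

    Assumption: root contains one sub-directory per snapshot, named with a
    14-digit timestamp (YYYYMMDDhhmmss). If cutoff is given, snapshots newer
    than that timestamp are ignored.
    Returns a dict mapping each relative path to the snapshot it came from.
    """
    root, dest = Path(root), Path(dest)
    snapshots = sorted(
        d for d in root.iterdir()
        if d.is_dir() and d.name.isdigit() and len(d.name) == 14
    )
    copied = {}
    # Oldest first, so files from later snapshots overwrite older copies.
    for snap in snapshots:
        if cutoff is not None and snap.name > cutoff:
            continue
        for src in snap.rglob("*"):
            if src.is_file():
                rel = src.relative_to(snap)
                target = dest / rel
                target.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(src, target)
                copied[rel.as_posix()] = snap.name
    return copied
```

For example, `merge_snapshots("websites/example.com", "websites/example.com-merged", cutoff="20140719225715")` would rebuild the site roughly as it looked at that timestamp, taking unchanged files from earlier snapshots. It does not handle the case where the same name is a file in one snapshot and a folder in another.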

Pinging @hartator - It seems like the core functionality of the script might be broken right now.

Pikamander2, Mar 10 '21 13:03

Hi. CSS and images are not missing; they are downloaded too, but many of them have mistakes in their file names. That's one problem. The second problem is that index.html keeps looking for all the files not on my computer, but on the original site that I'm trying to save. That site has been down for a few years, so... That's just my experience, and sorry for my bad English)
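For the second problem, one possible approach is to rewrite the links in the saved HTML so they point at local files. This is only a rough Python sketch under assumptions: it handles just `src` and `href` attributes, uses a simple regular expression rather than a real HTML parser, and the function name `localize_links` and the host `cdn.example.com` are hypothetical.

```python
import re


def localize_links(html, host, prefix=""):
    """Turn src/href URLs that point at host into relative local paths.

    host is the dead site or CDN hostname (an assumption for this sketch);
    prefix is prepended to the remaining path, e.g. "../" for nested pages.
    """
    pattern = re.compile(
        r'''(\b(?:src|href)\s*=\s*["'])(?:https?:)?//''' + re.escape(host) + "/",
        re.IGNORECASE,
    )
    # Keep the attribute name and opening quote, drop the scheme and host.
    return pattern.sub(lambda m: m.group(1) + prefix, html)
```

Calling `localize_links(page, "cdn.example.com")` on the contents of index.html would leave links to other hosts untouched and only shorten those that point at the given host. The downloaded files still need to sit at the matching relative paths.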

kumednuy, Mar 23 '21 19:03

I'm getting the same error here, anyone know of an updated tool that works?

ZizzyDizzyMC, May 23 '21 22:05

6 months and still no fix for this?

lazybearsoft, Jul 12 '21 06:07

still the same issue

fsacer, Oct 09 '21 12:10