wayback-machine-downloader
Images and CSS missing, only downloads HTML
Running the command wayback_machine_downloader https://fairies.disney.com/tinker-bell -t 20140719225715 -a
results in an HTML file without any images, CSS, or other files. All links point to various CDN subdomains where the files no longer exist, instead of pointing to files downloaded locally. When I open the website on archive.org, all images are visible, but the downloader only pulls HTML files.
Here is the output after running the command:
`C:\Users\private>wayback_machine_downloader https://fairies.disney.com/tinker-bell -t 20140719225715 -a
Downloading https://fairies.disney.com/tinker-bell to websites/fairies.disney.com/ from Wayback Machine archives.
Getting snapshot pages. found 8 snaphots to consider.
1 files to download:
http://fairies.disney.com/tinker-bell -> websites/fairies.disney.com/tinker-bell/index.html (1/1)
Download completed in 5.36s, saved in websites/fairies.disney.com/ (1 files)
C:\Users\private>`
I got the same issue. Any update on this please?
same here. I think we need to use some older version. Current version is broken...
I think it's the other way around. This project is several years old and must have broken due to some kind of recent archive.org update.
From what I can tell at a glance, it seems like archive.org may have done some kind of restructuring to reduce the number of duplicate files on their servers. That would mean a single snapshot only gets you the files that changed most recently, but none of the ones that are identical to the last saved copy.
As a workaround, you can download the files from every snapshot and merge them afterwards with a tool of your choice (such as Windows's file explorer):
wayback_machine_downloader http://example.com --all-timestamps --concurrency 5
However, that process is slower and more error prone, so if anyone knows of a better method, that would be great.
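If you would rather script the merge step than do it by hand, here is a minimal Python sketch. It assumes the `--all-timestamps` run leaves one folder per snapshot, each named with a numeric timestamp such as 20140719225715, under the site's download folder. Check your own output first, since I have not verified that layout. The function name `merge_snapshots` and the paths in the usage note are made up for illustration.

```python
import shutil
from pathlib import Path


def merge_snapshots(snapshots_root, merged_dir):
    """Copy files from every timestamp folder into one tree; newer snapshots win."""
    root = Path(snapshots_root)
    dest = Path(merged_dir)
    # Assumption: each snapshot sits in a folder whose name is a numeric
    # timestamp of the same length, so sorting by name also sorts by time.
    snapshot_dirs = sorted(
        p for p in root.iterdir() if p.is_dir() and p.name.isdigit()
    )
    copied = 0
    for snapshot in snapshot_dirs:
        for src in snapshot.rglob("*"):
            if not src.is_file():
                continue
            target = dest / src.relative_to(snapshot)
            target.parent.mkdir(parents=True, exist_ok=True)
            # Folders are visited oldest first, so a later copy overwrites
            # an earlier copy of the same relative path.
            shutil.copy2(src, target)
            copied += 1
    return copied
```

For example, `merge_snapshots("websites/example.com", "websites/example.com-merged")` would walk the snapshot folders from oldest to newest and return the number of files copied. It does not handle a path that is a file in one snapshot and a folder in another, so treat it as a starting point.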
Pinging @hartator - It seems like the core functionality of the script might be broken right now.
Hi. CSS and images are not missing; they are also being downloaded, but many of them have mistakes in their names. That's one problem. The second problem is that index.html keeps looking for all the files on the site I'm trying to save, not on my computer. But that site has been down for a few years, so...
That's just how it is for me, and sorry for my bad English)
I'm getting the same error here, anyone know of an updated tool that works?
6 months and still no fix for this?
still the same issue