wayback-machine-downloader

Error 503

Open billmrmi opened this issue 1 year ago • 10 comments

When running the tool I get a 503 error during the snapshot phase, raised from open-uri.rb. I have seen in previous forum threads that this was a known issue and that the advice was to wait. I have been trying for 3 days. Thanks.

Here is the info:

Getting snapshot pages
C:/Ruby32-x64/lib/ruby/3.2.0/open-uri.rb:369:in `open_http': 503 Service Temporarily Unavailable (OpenURI::HTTPError)
    from C:/Ruby32-x64/lib/ruby/3.2.0/open-uri.rb:760:in `buffer_open'
    from C:/Ruby32-x64/lib/ruby/3.2.0/open-uri.rb:214:in `block in open_loop'
    from C:/Ruby32-x64/lib/ruby/3.2.0/open-uri.rb:212:in `catch'
    from C:/Ruby32-x64/lib/ruby/3.2.0/open-uri.rb:212:in `open_loop'
    from C:/Ruby32-x64/lib/ruby/3.2.0/open-uri.rb:153:in `open_uri'
    from C:/Ruby32-x64/lib/ruby/3.2.0/open-uri.rb:740:in `open'
    from C:/Ruby32-x64/lib/ruby/gems/3.2.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader/archive_api.rb:13:in `get_raw_list_from_api'
    from C:/Ruby32-x64/lib/ruby/gems/3.2.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:88:in `get_all_snapshots_to_consider'
    from C:/Ruby32-x64/lib/ruby/gems/3.2.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:131:in `get_file_list_all_timestamps'
    from C:/Ruby32-x64/lib/ruby/gems/3.2.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:158:in `get_file_list_by_timestamp'
    from C:/Ruby32-x64/lib/ruby/gems/3.2.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:309:in `file_list_by_timestamp'
    from C:/Ruby32-x64/lib/ruby/gems/3.2.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:192:in `download_files'
    from C:/Ruby32-x64/lib/ruby/gems/3.2.0/gems/wayback_machine_downloader-2.3.1/bin/wayback_machine_downloader:72:in `<top (required)>'
    from C:/Ruby32-x64/bin/wayback_machine_downloader:32:in `load'
    from C:/Ruby32-x64/bin/wayback_machine_downloader:32:in `<main>'

billmrmi avatar Nov 18 '23 18:11 billmrmi

I am getting the same now

danest avatar Nov 18 '23 21:11 danest

archive.org recently implemented a rate-limiting feature that blocks connections from clients that make too many requests in a short timeframe. Try the fixes implemented in #268 or #266 and see if they work for you.
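
For anyone curious what "slowing down" can look like in code, here is a minimal sketch of retrying after a 503 with a growing pause. It is not the code from #268 or #266; the helper name `with_retry_on_503` and its parameters are hypothetical, and the 4-second base delay is only an assumption chosen for illustration.

```ruby
require "open-uri"

# Hypothetical helper, NOT part of wayback_machine_downloader:
# runs the block and, if the server answers 503, waits and tries again.
def with_retry_on_503(max_attempts: 5, base_delay: 4, sleeper: Kernel.method(:sleep))
  attempts = 0
  begin
    attempts += 1
    yield
  rescue OpenURI::HTTPError => e
    # e.io.status is a [code, message] pair, e.g. ["503", "Service Temporarily Unavailable"]
    raise unless e.io.status.first == "503" && attempts < max_attempts
    sleeper.call(base_delay * attempts) # linear backoff: 4s, 8s, 12s, ...
    retry
  end
end

# Usage (placeholder URL, assumed for illustration only):
#   body = with_retry_on_503 { URI.open("https://example.com/").read }
```

Any error other than a 503, or a 503 on the last allowed attempt, is re-raised unchanged, so the original stack trace is still visible.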

sww1235 avatar Nov 20 '23 03:11 sww1235

Thanks, I was facing a similar issue, but the patch in #268 allowed me to download the whole website without problems. I suggest the patch should be included in the code asap.

cyberpunkrocker-zero avatar Nov 20 '23 19:11 cyberpunkrocker-zero

@sww1235 - thanks for the fix. Looks promising! Would love to test it, but I don't know how to install it (on Ubuntu 22.04).

This won't do the trick:

gem 'wayback_machine_downloader', git: 'git://github.com/sww1235/wayback-machine-downloader.git, branch: 'configurable_delay'

Any advice?

Tiptop4792 avatar Nov 21 '23 10:11 Tiptop4792

@Tiptop4792 Download the source code of the branch you need (https://github.com/sww1235/wayback-machine-downloader/tree/configurable_delay). Then, from the source directory, run:

gem build wayback_machine_downloader.gemspec
gem install [whatever the name of the resulting file].gem

ingvarr777 avatar Nov 21 '23 11:11 ingvarr777

Thanks so much, @ingvarr777!

I had to use gem build, instead of build. Maybe you just missed it. Anyhow, really cool. Thanks!

Do you know how to download the source code via git? I had to download the source directory as a zip file, since

git clone https://github.com/sww1235/wayback-machine-downloader/tree/configurable_delay

didn't work.

Also, @sww1235, your fix works so far. Really cool! One minor issue: the -n option didn't default to 4 seconds as stated in the manual, but instead raised an error when run with '-n' only:

wayback_machine_downloader "example.com" -s -n
/var/lib/gems/3.0.0/gems/wayback_machine_downloader-2.3.1/bin/wayback_machine_downloader:68:in `<top (required)>': missing argument: -n (OptionParser::MissingArgument)
    from /usr/local/bin/wayback_machine_downloader:25:in `load'
    from /usr/local/bin/wayback_machine_downloader:25:in `<main>'

Tiptop4792 avatar Nov 21 '23 12:11 Tiptop4792

@Tiptop4792 You're welcome. Thanks for the correction; I edited my previous post in case someone else needs it. To answer your question:

git clone -b configurable_delay https://github.com/sww1235/wayback-machine-downloader.git

And 4 sec is the default if you don't pass -n. Like this:

wayback_machine_downloader "example.com" -s

You only need -n if you want something other than 4 sec.
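
To illustrate why a bare -n fails while leaving it out gives 4 seconds, here is a minimal sketch using Ruby's standard OptionParser. It is not the branch's actual source; the `parse_delay` function and the option descriptions are hypothetical. The point is that an option declared with a mandatory argument raises OptionParser::MissingArgument when the value is missing, and the default is only used when the option is absent.

```ruby
require "optparse"

# Hypothetical parsing code, NOT copied from the configurable_delay branch.
def parse_delay(argv)
  options = { delay: 4 } # default is used only when -n is not given at all
  OptionParser.new do |opts|
    # "-n SECONDS" declares a mandatory argument, so a bare "-n" is an error
    opts.on("-n SECONDS", Integer, "Seconds to wait between requests") do |seconds|
      options[:delay] = seconds
    end
    opts.on("-s", "Download all snapshots") do
      options[:all_timestamps] = true
    end
  end.parse(argv) # parse (not parse!) leaves argv untouched
  options
end

# parse_delay(["example.com", "-s"])            uses the default delay of 4
# parse_delay(["example.com", "-s", "-n", "10"]) sets the delay to 10
# parse_delay(["example.com", "-s", "-n"])       raises OptionParser::MissingArgument
```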

ingvarr777 avatar Nov 21 '23 12:11 ingvarr777

Glad my fix worked for you. Now if we could get it merged and released...

sww1235 avatar Nov 21 '23 13:11 sww1235

> archive.org recently implemented a rate-limiting feature that blocks connections for clients that try to make too many requests in a short timeframe. Try the fixes implemented in #268 or #266 and see if that works for you.

Thank you! I had already started looking for other programs. I would also like to see proxy support in your program, with automatic proxy switching at set time intervals.

aloneuser avatar Nov 22 '23 18:11 aloneuser

If I run a list of commands like this:

wayback_machine_downloader site1.com/file
wayback_machine_downloader site2.com/file
wayback_machine_downloader site3.com/file

then wayback_machine_downloader fails with a 503 error.

aloneuser avatar Nov 28 '23 11:11 aloneuser