wayback-machine-downloader Implement Net::HTTP to resolve rate limiting

This is all based on https://github.com/hartator/wayback-machine-downloader/issues/267#issuecomment-1868090089 and @ee3e's work.

This resolves all rate-limiting issues without the need for any delays/sleeps.

I am not sure that the http.finish() line in get_raw_list_from_api is in the correct place, so any code review would be helpful.
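
On the placement of finish, here is a minimal sketch of one common pattern, assuming the connection is started manually rather than through the block form of Net::HTTP.start. The helper name with_connection is hypothetical and is not code from this PR; it only illustrates putting finish in an ensure clause so the session is closed even if a request raises.

```ruby
require "net/http"

# Hypothetical helper (illustrative only, not the PR's code): make sure the
# connection is open, hand it to the caller, and always close it afterwards.
def with_connection(http)
  http.start unless http.started?   # Net::HTTP#start opens the session
  yield http                        # caller issues all of its requests here
ensure
  # Net::HTTP#finish raises IOError if the session is not started,
  # so only call it on a connection that is actually open.
  http.finish if http.started?
end
```

With this shape, a function such as get_raw_list_from_api could issue its requests inside the block and would not need its own finish call.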

Regardless, I thought I'd submit this to try to resolve several of the issues that have come up lately.

Legitimately, all credit should go to @ee3e for their solution. This helped me download a ridiculously large backup without issue (452,831 files).

(Issues) Resolves #277, resolves #275, resolves #273, resolves #269, resolves #267

(Pull requests) Resolves #268, resolves #266, resolves #262 (at least according to comments)

Feb 08 '24 05:02 ShiftaDeband

Awesome, working fine! But I'm interested in why using Net::HTTP overcomes the rate limiting. Do you have any idea what the initial problem was?

Feb 14 '24 12:02 bitdruid

> Awesome, working fine! But I'm interested in why using Net::HTTP overcomes the rate limiting. Do you have any idea what the initial problem was?

Essentially, we're using a single persistent HTTP session to download the whole thing (both snapshots and pages) and keeping it open until the download is complete, rather than opening and closing several sessions, which the Wayback Machine doesn't like (even if you're using a legitimate browser!).
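
A minimal sketch of that idea, assuming Ruby's standard Net::HTTP library. The method names fetch_all and download_over_one_session are hypothetical and not taken from the PR; the point is only that every request goes through the same open connection instead of a new one per file.

```ruby
require "net/http"

# Hypothetical helper (not the PR's code): send every request over the
# connection that is passed in, so no new session is opened per path.
def fetch_all(http, paths)
  paths.map do |path|
    response = http.request(Net::HTTP::Get.new(path))
    [path, response.code, response.body]
  end
end

# Hypothetical entry point: the block form of Net::HTTP.start opens one
# session, keeps it alive for the whole block, and closes it when the block exits.
def download_over_one_session(host, paths)
  Net::HTTP.start(host, 443, use_ssl: true) do |http|
    fetch_all(http, paths)
  end
end
```

Calling download_over_one_session with the archive host and a list of snapshot paths would perform all requests on one connection; the host and paths are left to the caller here.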

Mar 01 '24 22:03 ShiftaDeband

Until this gets merged and released, can you provide instructions for a non-Ruby person to run this branch?

May 18 '24 14:05 greggles

> Until this gets merged and released, can you provide instructions for a non-Ruby person to run this branch?

There are instructions in #281

May 18 '24 14:05 greggles

> Until this gets merged and released, can you provide instructions for a non-Ruby person to run this branch?

Because this project has had no updates for the last three years, I've written a replacement in Python for my needs... this one seems dead.

May 18 '24 15:05 bitdruid

Finished a 3,000,000 snapshot download thanks to this. Much appreciated.

Aug 23 '24 14:08 tlorien
