
[Twitter] Skip scanning to prevent frequent rate limit?

Open Keklul404 opened this issue 1 year ago • 6 comments

Hi, I'm trying to download all files from a Twitter account with a lot of entries (1146 files downloaded so far), and I frequently hit the rate limit. Sometimes gallery-dl doesn't even offer a way to continue downloading once the wait time has passed.

This forces me to start another instance of gallery-dl and scan the whole account again. So if I just want to start downloading at the 1147th file, I have to sit through 6 or 7 rate-limit waits even though nothing is being downloaded, because a file that is scanned (but skipped due to the archive setting in gallery-dl) still counts as a request to Twitter.
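To make the cost concrete, here is a rough Python model of the behaviour described above. It is only a sketch under stated assumptions, not gallery-dl's actual code: `fetch_page`, `walk`, and the page size of 20 are hypothetical, and the timeline is assumed to be listed newest first. The point is that the archive can only be consulted after a page has been requested, so already-downloaded entries still consume requests.

```python
PAGE_SIZE = 20  # hypothetical number of entries returned per API request


def fetch_page(timeline, offset):
    """Stand-in for one API request: returns one page of entries, newest first."""
    return timeline[offset:offset + PAGE_SIZE]


def walk(timeline, archive):
    """Walk the whole timeline and return (API calls made, files downloaded)."""
    api_calls = 0
    downloaded = 0
    offset = 0
    while True:
        page = fetch_page(timeline, offset)
        api_calls += 1  # the request is spent before the archive is checked
        if not page:
            break  # an empty page marks the end of the timeline
        for entry in page:
            if entry in archive:
                continue  # skipped: no download, but the request already happened
            archive.add(entry)
            downloaded += 1
        offset += PAGE_SIZE
    return api_calls, downloaded


timeline = list(range(6000))   # 6000 entries, index 0 being the newest
archive = set(range(1146))     # the newest 1146 are already downloaded
api_calls, downloaded = walk(timeline, archive)
print(api_calls, downloaded)
```

In this model the archive reduces the number of downloads but not the number of requests: the pages covering the first 1146 entries are fetched again on every run.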

This makes downloading the rest of a Twitter account with this many entries almost impossible, unless I spend hours rescanning the same files over and over again.

That's why I'd like to know whether there is a better way to skip all that. Shouldn't the archive system allow skipping the scan as well as the download?

Here are my settings:

{
    "extractor": {
        "twitter": {
            "username": "redacted",
            "password": "redacted",
            "archive": "- archive-twitter.sqlite3"
        }
    }
}

Adding "skip": "abort:20" doesn't help because it stops everything.
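A likely explanation, sketched in Python: as far as I understand it, "abort:N" ends the run once N files in a row have been skipped. The function below is a simplified model of that rule, not gallery-dl's implementation, and `run` and `abort_after` are hypothetical names. If the listing starts with the newest entries and those are exactly the ones already in the archive, the limit is reached before any new file comes up.

```python
def run(entries, archive, abort_after=None):
    """Return the entries that would be downloaded.

    abort_after models "abort:N": stop after N consecutive archive hits.
    """
    downloaded = []
    consecutive_skips = 0
    for entry in entries:
        if entry in archive:
            consecutive_skips += 1
            if abort_after is not None and consecutive_skips >= abort_after:
                break  # N skips in a row: give up on the rest of the listing
            continue
        consecutive_skips = 0  # a new file resets the counter
        downloaded.append(entry)
    return downloaded


entries = list(range(6000))    # newest first
archive = set(range(1146))     # the newest 1146 are already archived

print(len(run(entries, archive, abort_after=20)))  # stops within the archived part
print(len(run(entries, archive)))                  # no abort: reaches the new files
```

Under these assumptions the abort rule suits incremental updates, where the new files are at the top of the listing, rather than resuming a partial download whose missing files are the oldest ones.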

Keklul404 avatar Jan 29 '24 13:01 Keklul404

Can you use the --range option to skip over 1146 files and perhaps evade the rate limit that way?

patrikalienus avatar Jan 29 '24 19:01 patrikalienus

I tried, but it didn't seem to work. I added "skip": "true" and "range": "0-10", then "range": "10" as a test, but it's still impossible to skip any files; it always "scans" the same files over and over again. Right now I can't even get through the whole account anymore: I hit too many rate limits once the list passes 1000 entries.

I really don't know what to do to force a skip on those "scan" requests.

Keklul404 avatar Jan 29 '24 19:01 Keklul404

It'll always start with the latest file

a84r7a3rga76fg avatar Jan 29 '24 22:01 a84r7a3rga76fg

> It'll always start with the latest file

Unfortunately yes, which means that when you need to download 6000+ files, you literally never see the end of it.

I also tried setting a huge sleep (20 s after each request and each download) to make sure I never hit the rate limit and could finally download everything within 2 days. Still no luck: for some reason gallery-dl stops at some point without any error or rate-limit message, well before reaching even 200 files.
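For reference, a back-of-the-envelope estimate of that plan in Python. The figure of 300 API requests is an assumption (6000 files at a guessed 20 entries per request), and transfer time is ignored, so this is only a lower bound on the sleeping time.

```python
def estimated_hours(files, sleep_per_file, api_requests, sleep_per_request):
    """Total time spent sleeping, in hours (download time itself is not counted)."""
    total_seconds = files * sleep_per_file + api_requests * sleep_per_request
    return total_seconds / 3600


hours = estimated_hours(
    files=6000,            # size of the account, from the thread
    sleep_per_file=20,     # 20 s after each download
    api_requests=300,      # assumed: 6000 files / 20 entries per request
    sleep_per_request=20,  # 20 s after each request
)
print(f"{hours:.1f} hours")
```

With these numbers the sleeps alone add up to about a day and a half, which is in line with the 2-day budget.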

Keklul404 avatar Jan 30 '24 04:01 Keklul404

What version of gallery-dl are you using and what URL are you giving it?

Fukitsu avatar Jan 31 '24 01:01 Fukitsu

> What version of gallery-dl are you using and what URL are you giving it?

I'm using v1.26.6 and this is the URL

Keklul404 avatar Jan 31 '24 06:01 Keklul404