
Twitter API throttling requests (`429 Too Many Requests`)

Open ToniPortal opened this issue 2 years ago • 10 comments

Hello, when I use certain scraper functions, I get this error:

      status: 429,
      statusText: 'Too Many Requests',

Mostly I wanted to ask: do you know when it will stop returning this? Or will it be fixed?

ToniPortal avatar Jul 07 '23 05:07 ToniPortal

I haven't seen that error from this before, but I'm currently working on porting over a month's worth of commits (#9, #6), so once that's done this might be fixed.

karashiiro avatar Jul 07 '23 06:07 karashiiro

I just released v0.3.0 which ports over the most critical changes, let me know if that helps.

karashiiro avatar Jul 07 '23 20:07 karashiiro

Actually, I just noticed the repo tests are failing with 429, looks like a regular rate limit. I'll need to set up a backoff for this.
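For readers unfamiliar with the idea, here is a minimal sketch of an exponential backoff around a request that may return 429. It is not the code that shipped in any release of this library; `backoffDelayMs`, `withBackoff`, and the default delays are hypothetical names and values chosen for illustration.

```typescript
// Illustrative sketch only; these helpers are not part of twitter-scraper.

/** Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs. */
function backoffDelayMs(attempt: number, baseMs = 1_000, maxMs = 60_000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

/**
 * Calls `doRequest` and retries while the response status is 429,
 * giving up (and returning the last response) after `maxRetries` retries.
 */
async function withBackoff<T extends { status: number }>(
  doRequest: () => Promise<T>,
  maxRetries = 5,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    const res = await doRequest();
    if (res.status !== 429 || attempt >= maxRetries) return res;
    await sleep(backoffDelayMs(attempt));
  }
}
```

Capping both the delay and the number of retries keeps a persistent rate limit from turning into an unbounded retry loop.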

karashiiro avatar Jul 07 '23 22:07 karashiiro

Should be fixed with v0.3.1 (#12).

karashiiro avatar Jul 07 '23 23:07 karashiiro

Made a minor adjustment in v0.3.2 btw, v0.3.1's backoff can get stuck in a retry loop for a while sometimes, which the latest version fixes.

karashiiro avatar Jul 08 '23 15:07 karashiiro

Apparently that wasn't what was happening - the API actually just rate-limits the client for 14 minutes after a certain point. I updated the throttling mechanism to handle this, but I'm not sure if there's any real way to handle this beyond the delay.
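As a rough illustration of waiting out a fixed rate-limit window, the sketch below computes how long to sleep, assuming the 429 response carries an `x-rate-limit-reset` header holding the reset time in epoch seconds. Whether this endpoint sends that header is an assumption here, and `msUntilReset` and the 15-minute fallback are hypothetical choices, not this library's code.

```typescript
// Illustrative sketch only; not the library's throttling mechanism.

/**
 * Returns how many milliseconds to wait before retrying.
 * `resetHeader` is the raw value of an assumed `x-rate-limit-reset` header
 * (epoch seconds); if it is missing or unparsable, fall back to a full window.
 */
function msUntilReset(
  resetHeader: string | null,
  nowMs: number,
  fallbackMs = 15 * 60_000,
): number {
  if (resetHeader === null || resetHeader.trim() === "") return fallbackMs;
  const resetSec = Number(resetHeader);
  if (!Number.isFinite(resetSec)) return fallbackMs;
  // Never return a negative delay if the reset time has already passed.
  return Math.max(0, resetSec * 1000 - nowMs);
}
```

With a fetch-style response this could be called as `msUntilReset(res.headers.get("x-rate-limit-reset"), Date.now())`.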

karashiiro avatar Jul 08 '23 16:07 karashiiro

Oh thanks for the clarification! Really not cool to have put so many limitations on twitter, hope one day they change all these limitations they put in place...

ToniPortal avatar Jul 09 '23 09:07 ToniPortal

Is the restriction on the IP address or just on the tokens being used? If it's the token, can you regenerate the auth and carry on? Given that we approximately know the limits of each endpoint (50 requests per 15 minutes on tweets/replies), we could count the usage of a token and generate a new one before hitting the limit. I see the code already takes token validity over x time into account, so this would now be based on usage too.

Edit: Specifically in regards to guest token usage. I expect using a fully authed account would limit the consumer.
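The usage-counting idea could be sketched as follows. This is a hypothetical `TokenBudget` class, not code from the library, and it only helps if the limit really is tracked per token; the defaults reuse the 50 requests per 15 minutes figure quoted above.

```typescript
// Illustrative sketch only; TokenBudget is a hypothetical helper.

/** Counts requests made with one token inside a fixed time window. */
class TokenBudget {
  private readonly limit: number;
  private readonly windowMs: number;
  private windowStart: number;
  private used = 0;

  constructor(limit = 50, windowMs = 15 * 60_000, now = Date.now()) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.windowStart = now;
  }

  /**
   * Records one request. Returns true when the budget for the current
   * window is used up, i.e. the caller should rotate the token before
   * sending the next request.
   */
  record(now = Date.now()): boolean {
    if (now - this.windowStart >= this.windowMs) {
      // A new window has started, so the count starts over.
      this.windowStart = now;
      this.used = 0;
    }
    this.used++;
    return this.used >= this.limit;
  }

  /** Call after obtaining a fresh token. */
  reset(now = Date.now()): void {
    this.windowStart = now;
    this.used = 0;
  }
}
```

A caller would invoke `record()` after each request and, when it returns true, fetch a new guest token and call `reset()`.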

ImTheDeveloper avatar Aug 31 '23 08:08 ImTheDeveloper

I just gave that a try, and it doesn't work - as soon as you try to get a new guest token you get rate-limited.

diff for reference: diff.txt

karashiiro avatar Sep 11 '23 04:09 karashiiro

I added a request timeout, but when the client got rate-limited, it hung for a long time and ignored the timeout. This is how I added the timeout:

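// REQUEST_TIMEOUT is a timeout in milliseconds, defined elsewhere.
const scraper = new Scraper({
  transform: {
    // Attach a timeout signal to every outgoing fetch call.
    request(input: RequestInfo | URL, init: RequestInit = {}) {
      init.signal = AbortSignal.timeout(REQUEST_TIMEOUT);

      return [input, init];
    },
  },
});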
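One possible explanation, stated as an assumption rather than a confirmed diagnosis: a signal set on each fetch call bounds a single HTTP request, but not any waiting the library does between retries. A minimal sketch of bounding the whole operation instead uses `Promise.race`; `withDeadline` is a hypothetical helper, not part of the library.

```typescript
// Illustrative sketch only; withDeadline is a hypothetical helper.

/**
 * Rejects if `work` has not settled within `ms` milliseconds.
 * Note: this does not cancel `work`; it only stops the caller from waiting.
 */
function withDeadline<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([work, deadline]).finally(() => clearTimeout(timer));
}
```

Usage might look like `await withDeadline(scraper.getProfile(username), 30_000)`, where `username` is a placeholder. Any retry loop inside the library would keep running in the background after the deadline fires.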

I also noticed that there are two types of 429. One of them hangs for ~13-14 minutes and the other throws immediately.

  1. I think the first 429 is tied to the endpoint I'm hitting (getProfile). This one is retried.
  2. The other is for requesting a new guest token, and it fails immediately. I've noticed that creating a new instance of Scraper clears the rate limit on profile requests. So requesting a new guest token whenever we get a 429 should work around the rate limit, unless the guest-token request is itself rate-limited, in which case it won't help for 14 minutes.

Would it be possible to refresh the guest token when the request for profile (or probably any other request) gets rate-limited instead of waiting for 13-14 minutes?
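The workaround described above (a new instance on 429) could be sketched generically like this. It assumes the call rejects on a 429 instead of waiting internally, which may not match the library's current behavior; `callWithFreshClient` and its parameters are hypothetical, and how to detect a rate-limit error is left to a caller-supplied predicate.

```typescript
// Illustrative sketch only; callWithFreshClient is a hypothetical helper.

/**
 * Runs `call` against a client. If it fails with a rate-limit error,
 * builds a fresh client (and therefore a fresh guest token, under the
 * assumption above) and tries again, at most `maxFreshClients` times.
 */
async function callWithFreshClient<C, R>(
  createClient: () => C,
  call: (client: C) => Promise<R>,
  isRateLimited: (err: unknown) => boolean,
  maxFreshClients = 1,
): Promise<R> {
  let client = createClient();
  for (let replaced = 0; ; replaced++) {
    try {
      return await call(client);
    } catch (err) {
      // Give up on other errors, or once the replacement budget is spent.
      if (!isRateLimited(err) || replaced >= maxFreshClients) throw err;
      client = createClient();
    }
  }
}
```

If obtaining the guest token is itself rate-limited, the second attempt fails as well and the error is rethrown, so the caller still needs a fallback delay.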

ethos-vitalii avatar Sep 17 '24 00:09 ethos-vitalii