hikari icon indicating copy to clipboard operation
hikari copied to clipboard

Global ratelimit handling isn't respecting max_rate_limit

Open FasterSpeeding opened this issue 2 years ago • 4 comments

This line relies on a ManualRateLimiter instance which is completely un-aware of max_rate_limit for backing off.

https://github.com/hikari-py/hikari/blob/b33a133c0214970355a2a6afb956a45fd38bfc31/hikari/impl/rest.py#L903-L909

This line which pre-empts global rate-limits also relies on the same ManualRateLimiter instance which is unaware of max_rate_limit

https://github.com/hikari-py/hikari/blob/b33a133c0214970355a2a6afb956a45fd38bfc31/hikari/impl/rest.py#L730

For reference the global rate limit handler is initialised in the following code https://github.com/hikari-py/hikari/blob/b33a133c0214970355a2a6afb956a45fd38bfc31/hikari/impl/rest.py#L466-L469

Ideal Solution

These lines should both raise RateLimitedTooLong if max_rate_limit is violated by the global ratelimit.

FasterSpeeding avatar Oct 10 '21 10:10 FasterSpeeding

image Are these the same issues?

alvarlagerlof avatar Aug 30 '22 14:08 alvarlagerlof

Are these the same issues?

@alvarlagerlof Nope. This issue is for the REST client. The gateway have their own enforced limit of 120 requests per 60 seconds (Discord docs), but that limit should not be hit since they are heavily enforced:

https://github.com/hikari-py/hikari/blob/86f80b926a827ac49c152989ad1db710e764a2d0/hikari/impl/shard.py#L94-L95 https://github.com/hikari-py/hikari/blob/86f80b926a827ac49c152989ad1db710e764a2d0/hikari/impl/shard.py#L566-L575

Discord closing your connection with a 4008 is not something I have seen before. Mind opening an issue for that and proving some more information?

davfsa avatar Aug 30 '22 15:08 davfsa

Discord closing your connection with a 4008 is not something I have seen before. Mind opening an issue for that and proving some more information?

I could, but I don't really have more information to give I am afraid. I have no idea why this happend the there error does not show more than that.

alvarlagerlof avatar Sep 02 '22 21:09 alvarlagerlof

I could, but I don't really have more information to give I am afraid. I have no idea why this happend the there error does not show more than that.

Hikari takes the necessary measures to prevent that from happening, but it if happens again please file an issue with logs attached and any more info you can provide (bot size, shard count, any code that interacts with shards: joining voice, updating presence, requesting chunks) :)

davfsa avatar Sep 03 '22 11:09 davfsa

I am currently debating whether this would even be correct to implement, as this might lead to a whole lot of errors spam, rather than just the bot hanging for the duration of the global ratelimit.

I rather keep it how it is and just document it

davfsa avatar Jan 02 '23 13:01 davfsa

rather than just the bot hanging for the duration of the global ratelimit.

If the bot managed to hit the global ratelimit then won't queueing up a ton of requests to different bucket hashes then trying to exhaust every bucket the moment the global ratelimit ends have the potential to just make the problem worse.

FasterSpeeding avatar Jan 02 '23 14:01 FasterSpeeding

Also having it raise errors would be a better indicator of what the problem is (i.e. a bot's dev has messed up or needs to request a higher global ratelimit from Discord if they're that big a bot) than just silently doing nothing for ages before burst requesting would be.

FasterSpeeding avatar Jan 02 '23 14:01 FasterSpeeding

Also having it raise errors would be a better indicator of what the problem is (i.e. a bot's dev has messed up or needs to request a higher global ratelimit from Discord if they're that big a bot) than just silently doing nothing for ages before burst requesting would be.

The problem I see with this is that this would lead to a lot of requests to be lost, which for a big bot could lead to a lot of wrong state on Discord. I know we document RateLimitTooLongError and what this would mean, but you dont account for it ever happening.

Additionally, now that i think about it, a huge burst of requests might not be the best thing, because it will continuously hit the 50/s (initial) cap and could even lead to a ban from the API, so quite torn on it.

It has gotten a point of "what is the worse posion" and i have no clue which one to choose.

davfsa avatar Jan 02 '23 14:01 davfsa

I know we document RateLimitTooLongError and what this would mean, but you don't account for it ever happening.

You also don't usually account for processing and replies taking ages and finishing executing after the command call has gotten stale just because you got globally rate limited.

I assume the global IP ratelimit would also effect this and those are pretty common for ppl using cheap hosts which share IPs and tend to be the kind of ppl who need an explicit error.

FasterSpeeding avatar Jan 02 '23 14:01 FasterSpeeding

Fair enough. Should be easily implementable anyways

davfsa avatar Jan 02 '23 14:01 davfsa