
`auth.test` is rate-limited on certain burst events

Open StephenTangCook opened this issue 9 months ago • 4 comments

Hello, we found that the framework's usage of auth.test causes errors in certain scenarios where a burst of events triggers the method's rate limiting:

This method allows hundreds of requests per minute.

For example, we commonly see this happen when subscribed to the member_joined_channel event and a burst of events comes in. We assume this is due to a workspace creating a new channel and then automatically inviting the whole workspace, or a large number of users, to the channel at once, which could easily mean thousands of events.

The full error is:

"AuthorizeError: Failed to authorize (error: SlackAPIConnectionError: Failed to call auth.test (cause: SlackAPIConnectionError: Failed to call auth.test (status: 429, body: {\"ok\":false,\"error\":\"ratelimited\"})), query: {\"isEnterpriseInstall\":false,\"teamId\":\"T0XXXXXXXX\",\"userId\":\"U0XXXXXXXX\"})"

Do you have recommendations on how to handle this? TIA

StephenTangCook avatar Apr 15 '25 16:04 StephenTangCook

Hi @StephenTangCook, thanks for sharing this. Indeed, this can happen with the Bolt frameworks too. One viable solution is a short-lived cache for the auth.test call, which should not cause any issues. If you're using the Cloudflare KV-based installation store, this part needs to be enhanced with an opt-in configuration (say, authTestCacheEnabled: boolean, defaulting to false): https://github.com/slack-edge/slack-cloudflare-workers/blob/1.3.2/src/kv-installation-store.ts#L107 If you're willing to work on it, I'm happy to review it and ship a new version with it. Otherwise, I can make the change when I have time.

If an app needs to perform auth.test for those users' user tokens, the only way to mitigate this would be to cache auth.test API responses in a database.

seratch avatar Apr 15 '25 21:04 seratch

Ah, I actually implemented this with expiration in the Java SDK five years ago... https://github.com/slackapi/java-slack-sdk/blob/v1.45.3/bolt/src/main/java/com/slack/api/bolt/middleware/builtin/MultiTeamsAuthorization.java#L275-L291

seratch avatar Apr 15 '25 21:04 seratch

@seratch I see, that's a good idea. I might have time this week to take a crack at it. Thanks for the Java reference!

Let me make sure I have the gist right. In slack-cloudflare-workers, we would:

  1. Add new options for authTestCacheEnabled (default false) and authTestCacheExpirationMillis (default 10 min)
  2. Add a new KV namespace for the cache, SLACK_AUTH_TEST_RESPONSE (or should we reuse an existing one?), which will store (token -> auth.test response) with the KV TTL set to authTestCacheExpirationMillis
  3. If authTestCacheEnabled is true, then we check the cache first for the bot client auth test
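
To make the three steps concrete, here is a rough TypeScript sketch of the lookup. It is not the actual slack-cloudflare-workers code: `cachedAuthTest`, `KVLike`, and the `callAuthTest` callback are hypothetical names, and `AuthTestResponse` is trimmed to a few illustrative fields. The only KV behavior assumed is `get`/`put` with an `expirationTtl` given in seconds (Workers KV documents a 60-second minimum).

```typescript
// Minimal subset of the Workers KV API this sketch relies on (assumed shape).
interface KVLike {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, options?: { expirationTtl?: number }): Promise<void>;
}

// Illustrative only; a real auth.test response carries more fields.
interface AuthTestResponse {
  ok: boolean;
  team_id?: string;
  user_id?: string;
  bot_id?: string;
}

// Option names taken from the plan above.
interface AuthTestCacheOptions {
  authTestCacheEnabled?: boolean; // default: false
  authTestCacheExpirationMillis?: number; // default: 10 minutes
}

// Hypothetical helper: check the cache first, fall back to the real API call.
async function cachedAuthTest(
  kv: KVLike,
  token: string,
  callAuthTest: (token: string) => Promise<AuthTestResponse>,
  options: AuthTestCacheOptions = {},
): Promise<AuthTestResponse> {
  // Opt-in: with the flag off, behave exactly as before.
  if (!(options.authTestCacheEnabled ?? false)) {
    return callAuthTest(token);
  }

  const cached = await kv.get(token);
  if (cached !== null) {
    return JSON.parse(cached) as AuthTestResponse;
  }

  const response = await callAuthTest(token);
  if (response.ok) {
    const millis = options.authTestCacheExpirationMillis ?? 10 * 60 * 1000;
    // KV TTLs are expressed in seconds; clamp to the 60-second minimum.
    const expirationTtl = Math.max(60, Math.ceil(millis / 1000));
    await kv.put(token, JSON.stringify(response), { expirationTtl });
  }
  return response;
}
```

Only successful responses are stored, so a revoked or invalid token is re-checked on every call. The sketch uses the raw token as the key to mirror the (token -> auth.test response) mapping in step 2; a real implementation might prefer to hash the token first.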

I'm not clear on slack-edge, though. I see that slack-edge also has some calls to auth.test (search results), such as in handleOAuthCallbackRequest and singleTeamAuthorize. Should these calls also go through the cache?

StephenTangCook avatar Apr 17 '25 15:04 StephenTangCook

@StephenTangCook Thanks for the idea. When it comes to the CF KV store implementation, I totally agree with your plan. We can add a new KV namespace, SLACK_AUTH_TEST_RESPONSE, and an opt-in flag to enable it.

Regarding the non-OAuth app use cases, perhaps we could offer a simple interface (like a callback function) that lets developers pass in a cache data retrieval layer when initializing an App instance. If you're interested only in the KV implementation, I'm happy to look into this side more deeply (when I have time for it).
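
As a rough illustration of what such an interface could look like (nothing here exists in slack-edge today: `AuthTestCache`, `InMemoryAuthTestCache`, and `withAuthTestCache` are all hypothetical names, and the response type is trimmed down):

```typescript
// Illustrative only; a real auth.test response carries more fields.
interface AuthTestResponse {
  ok: boolean;
  team_id?: string;
  user_id?: string;
}

// Hypothetical pluggable cache layer a developer could pass when creating an App.
interface AuthTestCache {
  get(token: string): Promise<AuthTestResponse | undefined>;
  set(token: string, response: AuthTestResponse): Promise<void>;
}

// One possible implementation: an in-memory map with per-entry expiration.
class InMemoryAuthTestCache implements AuthTestCache {
  private readonly entries = new Map<string, { response: AuthTestResponse; expiresAt: number }>();
  private readonly ttlMillis: number;
  private readonly now: () => number;

  constructor(ttlMillis: number, now: () => number = Date.now) {
    this.ttlMillis = ttlMillis;
    this.now = now;
  }

  async get(token: string): Promise<AuthTestResponse | undefined> {
    const entry = this.entries.get(token);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      // Expired: drop the entry so the caller falls through to the API.
      this.entries.delete(token);
      return undefined;
    }
    return entry.response;
  }

  async set(token: string, response: AuthTestResponse): Promise<void> {
    this.entries.set(token, { response, expiresAt: this.now() + this.ttlMillis });
  }
}

// Hypothetical wrapper: consult the cache before calling auth.test.
function withAuthTestCache(
  callAuthTest: (token: string) => Promise<AuthTestResponse>,
  cache: AuthTestCache,
): (token: string) => Promise<AuthTestResponse> {
  return async (token) => {
    const cached = await cache.get(token);
    if (cached !== undefined) return cached;
    const response = await callAuthTest(token);
    if (response.ok) await cache.set(token, response);
    return response;
  };
}
```

The same `AuthTestCache` shape could be backed by KV or a database instead. An in-memory map only lives as long as the process or isolate that holds it, so it reduces calls during a burst but is not shared across instances.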

seratch avatar Apr 22 '25 03:04 seratch