`auth.test` is rate-limited on certain burst events
Hello,
We found that the framework's use of auth.test causes errors in certain scenarios where a burst of events triggers the method's rate limiting:
> This method allows hundreds of requests per minute.
For example, we found this commonly happens when the app is subscribed to the member_joined_channel event and a burst of events comes in. We assume this is due to a workspace creating a new channel and then automatically inviting the whole workspace, or a large number of users, to the channel at once, which could easily mean thousands of events.
The full error is:
"AuthorizeError: Failed to authorize (error: SlackAPIConnectionError: Failed to call auth.test (cause: SlackAPIConnectionError: Failed to call auth.test (status: 429, body:
{\"ok\":false,\"error\":\"ratelimited\"})), query: {\"isEnterpriseInstall\":false,\"teamId\":\"T0XXXXXXXX\",\"userId\":\"U0XXXXXXXX\"})"
Do you have recommendations on how to handle this? TIA
Hi @StephenTangCook, thanks for sharing this. Indeed, this could happen even with Bolt frameworks... One viable solution is to have a short-lived cache for the API call, and it should not cause any issues. If you're using the Cloudflare KV-based installation store, this part needs to be enhanced with an opt-in configuration (say, authTestCacheEnabled: boolean -- default is false): https://github.com/slack-edge/slack-cloudflare-workers/blob/1.3.2/src/kv-installation-store.ts#L107 If you're up for working on it, I am happy to review it and ship a new version with it. Otherwise, I can make the change when I have time for it.
If an app needs to perform auth.test for those users' user tokens, the only possible way to mitigate this would be to keep an auth.test API response cache in a database.
Ah, I actually implemented this with expiration in the Java SDK five years ago... https://github.com/slackapi/java-slack-sdk/blob/v1.45.3/bolt/src/main/java/com/slack/api/bolt/middleware/builtin/MultiTeamsAuthorization.java#L275-L291
@seratch I see, that's a good idea. I might have time this week to take a crack at it. Thanks for the Java reference!
Let me make sure I have the gist right. In slack-cloudflare-workers, we would:
- Add new options for `authTestCacheEnabled` (default false) and `authTestCacheExpirationMillis` (default 10 min)
- Add a new KV namespace for the cache, `SLACK_AUTH_TEST_RESPONSE` (or should we re-use an existing one?), which will store (token -> `auth.test` response) with a KV `ttl` set to `authTestCacheExpirationMillis`
- If `authTestCacheEnabled` is true, then we check the cache first for the bot client auth test
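
To make the steps above concrete, here is a rough sketch of the cache-first lookup. It is only an illustration under a few assumptions: `KVLike`, `cachedAuthTest`, and `callAuthTest` are hypothetical names (not existing slack-cloudflare-workers APIs), the option names are the ones proposed above, and the KV TTL is passed as `expirationTtl` in seconds, which Cloudflare KV requires to be at least 60.

```typescript
// Minimal subset of the Cloudflare Workers KV API that this sketch relies on.
interface KVLike {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, options?: { expirationTtl?: number }): Promise<void>;
}

// Option names follow the proposal in this thread; they are not an existing API.
interface AuthTestCacheOptions {
  authTestCacheEnabled?: boolean; // proposed default: false
  authTestCacheExpirationMillis?: number; // proposed default: 10 minutes
}

type AuthTestResponse = { ok: boolean; [key: string]: unknown };

// Hypothetical helper: check the cache first, fall back to the real auth.test call.
// `callAuthTest` stands in for whatever actually performs the API request.
async function cachedAuthTest(
  token: string,
  kv: KVLike,
  callAuthTest: (token: string) => Promise<AuthTestResponse>,
  options: AuthTestCacheOptions = {},
): Promise<AuthTestResponse> {
  const enabled = options.authTestCacheEnabled ?? false;
  if (!enabled) {
    return callAuthTest(token);
  }
  const ttlMillis = options.authTestCacheExpirationMillis ?? 10 * 60 * 1000;

  // The plan keys the cache by token; hashing the token first may be preferable.
  const cached = await kv.get(token);
  if (cached !== null) {
    return JSON.parse(cached) as AuthTestResponse;
  }

  const response = await callAuthTest(token);
  if (response.ok) {
    // Only successful responses are cached; KV expects the TTL in seconds (minimum 60).
    const expirationTtl = Math.max(60, Math.ceil(ttlMillis / 1000));
    await kv.put(token, JSON.stringify(response), { expirationTtl });
  }
  return response;
}
```

With the flag off, every call still goes to auth.test, so existing behavior would be unchanged unless an app opts in.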
I'm less clear on slack-edge, though. I see that slack-edge also has some calls to auth.test (search results), such as in handleOAuthCallbackRequest and singleTeamAuthorize. Should these calls also go through a cache?
@StephenTangCook Thanks for the idea and when it comes to the CF KV store implementation, I totally agree with your plan. We can add a new KV SLACK_AUTH_TEST_RESPONSE and have an opt-in flag to enable it.
Regarding the non-OAuth app use cases, perhaps we could offer a simple interface (like a callback function) that lets developers pass in a cache data retrieval layer when initializing an App instance. If you're interested only in the KV implementation, I am happy to look into this side more deeply (when I have time for it).
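
As a rough illustration of that idea, the sketch below shows one shape such a pluggable cache layer might take. Everything here is hypothetical: `AuthTestCache`, `InMemoryAuthTestCache`, `HypotheticalAppOptions`, and `singleTeamAuthTest` are made-up names for discussion, not part of slack-edge today.

```typescript
type AuthTestResult = { ok: boolean; [key: string]: unknown };

// Hypothetical interface a developer could implement on top of any storage.
interface AuthTestCache {
  get(token: string): Promise<AuthTestResult | undefined>;
  set(token: string, result: AuthTestResult): Promise<void>;
}

// Simple in-memory implementation with expiration, for illustration only.
class InMemoryAuthTestCache implements AuthTestCache {
  private entries = new Map<string, { value: AuthTestResult; expiresAt: number }>();
  private ttlMillis: number;
  private now: () => number;

  constructor(ttlMillis: number = 10 * 60 * 1000, now: () => number = () => Date.now()) {
    this.ttlMillis = ttlMillis;
    this.now = now;
  }

  async get(token: string): Promise<AuthTestResult | undefined> {
    const entry = this.entries.get(token);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      // Drop expired entries so a fresh auth.test call is made.
      this.entries.delete(token);
      return undefined;
    }
    return entry.value;
  }

  async set(token: string, result: AuthTestResult): Promise<void> {
    this.entries.set(token, { value: result, expiresAt: this.now() + this.ttlMillis });
  }
}

// Hypothetical options object; a real App constructor may look different.
interface HypotheticalAppOptions {
  botToken: string;
  authTestCache?: AuthTestCache;
}

// Uses the cache when one is supplied, otherwise always calls auth.test.
async function singleTeamAuthTest(
  options: HypotheticalAppOptions,
  callAuthTest: (token: string) => Promise<AuthTestResult>,
): Promise<AuthTestResult> {
  const cache = options.authTestCache;
  const hit = cache ? await cache.get(options.botToken) : undefined;
  if (hit) return hit;
  const result = await callAuthTest(options.botToken);
  if (cache && result.ok) await cache.set(options.botToken, result);
  return result;
}
```

Keeping the cache behind an interface like this would let the KV-backed store and non-OAuth apps share the same lookup logic while choosing their own storage.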