discord-api-docs
Random misleading Unknown Interaction errors
Description
I've seen this issue reported by many people but so far no one has been able to gather enough information to reliably explain what's going on. An example can be seen at https://github.com/discordjs/discord.js/issues/7005
In summary, every now and then at a seemingly random chance it's possible that a bot's reply to an interaction fails due to an Unknown Interaction when, in reality, the reply succeeded and was shown to the user (by reply I mean a regular reply, deferred reply or update). I know this because I've been investigating this issue on a bot I manage for around a week now and I asked some users who were impacted by this.
In the following screenshots I'm logging the time it took for me to reply, by subtracting the interaction's created_timestamp from the current timestamp, and then the time it took for the bot to receive the error, by subtracting the timestamp taken just before the request was submitted from the timestamp at which the error was received. You can see that the reply is sent pretty fast and in time for Discord to accept it; however, the error comes 5 seconds later, indicating some sort of issue on Discord's end.
And of course I could be faking those numbers but it would make no sense for me to do that so I'm gonna have to ask you to trust that.
I later asked the user impacted by this issue to see what the bot responded with, and they showed that the reply was indeed deferred, which means that that error was a false positive and everything worked fine on our end.
Steps to Reproduce
There are no steps to consistently reproduce this issue as it only happens randomly. What I can tell is that the error comes when the API takes too long to send the response back but actually acknowledges and processes it.
Expected Behavior
The reply is sent correctly (happening) and a success message is returned
Current Behavior
The reply is sent correctly but an "Unknown Interaction" error is thrown
Screenshots/Videos
Can only attach what I've shown above already
(Bot is thinking but in Portuguese)
Client and System Information
discord.js v14.6.0 on Node v18.11.0 running on Debian 11 (bullseye)
Can you provide a code snippet showing how these logs were generated? I'm curious as to whether it is all coming from one request or possibly retries. There are a number of interacting systems here, so additional information to help debug the issue would be beneficial.
afaik discord.js only retries to submit requests when getting a 429 response which I assume not to be the case here on freshly created interactions, so there are no retries being done here to my knowledge
This is the first line that gets executed in the entire event that isn't an if statement and nothing above it interacts with the API other than this line. Hope this helps
Keep in mind this issue can happen with other kinds of interaction replies, not only deferred messages (I tested with showing a modal). I only showed that snippet because it's the most basic one that should never generate that error
Hey @DV8FromTheWorld do you have any updates on this?
I have been getting this too, we check if it has been 3 seconds and it definitely has not at the time of request, but sometimes get this response.
Yeah, I have been getting this error as well. And I haven't changed my code since updating Discord.js to v14.6.0.
I have not looked deeper into this issue at this time. This is the first time I've heard of this issue. Before assuming it is a problem with Discord I would likely investigate the underlying library implementation.
For debugging purposes: Is there a way in your library (or tech stack) to track outbound network traffic? If there is, it would be useful for indicating whether the library is re-attempting a network call or if the initial network call is actually taking 5 seconds. From your code snippet that isn't possible to determine.
I’m not sure if there is but I can dig into the source code and add that myself. I do, however, doubt that is the case, as we’ve seen @ooliver1 say they are experiencing the same behavior and they’re using a python library, which is completely different from the one I’m using
It's pretty hard to reproduce confidently, since it's been random a lot of the time
I have found that letting it sit running for multiple hours after starting the bot allows it to not have that error until you turn the bot off and try to run it again without letting it sit.
@DV8FromTheWorld I believe there is not much more debugging I can do here. Due to this issue happening at a random chance and requiring a high volume of interactions it would be impossible to gather enough data to be able to tell exactly why it's happening. All I can tell is that, on discord.js, after calling deferReply() the request is sent to this method which I am not familiar with and I would probably need to spend a lot of time figuring out all the quirks with this class and the whole package itself. I would, however, like to emphasize that I've seen people face this issue long before discord.js had this rest package, and also other people on other languages and libraries claim to be facing the same so could you look into this? If needed I can start gathering timestamps of when this issue happens and send them to you if that helps, I just can't log anything from the internal parts of the library unfortunately
If it helps, I also get totally random, out of nowhere, “unknown interaction” errors in my bot logs [I run a bot using the discord.py library, so totally unrelated to the OP who uses discord.js] when sending a response to an interaction. In my case, it's just an immediate ephemeral response message [e.g. interaction.response.send_message(mymsghere, ephemeral=True)], rather than a deferred response followed by use of the followup webhook.
I’ve never bothered trying to work out why it happens, since the error traceback shows it's more likely to be a Discord issue rather than anything to do with anyone's library implementation [unless every single lib dev has implemented interactions wrongly for 2 years lol]. Also, it’ll happen once, then never again for several days, usually when I’m sleeping [i.e. overnight], so it's hardly something I can spend time debugging, since there’s no chance I’ll be able to find out why it's happening.
I'm facing some issues with showing a Modal in my bot. The same code works 99% of the time, but in some cases the interaction returns Unknown Interaction when trying to show the Modal. When I replace the Modal with a Reply to the interaction, it works every time. But as soon as I revert back to using the Modal it starts failing again. This happens in certain button interactions set through certain slash command data. The issue persists even if I repost that post. But if I try posting it again with same data, the error persists.
Have been getting random unknown interaction errors as well on deferReply() and showModal() rarely. Decided to check how much time each reply is taking (even though everything is deferred) using console.time() and console.timeEnd() and surprisingly that one day no errors occurred.
Would someone with this issue be willing to provide a complete, runnable code sample that reproduces this issue? It's very difficult to figure out if this is even a bug or not
@yonilerner like I've said above, simply set up an event listener that all it does is either reply or defer the interaction it receives. Let that sit for a couple hours with a good amount of interactions coming through and you should see the error. There's no reproducible code sample because it really is random
The problem here is that there isn't enough information here to actually debug anything. I recognize that people are occasionally receiving "Unknown Interaction", but that usually indicates a problem with the developer's code.
Personally, I would try capturing a variety of information:
- Capture network logs.
i) Ensure there are no retries
ii) Ensure the network request is actually being sent to discord, as opposed to being queued for # seconds due to some ratelimiting, and thus exceeding the timelimit
- I throw this bullet point in because in the screenshots I'm seeing multi-second delays before the error is received, which, to me, indicates a ratelimiter is holding things up. The fact that the error came from the "sequential requester" further makes me think that is at play
- Time the event was received
- Time the event was supposedly responded to
- The time the request was actually sent by the network requester
- The type of event response (deferred reply, etc)
- Information about the internal ratelimiter to see if anything was triggered
- Generally I'd turn on any debug-logging around the network layer / requester
Unfortunately, until we have better concrete information with a timeline of events in a failed interaction request there isn't a ton we can do here.
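As a rough illustration of the timeline being asked for, the record below keeps the four timestamps from the list and separates local queueing from time spent in flight. Everything here is a hypothetical sketch: `AckTimeline` and its field names are not part of any library, and the 3000 ms figure is the 3-second acknowledgement window mentioned earlier in this thread. Note that `event_received` is the bot's local receive time, so the deadline check is only an approximation of what Discord measures.

```python
from dataclasses import dataclass

ACK_DEADLINE_MS = 3000  # assumed 3-second window for the initial response


@dataclass
class AckTimeline:
    """One interaction response attempt. Times are Unix ms on the bot's clock."""

    event_received: float     # gateway event handed to the bot
    respond_called: float     # bot code called reply/defer
    request_sent: float       # HTTP request actually left the process
    response_received: float  # HTTP response (or error) came back
    status: int               # HTTP status code of the response

    def queued_ms(self) -> float:
        # Time spent inside the library before the request was sent.
        return self.request_sent - self.respond_called

    def in_flight_ms(self) -> float:
        # Network plus server-side time.
        return self.response_received - self.request_sent

    def diagnose(self) -> str:
        if self.status < 400:
            return "ok"
        if self.request_sent - self.event_received > ACK_DEADLINE_MS:
            return "sent too late (local delay or queueing)"
        return "sent in time, error returned anyway"


if __name__ == "__main__":
    # Made-up numbers shaped like the reports in this thread.
    attempt = AckTimeline(0, 10, 12, 4429, 404)
    print(attempt.queued_ms(), attempt.in_flight_ms(), attempt.diagnose())
```

If failed attempts consistently show a tiny `queued_ms` and a large `in_flight_ms`, a client-side ratelimiter is unlikely to be the cause; the opposite pattern would point at the library.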
Alright thank you, I will try to get that information for you. Unfortunately it might not be very easy since my bot is using a package and it's hard to get that info from the package itself on prod, but I'll look into it
For what it's worth, with the increasing number of times we've seen this, I decided to finally look into it a bit. In djs there shouldn't be anything getting in the way of the request firing, but I am implementing a separate request handler to handle specifically interaction callbacks. While in theory this won't change the external facing behavior of the request, it at least should streamline the process and make it a little easier to debug.
Been a few months here so I'm assuming the behavior isn't being seen anymore.
oh no it definitely is, every single day, multiple times, I just don't have the time nor patience to debug things to the level you guys asked for
Same here, has become a part of my life now.
Still happens, even had it earlier today lol. It's just something I'm used to seeing at random now, and haven't really bothered to care about since there's no immediate impact to my bot. That being said, the source of what causes this problem needs to be resolved so people aren't confused by random misleading errors.
I'm excited to find this issue! This has been very annoying the past few weeks. Some of my findings:
- Like ImRodry has said: it can be fine for hours at a time and then will happen a bunch in a row. It's hard to reproduce, but it's happening. Repeating the same command a second time usually works fine.
- I've been using Discord.js 14.7.1, and am now using 14.8, and can reproduce the error on my bot eventually with enough tries.
- I moved my bot to a Linode with a dedicated CPU and 4GB of ram, still got the error. Unless it's an issue with Linode (a primary VPS provider), it's not a hardware issue or a shared CPU issue.
- It doesn't matter if it's a slash command or a button. It just seems to pick an interaction response and fail to run it in time
- Out of frustration with this issue, all my interactions now deferReply as soon as possible, and it feels like it doesn't matter how quickly the deferReply happens: it will sometimes just take longer than 3 seconds to respond
- I added some logs to see how long each part of the code takes and it's all super fast until it will randomly hang on the deferReply. There are no promises running before the deferReply and you can see that it only takes a few ms to get to the deferReply
The architecture of my bot commands:
- interactionCreate recognizes interactions and decides what kind of interaction needs to be run
- commandRun runs slash commands
- d.botstats is a very simple command with no promises
Example 1, this works fine:
2023-03-14 15:55:25.697 [INFO] [interactionCreate] interactionCreate event started at 1678827325696
2023-03-14 15:55:25.698 [INFO] [interactionCreate] Decided to run slash command in 1ms
2023-03-14 15:55:25.699 [INFO] [commandRun] commandRun started at 1678827325698
2023-03-14 15:55:25.700 [INFO] [commandRun] Executed the command in 1ms
2023-03-14 15:55:25.700 [INFO] [d.botstats] Command started at 1678827325700
2023-03-14 15:55:25.702 [INFO] [d.botstats] Attempting to defer reply...
2023-03-14 15:55:25.918 [INFO] [d.botstats] Reply deferred in 218ms
Example 2, happened right after example 1. Note how it takes < 10 milliseconds to get to the point where it tries to defer reply, and then fails
2023-03-14 15:55:26.979 [INFO] [interactionCreate] interactionCreate event started at 1678827326973
2023-03-14 15:55:26.980 [INFO] [interactionCreate] Decided to run slash command in 6ms
2023-03-14 15:55:26.981 [INFO] [commandRun] commandRun started at 1678827326980
2023-03-14 15:55:26.981 [INFO] [commandRun] Executed the command in 1ms
2023-03-14 15:55:26.982 [INFO] [d.botstats] Command started at 1678827326981
2023-03-14 15:55:26.984 [INFO] [d.botstats] Attempting to defer reply...
2023-03-14 15:55:31.401 [INFO] [commandRun] ERROR: DiscordAPIError[10062]: Unknown interaction
at SequentialHandler.runRequest (/usr/src/app/node_modules/@discordjs/rest/src/lib/handlers/SequentialHandler.ts:498:11)
at runMicrotasks (
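To put a number on the hang in Example 2, the log timestamps can be subtracted directly. This is just arithmetic on the lines above; `gap_ms` is a hypothetical helper name, and the timestamps are copied from the log as-is.

```python
from datetime import datetime, timedelta

LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def gap_ms(earlier: str, later: str) -> int:
    """Whole milliseconds between two log timestamps."""
    delta = datetime.strptime(later, LOG_TIME_FORMAT) - datetime.strptime(
        earlier, LOG_TIME_FORMAT
    )
    return delta // timedelta(milliseconds=1)


first_log_line = "2023-03-14 15:55:26.979"
defer_attempt = "2023-03-14 15:55:26.984"
error_logged = "2023-03-14 15:55:31.401"

# Time from the first log line to the defer attempt.
print(gap_ms(first_log_line, defer_attempt))  # 5
# Time from the defer attempt to the Unknown interaction error.
print(gap_ms(defer_attempt, error_logged))  # 4417
```

So the bot's own code accounts for a few milliseconds, while roughly 4.4 seconds pass between calling deferReply and the error being logged.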
I would try upgrading discord.js, there may be some bugfixes in newer versions that resolve this issue
Thanks for the suggestion!
Discord.js 14.8, the latest version, was released on Sunday, two days ago. I hoped it would help, so I upgraded quickly, but it still happens a few dozen times daily. To clarify: when 14.8 was released I updated all my packages and this did not resolve the problem.
I have considered moving down to 13.14 but have yet to do that as it would be a lot more work, and I've not heard any guarantee that version doesn't have this issue. =/
I would move down to 13.14 if it were a sure shot because this error is highly annoying to users. Mod commands sometimes don't work on the first try, so it makes the entire bot seem unstable.
As others have mentioned, this issue is happening across multiple libraries. Both d.js 14.8 and 13.14 should handle this exactly the same. The only notable way to stall an interaction callback at the moment is to have hit the global ratelimit (which is technically an implementation issue that never got updated), and even if you did, that would clear in no more than 1 second. cc @yonilerner
For @LunaUrsa, there were no fixes made relating to this issue in 14.8, though it would've been ideal to land that PR I mentioned earlier for it. We ended up getting really conflicting responses from devs on how "ratelimiting" works on the /callback endpoint so it stalled the PR for a while. At this point I think we are finally ready to move forward with it, so it should land in the next release, but unless you are hitting the global ratelimit it shouldn't actually affect you.
In the interest of +1'ing this issue to highlight it is most definitely not library specific
I have encountered this in D.py, NAFF and interactions.py (rewrite and non). This is most assuredly an issue on Discord's side.
While I appreciate that it is an absolute nightmare to debug due to the infrequency and randomness of the error, it really shouldn't be brushed away as a library or network issue on the bot developer's side.
The only reason this has little outcry is because it's infrequent, and our users just retry the command after it "fails", but obviously that's terrible ux
My two cents to the conversation, which I have tried to provide through other means to no luck: it seems like the underlying issue might be (educated guess) Discord taking too long to process the interactions at times. I'm unsure what that might be due to, as I can't debug there any further. The reason I say this is because of logs I have from people using our library, like the following one (note these are logs from an old version of the lib; I haven't contacted the person for new ones, but I have been told it keeps happening, rarely, but happening):
T 2023-01-13 01:52:39,600 hikari.gateway.2: dispatching INTERACTION_CREATE with seq 16296
T 2023-01-13 01:52:39,999 hikari.rest: f640d5af92e411ed85428e896c5c2a03 POST https://discord.com/api/v10/interactions/1063274348967886899/aW50ZXJhY3Rpb246MTA2MzI3NDM0ODk2Nzg4Njg5OTpZNXZtVHgwY25NYTI2bzF2VzlFcm9VbGhHYUF5Z1MxT2xYOGR1Y0MzRGx6WW85clNuSmp1Um1kYU01SlBWbHpWMnFIaVB1WG56bmtSbTFBNjY4VEs5TlpPTVV0cVk5ZTVkbzI4TmhYR0VaMkxkNW1nT2M3ZlFiWjBYdnZucjlOVA/callback
User-Agent: DiscordBot (https://github.com/hikari-py/hikari, 2.0.0.dev115) Nekokatt AIOHTTP/3.8.1 CPython/3.10.9 Linux 64bit
{'type': <ResponseType.MESSAGE_CREATE: 4>, 'data': {'embeds': [{'title': 'Stopwatch Started!', 'color': 11814356, 'footer': {'text': 'Note: stopwatch will stop after 1 day.'}}], 'allowed_mentions': {'parse': []}}}
T 2023-01-13 01:52:43,719 hikari.rest: f640d5af92e411ed85428e896c5c2a03 404 Not Found in 3719.914702931419ms
Date: Fri, 13 Jan 2023 01:52:43 GMT
Content-Type: application/json
Transfer-Encoding: chunked
Connection: keep-alive
strict-transport-security: max-age=31536000; includeSubDomains; preload
Via: 1.1 google
Alt-Svc: h3=":443"; ma=86400, h3-29=":443"; ma=86400
CF-Cache-Status: DYNAMIC
Report-To: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=0DHran7ZDY%2Fmf%2F1yZ6lS6PRfwaBRa6jcF61CffN1Re7AS91smDBuc1b12qftsF5I691eJ91iABum2CdkepDgU00BAmjPiD8DJJt57yxDtasX3tEsfVzspF6KHGJV"}],"group":"cf-nel","max_age":604800}
NEL: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
X-Content-Type-Options: nosniff
Set-Cookie: __cfruid=e11b5bb3826dc2e0d77c6aff187aec94fad6c899-1673574763; path=/; domain=.discord.com; HttpOnly; Secure; SameSite=None
Server: cloudflare
CF-RAY: 788a7e8009bb0e44-AMS
Content-Encoding: gzip
{"message": "Unknown interaction", "code": 10062}
Important things to note about the logs:
- The "dispatching" log is right after we receive the interaction (can also be checked by the time in the interaction ID: 1063274348967886899 => 2023-01-13, 01:52:39)
- The 3719.914702931419ms response time is round-trip time. This includes from making the request (after evaluation of bucket ratelimits, which are skipped for interactions anyways, so a NOOP) to receiving the response. The code can be found here
- A CF-RAY is provided in the response headers that could allow for further debugging, but these logs are months old and the info might not be stored anymore. I could try to ask for newer logs if it is deemed necessary.
- This might also be due to random network delay, but I can't tell for sure unless the CF-RAY is looked at, as Cloudflare should have all that info available. The average response time for this bot before and after these logs is around 500-700ms
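The interaction ID check mentioned in the first note can be reproduced with the standard snowflake formula (the ID shifted right by 22 bits, plus the Discord epoch of 1420070400000 ms). The sketch below decodes the ID from the log and compares it with the request and response times; it assumes the log timestamps are UTC, which matches the Date header in the response. `snowflake_time` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

DISCORD_EPOCH_MS = 1420070400000  # 2015-01-01T00:00:00Z
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
ONE_MS = timedelta(milliseconds=1)


def snowflake_time(snowflake: int) -> datetime:
    """Creation time encoded in the top bits of a Discord snowflake."""
    unix_ms = (snowflake >> 22) + DISCORD_EPOCH_MS
    return UNIX_EPOCH + timedelta(milliseconds=unix_ms)


created = snowflake_time(1063274348967886899)

# Timestamps copied from the hikari.rest log lines (assumed to be UTC).
request_sent = datetime(2023, 1, 13, 1, 52, 39, 999000, tzinfo=timezone.utc)
response_received = datetime(2023, 1, 13, 1, 52, 43, 719000, tzinfo=timezone.utc)

sent_after_ms = (request_sent - created) // ONE_MS
answered_after_ms = (response_received - created) // ONE_MS

print(created.isoformat())   # 2023-01-13T01:52:39.476000+00:00
print(sent_after_ms)         # 523
print(answered_after_ms)     # 4243
```

In other words, the callback request left the bot about half a second after the interaction was created, well inside the 3-second window, and the 404 came back more than 4 seconds after creation.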