discord.js Roadmap: re-think the `Client#ready` event

Roadmap: re-think the `Client#ready` event

Open didinele opened this issue 1 year ago • 1 comments

As brought up internally as of recent, we should potentially re-evaluate how Client#ready works for TSR.

For the uneducated on Discord gateway intricacies, here's the connection flow:

Establish a WS connection to the gateway
Wait for Discord's HELLO event, which is also when we start sending heartbeats
Send an IDENTIFY payload of our own, which tells Discord who we are (we pass in our bot token and some shard information)
Discord replies with a READY payload, which includes some initial state like the current client user

In this step, Discord also tells us what guild ids our bot (our rather, the current shard, but I'm keeping things simple) needs to handle.

Subsequently, we get GUILD_CREATE payloads to be able to lazily populate cache(s)

Rather unintuitively, current mainlib only fires Client#ready once step 5 is done - or potentially after a timeout in the event some of our guilds aren't coming through (which makes us assume it's unavailable/in an outage).

I have 2 key issues with this:

Unlike just about any event at this point (the only other exception that comes to mind being userUpdate/guildMemberUpdate), we aren't in-line with Discord's behavior for no good reason.
Considering that with TSR all cache will be optional, there's good reason to re-think how we approach this to better fit all use cases - I'm rather unhappy with just telling someone that does not want guild cache to just "disable the ready timeout".

My proposed solution is as follows:

Make Client#ready reflect Discord's actual READY event
Think smarter in terms of our structure interface design. It's rather unclear how those will look right now, so I'm going to pseudo-code here to give an idea of what I roughly envision in terms of behavior:

client.on(Events.MessageCreate, (message) => {
  if (!message.inGuild()) {
    return null;
  }

  // unclear what the cache APIs will look like, but assuming some sort of abstract KV-like interface on the client
  let cached = await client.guilds.cache.get(message.guildId);
  // unclear what our current stance on partials are, but `cached` would be Guild | PartialGuild for the sake of this example
  if (!cached || cached.isPartial()) {
     cached = await client.guilds.fetch(message.guildId);
  }

  // from here on out, we run some processing that requires a few properties from Guild, such as name and icon
});

So, let's look at the different configuration cases and how this would behave in each one:

No guild cache, no guild partial:
- That if statement becomes completely redundant, more on this later
Guild cache is opted into, but the guild partial is disabled:
- This case is rather self explanatory.
- When we encounter the guild for the first time internally, we cache it as a PartialGuild with just an id and unavailable: false
- Once more, we see a bit of runtime redundancy with the if statement, !cached would only be true with cache disabled or if the user was sweeping their guild cache, but I'd say it's better than to encourage users to assert cached!, considering at any point a change in their caching config could make that assertion completely invalid, and they'd have no compile-time help in tracking that down.
- await client.guilds.fetch(message.guildId); just makes an HTTP call to GET /guilds/:id as expected
Guild cache is opted into, guild partial is enabled:
- Assuming that isPartial() in our code example yields true, this means we got this event before the guild came through in GUILD_CREATE. IME, and technically speaking, I actually doubt this case can really occur, but Discord never documents it as such, so I do strongly believe that it's correct to assume we could be getting other events as the GUILD_CREATEs come through
- In this case, during the initial READY payload, we construct & cache a PartialGuild as always
- cached.isPartial() is true, so we run await client.guilds.fetch(message.guildId); which just makes an HTTP call to GET /guilds/:id

Alternatively for the third case, albeit much more jank-sounding:

PartialGuild could have some sort of pending property that we set to true only if the guild is included in READY's guilds
PartialGuild#fetch picks up on this internally and and simply waits for the appropriate GUILD_CREATE event. The client will store a timestamp for when we got READY, so this will only be waiting for timeout - timestamp before throwing an error about the guild being unavailable - much like how GET /guilds/:id would've yielded an error, and subsequently set this.unavailable to true.

Yes, my example is heavily reliant on rather hypothetical API design that most certainly has drawbacks for various other reasons, what I'm trying to get across here is that our handling of the gateway connection flow (and to an extent our structure design around Guild) should hit those following core ideas:

Runtime safety should be easy to accomplish for the user - more specifically, they should be able to write event handling code with minimal redundancy that will just work no matter what cache/intent options they pass into their client.
Actually in-line with Discord regarding READY handling, no more GUILD_CREATE timeout madness.

Other concerns such as type-safety and handling of partial data are a bit out of scope for what I wanted to get across here, but how we solve those could heavily constrain how we approach this issue.

Mar 30 '23 15:03 didinele

Could the Client be a different type based on the intents being passed in, so the emitted events could "know" that it was a partial, and the emitted types are essentially inferred based on the intent. Not sure if that would be too convoluted or solves the exact problem this issue is facing

Mar 30 '23 15:03 JMTK

discord.js discord.js copied to clipboard

Roadmap: re-think the `Client#ready` event

discord.js
discord.js copied to clipboard