discord.js icon indicating copy to clipboard operation
discord.js copied to clipboard

feat: added `@discordjs/sharder` package

Open kyranet opened this issue 3 years ago • 12 comments

Please describe the changes this PR makes and why it should be merged:

I heard you wanted a nice and complete sharder. :eyes:

This sharder draws inspiration from the strategy and plug-in system Sapphire has largely adopted over the months, the way Kurasuta handles messages and shards separately, and the structure of Discord.js's built-in sharding manager to make the update easy, alongside a few sprinkles of my own based on my experience when building a cross-process intercommunication system.

:warning: This PR is under heavy WIP: docs and tests are missing, they'll be done as I progress. The API may also change as new needs arise when building more code.

Let's talk about this sharder's structure! It is divided in three components:

  • The Sharding Manager: it's the central class that handles the shards, from starting them up, to closing.
  • The Shards: they're strategies that define how the "channel" should be handled from the sharding manager, they can be of any kind, from workers (WorkerShardHandler) and forks (ForkProcessShardHandler) to cross-server clusters going thru local clusters (ClusterProcessShardHandler). Yes, you heard that. Although we'll most likely ship only the 3 local ones. You can make your own custom shard strategies.
  • The Messengers: they're also strategies, but they define how data must be serialized and/or deserialized as well as handling message queues, we'll provide RawMessageHandler (data as-is, without converting to a Buffer or a string), JsonMessageHandler (JSON), and BinaryMessageHandler (V8 serder)... You can plug in your own ones, such as the Erlang External Term Format (ETF), YAML, TOML, etc.

This clear separation between responsibilities allow for greater customization of the core functionality of the library, reducing the need to use external libraries.

And coming in the future, we'll probably release The Message Mappers, which give a final touch-up to a message, such as appending a header/footer or even, using OpenSSL to encrypt messages over the network.

Issues to handle:

  • DONE #2882: I'll change respawn: boolean to respawn: number, accepting -1 and Infinity for unlimited respawns, but a number that is decreased by one on each shard until it's ready, in which case it'll reset the counter.
  • #3560: I had an eureka moment here, it'll require a small refactor in Discord.js's WebSocket system but... turns out we can query /gateway/bot in the sharder, only once, and pass the data to the shards during spawn. We could also implement a query system so we can get that data from the ShardingManager without querying Discord more than once a day by caching and updating the result accordingly.
  • #3890: See above.
  • #4095: Status unknown.
  • #5364: The new ready handler should be a lot cleaner and far more efficient, but we'll see.
  • #5581: Can it happen with this sharder? Let's find out.

Todo:

  • [x] Consolidate the sharder's API, make everything more consistent.
  • [x] Make ProcessShardHandler abstract, for the two classes below.
  • [x] Implement ForkProcessShardHandler.
  • [x] Implement CusterProcessShardHandler.
  • [ ] Implement WorkerShardHandler.
  • [x] Implement RawMessageHandler.
  • [x] Implement BinaryMessageHandler.
  • [x] Implement built-in cluster sharding (e.g. 2 processes with 5 shards each to have 10 shards) support.
  • [x] Implement fetchRecommendedShards.
  • [ ] Implement shard client to interact with the shard manager, ShardTunnel? ShardClient?
  • [ ] Type the events.
  • [ ] Add documentation.
  • [ ] Add tests.

Note: This is a port of https://github.com/discordjs/discord.js-modules/pull/87

Status and versioning classification:

  • I know how to update typings and have done so, or typings don't need updating
  • This PR changes the library's interface (methods or parameters added)

kyranet avatar Jan 08 '22 00:01 kyranet

🎉

xhyrom avatar Jan 08 '22 14:01 xhyrom

Is there a reason, why @discordjs/sharder module is using the cluster module instead of child-processes. The Cluster module has some layers built on top of child_processes, which can make it slower. Or has it been chosen, bc it allows using the same port?

meister03 avatar Jan 09 '22 11:01 meister03

I fail to see where you see cluster being the only/forced option to use, @meister03, it's simply a strategy.

If somebody wants to use cluster over child processes, they're free to do so.

kyranet avatar Jan 09 '22 11:01 kyranet

This looks promising as a replacement for old d.js sharder. One thing though, can we implement our op system in this sharder by tapping on ipc messages, and do the same send() reply() fashion like how the old veza does? eg: if we want to send messages our selves from main process to sub process? One reason why I would like a system like this, is because I don't want to use broadcastEval on getting data across my sub process.

Deivu avatar Jan 30 '22 15:01 Deivu

This looks promising as a replacement for old d.js sharder. One thing though, can we implement our op system in this sharder by tapping on ipc messages, and do the same send() reply() fashion like how the old veza does? eg: if we want to send messages our selves from main process to sub process? One reason why I would like a system like this, is because I don't want to use broadcastEval on getting data across my sub process.

You can reply to messages, thats been done with the track and waitforId functions(MessageHandler), when I recognized right.

meister03 avatar Jan 30 '22 21:01 meister03

I think there are a few features from Discordeno that we could learn from and would be a big improvement for discord.js and big bots that use it.

Actually, updates are painful for bots that handle a large amount of shards.

https://github.com/discordeno/discordeno#gateway image

DraftProducts avatar Feb 09 '22 15:02 DraftProducts

A lot of those would go in the @discordjs/ws package, not in sharder, and are actually planned to a degree (for instance you can host the ws in a different process or machine, and in the future, you can make d.js pull from that process using NATS or whatever floats your boat)! I'd also like to clarify that technically there is "downtime" when resharding, as the connections have to close and reopen with an identify payload being sent with the new shard configuration but 🤫

vladfrangu avatar Feb 09 '22 15:02 vladfrangu

I would love to take inspiration of discordeno, if it were any good that is.

Most of that text is just random fluff with no actual improvement or impact, or a complete lack and misunderstanding of how sharding works to make it look more appealing.

iCrawl avatar Feb 09 '22 18:02 iCrawl

Sure @iCrawl, I agree with you. Sadly, like any promotional text, it can happen that the reality is embellished (#LinkedIn), but behind these beautiful words, there is necessarily something concrete and/or a concept that could improve Discord.js. I'm sure that downtime can be optimized. If we use the current ShardingManager spawn system with a large amount of shards, an update of is painful. I am convinced that we can find a way to launch the shards in a more optimized way, if only with the max_concurrency implemented by Discord : The way that I currently use, is a custom class in the ShardingManager that communicate through an external Pub-sub with each Shard to resolve a Promise that block the IDENTIFY request of WebSocketShard class to start each shard with the minimal downtime without a rate limit (thx to @Koyamie for his help for that stuff).

DraftProducts avatar Feb 09 '22 20:02 DraftProducts

A lot of those would go in the @discordjs/ws package, not in sharder, and are actually planned to a degree (for instance you can host the ws in a different process or machine, and in the future, you can make d.js pull from that process using NATS or whatever floats your boat)! I'd also like to clarify that technically there is "downtime" when resharding, as the connections have to close and reopen with an identify payload being sent with the new shard configuration but 🤫

Actually it is close to zero downtime as you only lose a few events, you have the new shard and the old shards stays connected until the new one is ready

Drxckzyz avatar Feb 13 '22 12:02 Drxckzyz

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment
Name Status Preview Updated
discord-js ⬜️ Ignored (Inspect) Jul 15, 2022 at 9:33PM (UTC)

vercel[bot] avatar Jul 15 '22 21:07 vercel[bot]

any updates on this?

phamleduy04 avatar Sep 21 '22 19:09 phamleduy04

Superseded by #8859, the new PR is up-to-date and implements the RFC. Also has a much simpler API.

kyranet avatar Nov 24 '22 15:11 kyranet