Sharding on multiple servers

Open MrErikCodes opened this issue 5 years ago • 27 comments

Is it possible to shard the bot on multiple servers? As of now it is overlapping itself. Any code examples?

MrErikCodes avatar Jul 22 '19 13:07 MrErikCodes

Yeah, it is, though I don't know whether Kurasuta's docs cover it. The underlying message system (powered by veza) allows this: .connectTo({ host: '255.255.255.255', port: 9999 }) would establish a connection with 255.255.255.255:9999.

kyranet avatar Jul 22 '19 13:07 kyranet

Any code examples or a link to docs for the comment above? I can't seem to find it anywhere. I know that in discord.js (master) you can use shardList to give it an array of shards to spawn.

MrErikCodes avatar Jul 22 '19 17:07 MrErikCodes

Actually, it's quite tricky... the processes are more than capable of connecting to other servers, but I don't know about spawning them... I invoke you, @DevYukine

kyranet avatar Jul 22 '19 18:07 kyranet

This would actually be very useful, especially for large music bots.

PLASMAchicken avatar Aug 25 '19 13:08 PLASMAchicken

Will this be added? If so when? Any ETA?

MrErikCodes avatar Oct 14 '19 06:10 MrErikCodes

Yes, this will be added. No ETA at the moment.

kyranet avatar Oct 14 '19 09:10 kyranet

Nice!

owenselles avatar Oct 14 '19 11:10 owenselles

Any update here? I could really use it.

MrErikCodes avatar Nov 08 '19 08:11 MrErikCodes

> Any update here? Could really need it

It's an open source project, so you can make a PR if you want progress.

PLASMAchicken avatar Nov 08 '19 08:11 PLASMAchicken

supporting this idea!

Roki100 avatar Feb 27 '20 22:02 Roki100

Any update on this one?

anna-rmrf avatar May 10 '20 12:05 anna-rmrf

It's an open source project, so you can make a PR if you want progress.

PLASMAchicken avatar May 10 '20 12:05 PLASMAchicken

I need this :)

muraatydn avatar May 27 '20 00:05 muraatydn

> It's an OpenSource Project so you can make a PR if you want progress.

Why would I comment here if I was able to do it myself? Like, can you think?

anna-rmrf avatar May 28 '20 12:05 anna-rmrf

Hey, as already said, this is an open source project. I currently have no time to work on this feature and do not plan on doing so soon. If anyone wants it, they can open a PR.

P.S. If you aren't able to do it yourself you can always take this as a learning experience and build up some more knowledge ;)

DevYukine avatar May 30 '20 23:05 DevYukine

> Hey, as already said this is an Open Source Project, i currently have no time to work on this feature and do not plan on doing that soon, if anyone wants it they can open a PR.
>
> P.S. If you aren't able to do it yourself you can always take this as a learning experience and build up some more knowledge ;)

Hey there! I was thinking about giving it a try. I think I can make veza use TCP for communication, but I'm not sure how I'd do the spawning. Do you have any suggestions? Thanks.

zihadmahiuddin avatar Jul 28 '20 13:07 zihadmahiuddin

Possibly you need a master server and slaves, where each slave registers and gets assigned its shard IDs and how many shards it handles.

PLASMAchicken avatar Jul 28 '20 15:07 PLASMAchicken

Right, I thought about that too, but I'm not sure how to structure it. I was thinking ShardingManager could take an optional mode option, with three modes.

The first is a default, local mode that does the same thing Kurasuta does now.

The second is a gateway mode that takes care of things like how many shards there are, how many slaves there are, which slaves are running which shards, etc. There would be only one instance in this mode.

The third mode connects to the gateway instance to find out how many shards it needs to run, and then runs only that number of shards. In this mode, broadcastEval etc. would not eval in the shards of the current machine; instead it would send the eval to the gateway instance, which would forward it to all connected machines and return the values from all of them, and so on.

I'm not very confident in this approach, though. It could even have huge flaws. Let me know what you think. Also, should we keep the discussion here, or continue on Discord if you want?
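For illustration, here is a minimal TypeScript sketch of the registration part of this idea. Every name in it (ShardCoordinator, register, unassigned) is hypothetical and not part of Kurasuta or veza. It leaves out the network transport entirely and assumes shards are handed out in fixed-size contiguous blocks, which is only one possible policy.

```typescript
// Hypothetical coordinator ("gateway mode"); not an existing Kurasuta API.
interface Assignment {
  workerId: string;
  shardIds: number[];
}

class ShardCoordinator {
  private readonly totalShards: number;
  private readonly shardsPerWorker: number;
  private readonly assignments = new Map<string, number[]>();
  private nextShard = 0;

  constructor(totalShards: number, shardsPerWorker: number) {
    this.totalShards = totalShards;
    this.shardsPerWorker = shardsPerWorker;
  }

  // A worker registers and receives the next unassigned block of shard IDs.
  // Registering the same worker again returns its existing block.
  register(workerId: string): Assignment {
    const existing = this.assignments.get(workerId);
    if (existing !== undefined) {
      return { workerId, shardIds: existing };
    }
    const end = Math.min(this.nextShard + this.shardsPerWorker, this.totalShards);
    const shardIds: number[] = [];
    for (let id = this.nextShard; id < end; id++) {
      shardIds.push(id);
    }
    this.nextShard = end;
    this.assignments.set(workerId, shardIds);
    return { workerId, shardIds };
  }

  // Shard IDs that no worker has been assigned yet.
  unassigned(): number[] {
    const ids: number[] = [];
    for (let id = this.nextShard; id < this.totalShards; id++) {
      ids.push(id);
    }
    return ids;
  }
}
```

With 5 total shards and 2 shards per worker, the first worker to register would get shards 0 and 1, the second 2 and 3, and the third only shard 4. A real implementation would also have to handle workers disconnecting and their shards being reassigned, which this sketch does not attempt.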

zihadmahiuddin avatar Jul 30 '20 16:07 zihadmahiuddin

Any updates on this? This would be really cool.

arpanr avatar Jun 16 '21 15:06 arpanr

With the default discord.js sharding manager you can pass it an array of shard IDs that you want it to manage. This means that you can get it to manage a different chunk of a bot's shards on different machines. Maybe a similar solution could be implemented here?

(I saw that this had been mentioned in passing but I thought I'd state it explicitly)
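As a rough sketch of what "a different chunk of a bot's shards on different machines" could look like, the TypeScript helper below splits shard IDs into one contiguous list per machine. The function name shardListsFor is made up for this example, and the even, contiguous split is an assumption; any partition that covers every shard exactly once would do.

```typescript
// Hypothetical helper: split shard IDs 0..totalShards-1 into one
// contiguous list per machine, spreading any remainder over the first machines.
function shardListsFor(totalShards: number, machines: number): number[][] {
  if (!Number.isInteger(totalShards) || totalShards < 1) {
    throw new RangeError("totalShards must be a positive integer");
  }
  if (!Number.isInteger(machines) || machines < 1) {
    throw new RangeError("machines must be a positive integer");
  }
  const base = Math.floor(totalShards / machines);
  const extra = totalShards % machines;
  const lists: number[][] = [];
  let next = 0;
  for (let m = 0; m < machines; m++) {
    // The first `extra` machines take one additional shard each.
    const size = base + (m < extra ? 1 : 0);
    const list: number[] = [];
    for (let i = 0; i < size; i++) {
      list.push(next++);
    }
    lists.push(list);
  }
  return lists;
}
```

Each machine would then pass its own list as shardList to the discord.js ShardingManager, together with the same totalShards value on every machine, so that the lists never overlap.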

edazpotato avatar Nov 05 '21 08:11 edazpotato

> With the default discord.js sharding manager you can pass it an array of shard IDs that you want it to manage. This means that you can get it to manage a different chunk of a bot's shards on different machines. Maybe a similar solution could be implemented here?
>
> (I saw that this had been mentioned in passing but I thought I'd state it explicitly)

Yes, but you would need a "manager" of some sort that keeps track of which shards are currently connected, which are not, how many shards should be connected, etc. I have a kinda working "demo" here, but it's far from stable. Just a PoC of some sort right now.

zihadmahiuddin avatar Nov 05 '21 10:11 zihadmahiuddin

Any new progress on this? I've been looking around and there doesn't seem to be a user-friendly way to shard across multiple machines. I've set up my own TCP server that works relatively well but it's tedious. Some solution like this would be awesome! (: edit: this repo works quite well for anyone looking to do this

maxschnee-dev avatar Mar 01 '22 05:03 maxschnee-dev

@Rebble69 it's still the same as what I said in https://github.com/DevYukine/Kurasuta/issues/204#issuecomment-636397665: I personally do not have time to work on this, but if someone makes a PR I'm free to merge it, assuming it still works and follows my general code style.

DevYukine avatar Mar 13 '22 23:03 DevYukine

With permission from the maintainer, I'll be sharing this issue: https://github.com/discordjs/discord.js/issues/8084

It's basically a library-agnostic sharder system (needs to be in order to support bots that are made with discord.js's component libraries without the main library) powered by composable strategies for maximum control over every component of it.

Compared to Kurasuta (and the current discord.js sharder), it's a low-level, highly customizable, strategy-based sharder system. It may not be very easy to use, especially since it's not tightly integrated for spawning websocket shards, but rather focuses on process/worker sharding (leaving gateway sharding to @discordjs/ws or similar).

The linked RFC also features proxy managers with load-balancing, and we're looking for developers who have worked on similar things to provide us with some insight or advice 🙏🏼

kyranet avatar Jun 13 '22 09:06 kyranet

I could give you some detailed insight, since I have tweaked my packages a lot (mentioned by @Rebble69). From what I can say, it will not be easy.

Firstly, the broadcastEval/Eval part should be removed. It's quite bad practice. Users end up executing scripts, which can lead to a security breach. A message handler approach would be better: you send an op code with the message type stats, and the user can then call the stats function themselves in the message event.
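To make the op-code idea concrete, here is a small TypeScript sketch. The names (IpcMessage, OpRouter, on, dispatch) are invented for this example and are not taken from discord.js, Kurasuta, or any of the packages mentioned in this thread. The point it tries to show is that only handlers registered ahead of time can run, so a remote peer can ask for stats but cannot send arbitrary code.

```typescript
// Hypothetical message shape: an op code plus optional data.
interface IpcMessage {
  op: string;
  data?: unknown;
}

type OpHandler = (data: unknown) => unknown;

type DispatchResult =
  | { ok: true; result: unknown }
  | { ok: false; error: string };

class OpRouter {
  private readonly handlers = new Map<string, OpHandler>();

  // Register the function that answers a given op code.
  on(op: string, handler: OpHandler): this {
    this.handlers.set(op, handler);
    return this;
  }

  // Run the handler for a known op; unknown ops are rejected, never evaluated.
  dispatch(message: IpcMessage): DispatchResult {
    const handler = this.handlers.get(message.op);
    if (handler === undefined) {
      return { ok: false, error: `unknown op: ${message.op}` };
    }
    return { ok: true, result: handler(message.data) };
  }
}
```

A shard process would register something like on("stats", ...) once at startup and call dispatch for every incoming message. A message with an op such as "eval" would simply come back as an error, because no handler for it exists.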

Cross-communication should be the major part of the proxies. The easiest approach would be to create a master process called a bridge, which distributes the messages to the sharding managers and coordinates them. It would be like a p2p connection.

If the above is helpful, I can elaborate on these points and add some new ones.

meister03 avatar Jun 13 '22 14:06 meister03

(I didn't want to comment on the RFC Sharder issue with this, as it is not directly helpful)

I implemented d.js sharding across multiple machines, using a VPC so the websocket ports aren't exposed to the internet. I wrote a guide but it's slightly outdated and probably has some pitfalls that aren't desirable: https://jmtk.co/blog/24. I've been using this method for 8 months now without any major issue and I saved some money as a result

The new ShardProxy concept does look like it would be a great replacement going forward and I would look to use that assuming it fulfilled all my needs.

JMTK avatar Jun 13 '22 14:06 JMTK

Replying to each of you... first, @meister03:

> Firstly, the broadcastEval/Eval part should be removed. Its quite a bad practice doing it. Users end up executing scripts, which can end up in a security breach. The message handler approach would be better. You can send a op code with the message type stats and now the user themselves can call the stats function on the message event.

Yes, indeed. It's one of the first things I wanted to remove from discord.js's sharder, mostly due to security concerns, but also because it limits how we can format/structure the payloads.

> Cross communication should be of the major part of the Proxies. The easiest approach would be making a master process called bridge, which will then distrubute the messages to the sharding managers and coordinate them. It will be like a p2p connection.

I disagree; the major part of the proxies is to load-balance their own cluster of shards. Communication should still be done by dedicated systems, but needless to say, the sharder will feature a very complete and powerful system that can be used for any priority task. The less they try to do, the better.


And now to you, @JMTK:

> (I didnt want to comment on the RFC Sharder issue with this as it is not directly helpful)

Sure, I guess.

> I implemented d.js sharding across multiple machines, using a VPC so the websocket ports aren't exposed to the internet. I wrote a guide but it's slightly outdated and probably has some pitfalls that aren't desirable: jmtk.co/blog/24. I've been using this method for 8 months now without any major issue and I saved some money as a result
>
> The new ShardProxy concept does look like it would be a great replacement going forward and I would look to use that assuming it fulfilled all my needs.

The VPC bit is maybe covered under "There might also be a need to support SSH tunnels to bypass firewalls for greater security" which is at ShardManagerProxy's first paragraph. If there are other ways of making tunnels, I suppose we can look into it. DigitalOcean seems to have a lot of custom stuff too, and since I'm not a DO customer, I'm unaware of many of their technologies. I'm open to explore it in the future, although chances are that it will have to be done outside of the main package, I don't know, time will tell.

Similar to VPC, there's also VPN, which allows you to connect to a protected/unexposed network from the Internet in a secure way.


I have also edited the issue to add a few points to address some questions regarding the reliability and distributability of the network system, including a mention for fallback mirror managers for higher resilience to downtime.

kyranet avatar Jun 13 '22 15:06 kyranet