meteor-user-status

Multiple server support

mizzao opened this issue 10 years ago • 24 comments

As Meteor begins to deploy in a distributed fashion, this package will need to work across multiple servers. At a minimum, we'll need to do the following:

  • move UserConnections into the database from an in-memory collection
  • Have servers track their connections so as not to step on each other
  • Remove connections from servers that have gone down (i.e. what we do in Meteor.startup right now after an HCR)
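A minimal sketch of what the first and third points might look like, in plain JavaScript (illustrative only, not the package's code; field names like `serverId` are assumptions):

```javascript
// Each connection record in the shared collection would carry the id of
// the server that owns it, so servers don't step on each other.
function makeConnection(connectionId, userId, serverId) {
  return { connectionId, userId, serverId, createdAt: new Date() };
}

// On startup (or when a server is known dead), remove only the records
// that server owns, mirroring the current Meteor.startup cleanup but
// scoped to one server instead of the whole collection.
function removeConnectionsForServer(connections, serverId) {
  return connections.filter((c) => c.serverId !== serverId);
}

const conns = [
  makeConnection('c1', 'alice', 'srv-A'),
  makeConnection('c2', 'bob', 'srv-B'),
  makeConnection('c3', 'alice', 'srv-A'),
];

// Suppose srv-A went down: only srv-B's connection should survive.
const afterCrash = removeConnectionsForServer(conns, 'srv-A');
```

In a real implementation the filter would be a Mongo `remove({serverId: ...})` against the shared collection rather than an in-memory filter.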

A more extensive discussion of this is in dburles/meteor-presence#7.

mizzao avatar Apr 08 '14 18:04 mizzao

To do this correctly we'll need to wait until Meteor has server up/down hooks, similar to the proposal in https://groups.google.com/forum/#!topic/meteor-core/No6W8layBDs.

This will also be rather complicated because we won't want multiple servers updating the Meteor.users collection. Perhaps only the server who observes a change in connection state should update the collection, but this will lead to all sorts of possible race conditions.

mizzao avatar Apr 30 '14 02:04 mizzao

We use something similar in our multiple-server environment, with our balancer having sticky clients so that the same server gets all requests from a given IP address... then the issues are far fewer, but sure, that is not a perfect scenario.

piffie avatar Jul 16 '14 15:07 piffie

@piffie: that sounds okay as long as each server is handling its clients using only local data. But I'm talking about being able to maintain all connected users' status across a cluster of servers. It'll probably have to wait until the Galaxy APIs are released and documented.

mizzao avatar Jul 16 '14 15:07 mizzao

@piffie: are you using user-status in that environment? I guess it would mostly work, except if the same user logged on from different IPs. (Only users would be available across servers though, not connections.)

mizzao avatar Jul 24 '14 16:07 mizzao

@mizzao I'm using user-status to monitor the status of each of my servers that are in a cluster. Basically, I have a load balancer on one server, and a number of app servers behind the load balancer (each on separate servers). I am able to distinguish between the servers by having each server list its connections by serverName (I created a process.env.serverName variable by setting SERVERNAME in bash). If you'd like, I'm happy to share.

Art1Sec8 avatar Jul 28 '14 13:07 Art1Sec8

Hey @Art1Sec8 - interesting application. How and where are you collecting the list of connections across all servers? I think what you're doing is pretty useful and sharing would be great!

mizzao avatar Jul 28 '14 14:07 mizzao

@mizzao - I am assuming that all of the servers are using the same database, so I allow all the servers to write to a collection called ServerStatus.

take a look at: https://github.com/Art1Sec8/meteor-user-status/blob/master/serverStatus.coffee

Basically, each server reports its own connections under its serverName. That way, you can see what connections a server has, or run a query to see all connections across the servers. Is this what you had in mind?

Art1Sec8 avatar Jul 29 '14 16:07 Art1Sec8

Interesting concept.

So... why keep all of the connections for each server in an array? That seems like it would be quite slow, especially without an index on connections.connectionId, which you should add. Other comments:

  • Using hostname for serverName would probably break things if multiple meteor servers were being run on the same server (to scale up on CPU cores).
  • This probably (almost surely) breaks the multiplexing of connections into Meteor.users.
  • You probably want to flush all the connections for a server when it goes down, not when it comes up. If it goes down and doesn't come up, there will be a lot of imaginary connections.

mizzao avatar Jul 29 '14 17:07 mizzao

The code I posted was being used in a different context than what you are proposing, so I think that's why it may not make as much sense. Since we created our own server farm, we needed to get the status of the app servers, and determine if there were connections to the servers (so we could perform upgrades etc.). We also needed to know some information about the health of the server, memory usage, etc. So, the format may not be the best for meteor-user-status as is, but I can work it around a little to help out.

So, to your questions!

The connections were in an array because we were including other information in each document, and each document represented a server. So, a single document would include all the information about a particular server. Of course, you're right, an index on connections.connectionId would certainly speed things up; thanks for the catch!

For meteor-user-status, perhaps we should just have the connections in the document; I can change that code.

Re hostname for serverName: If each instance of Meteor is assigned to a core, then they would likely be in different processes. So, how about:

  os = Npm.require('os')
  serverName = os.hostname() + '-' + process.pid

Re multiplexing: We use sticky sessions on the load balancer to try to get around this. But, yes, there are scenarios where it will break. I'm hoping to work through a solution with you.

Re flushing connections: I haven't found a good way to flush the connections when the server goes down unexpectedly. For example, when DigitalOcean dies, everything is gone. So, we try to flush the cache on a normal shutdown (which I forgot to include) and on server startup (in case there wasn't a graceful fail).

Art1Sec8 avatar Jul 30 '14 13:07 Art1Sec8

I am planning to improve the current user presence packages to work properly with multiple Meteor app instances. The first issue I need to resolve is tracking all the connections each user has to the app, and which instance of the app each connection belongs to, so that if an instance goes down or crashes, I can remove all orphan connections.

The initial idea is for each instance to auto-generate an InstanceID on startup and save it in an "instances" collection, with a "createdAt", plus some more useful info like app version, hostname, etc., and a "heartbeat".

We will create a timer to set all instances with an expired "heartbeat" to {active: false} and subsequently pull all connections $nin instances.find({active: true}).
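The heartbeat-expiry plan could be sketched roughly like this in plain JavaScript (illustrative only; names like `ttlMs` and the document shapes are assumptions, and the real version would run as Mongo updates rather than in-memory filters):

```javascript
// Mark instances whose heartbeat is older than the TTL as inactive.
function expireInstances(instances, now, ttlMs) {
  return instances.map((inst) => ({
    ...inst,
    active: now - inst.heartbeat <= ttlMs,
  }));
}

// In-memory mirror of pulling all connections whose instanceId is
// $nin the set of active instance ids.
function pruneOrphanConnections(connections, instances) {
  const activeIds = new Set(
    instances.filter((i) => i.active).map((i) => i._id)
  );
  return connections.filter((c) => activeIds.has(c.instanceId));
}

const now = Date.now();
const instances = expireInstances(
  [
    { _id: 'i1', heartbeat: now - 1000 },  // fresh heartbeat
    { _id: 'i2', heartbeat: now - 60000 }, // stale: instance presumed dead
  ],
  now,
  5000 // TTL of 5 seconds, an arbitrary illustrative value
);

const kept = pruneOrphanConnections(
  [
    { connectionId: 'c1', instanceId: 'i1' },
    { connectionId: 'c2', instanceId: 'i2' },
  ],
  instances
);
// only the connection on the live instance i1 remains
```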

Does anyone know of any existing packages that may help or be used for that? Or have any suggestions at all?

Cheers!

engelgabriel avatar Jan 19 '15 17:01 engelgabriel

You can just use the connection ID; that's a unique identifier of a client. Meteor currently only works with sticky servers (so that a single user is always routed to the same server for the complete connection lifetime), so there cannot be an issue where the same client is connected to more than one server. You could store the server hostname in the user profile and $nin all users on startup that have been connected to that server.

piffie avatar Jan 19 '15 19:01 piffie

I don't think this is 100% reliable... at least for our use case, a web chat app, the same user may be connected on desktop and on mobile at the same time. Or, like I said above, the user may be connected to different apps running on different servers that share the same DB, users, and status.

So I think each user may have multiple connection IDs on multiple servers, or am I missing something?

engelgabriel avatar Jan 19 '15 20:01 engelgabriel

Okay, so you mean logged-in users... Then make an array of active connections. An entry has the connection ID and the hostname, and when the array becomes empty, you set the user offline.
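That suggestion might look like the following plain-JS sketch (field names and the `status` handling are assumptions, not any package's actual behavior): each user keeps an array of {connectionId, host} entries and goes offline only when the array empties.

```javascript
// Record a new connection and mark the user online.
function addConnection(user, connectionId, host) {
  const connections = [...user.connections, { connectionId, host }];
  return { ...user, connections, status: 'online' };
}

// Drop a closed connection; the user stays online as long as any
// connection (on any host) remains.
function removeConnection(user, connectionId) {
  const connections = user.connections.filter(
    (c) => c.connectionId !== connectionId
  );
  return {
    ...user,
    connections,
    status: connections.length > 0 ? 'online' : 'offline',
  };
}

let user = { _id: 'u1', connections: [], status: 'offline' };
user = addConnection(user, 'c1', 'host-a');
user = addConnection(user, 'c2', 'host-b'); // same user via a second server
user = removeConnection(user, 'c1');        // still online: c2 remains
user = removeConnection(user, 'c2');        // now offline: array is empty
```

In Mongo terms, the add/remove would be `$push`/`$pull` on the user document, with the offline transition gated on the array being empty.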

piffie avatar Jan 19 '15 20:01 piffie

That's exactly what I am doing, but instead of the "hostname" I am keeping an InstanceID, as we have one Meteor instance per CPU on each host. So I need to know which instances are still alive, so I can pull from the array all the connections that belong to a crashed/non-responding server.

engelgabriel avatar Jan 19 '15 20:01 engelgabriel

This is just the package to track the servers: https://github.com/Konecty/meteor-multiple-instances-status. Next we will adapt the user status package to use this information.

engelgabriel avatar Jan 19 '15 20:01 engelgabriel

Hi all, with the following goals:

  • Supporting presence in projects with multiple instances.
  • Handle server crash/restart gracefully.
  • Allow users to set their preferred status: Online, Away, Busy, Offline.
  • Allow users to have the same presence across tabs, browsers, or devices.

We created 2 packages:

The first: if you have multiple instances of your app running (on the same server or not), we need to keep track of them all so it can handle orphan sessions. https://atmospherejs.com/konecty/multiple-instances-status

The second is the user-presence package itself: https://atmospherejs.com/konecty/user-presence

Check out the source code of the chat example at: https://github.com/Konecty/meteor-user-presence-example-chat

And finally, you can see it all working together at: http://user-presence-example.meteor.com/

We will be adding a full manual to the readme.md tomorrow, but initial comments are welcome.

engelgabriel avatar Jan 20 '15 17:01 engelgabriel

@engelgabriel I've been traveling lately, but it seems like you are on the right track.

The most tricky part going from single servers to multiple servers is the clearing of sessions associated with dead servers. It seems you've already gotten several suggestions for that, but there are definitely efficient vs. less efficient ways to do that (and I don't think you'll need to use heartbeats.)

Good luck with your packages!

mizzao avatar Jan 21 '15 18:01 mizzao

We are doing the heartbeats only at the "instance" level, so it is pretty harmless. I can't think of any other way of being 100% sure that an instance is still running or not. When an instance is considered dead by the others, they will elect an instance to pull the connections with that instance ID from all sessions at once.
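One way to sketch "elect an instance to pull the connections" is a deterministic election among the live instances, so every instance runs the same check but only one acts. This is an illustration in plain JS (not the konecty package's actual code):

```javascript
// Deterministic election: the live instance with the lowest _id wins.
// Every instance computes the same result from the shared instance list.
function electCleaner(instances) {
  const alive = instances
    .filter((i) => i.active)
    .map((i) => i._id)
    .sort();
  return alive.length > 0 ? alive[0] : null;
}

// Each instance asks: am I the one who should perform the cleanup?
function shouldClean(myId, instances) {
  return electCleaner(instances) === myId;
}

const instances = [
  { _id: 'i2', active: true },
  { _id: 'i1', active: true },
  { _id: 'i3', active: false }, // considered dead; its connections get pulled
];
// Only i1 (lowest live id) performs the cleanup; i2 stands down.
```

A deterministic rule like this avoids a separate coordination step, though as noted below it can still misbehave when instances disagree about who is alive.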

We've tested it in our production environment, with 8 instances and over 100 online users distributed across all instances. All seems to be working just fine, even during deployments of new versions of the app! :smiley:

engelgabriel avatar Jan 21 '15 18:01 engelgabriel

Yep, that sounds like the right way to do it. The devil is in the details for such a distributed system - "considered dead by the others" can be tricky if a server's connection is temporarily interrupted to some but not others. Most things will work just fine in normal operation, but you may see strange results under high traffic load or network hiccups.

mizzao avatar Jan 21 '15 22:01 mizzao

@mizzao

I stumbled across this when considering implementing meteorhacks:cluster.

It seems like some of the API will not work when using multiple servers, but online, offline, and idle still seem to work for me even with multiple servers, because I am using the status keys on the user's profile.

The package updates the user's document with its status. Since all my Meteor servers are connected to the same database, the same user document is being updated. So once this updated user information arrives at any client who is subscribed to receive it, they can show the appropriate offline/online status, even if I am using multiple servers.

So please correct me if I am wrong, but the features of this package that update the user's document will still work even if using multiple meteor servers.

sferoze avatar Oct 08 '16 23:10 sferoze

@sferoze: I don't think the status fields will multiplex properly because currently it assumes that the connections in memory are the only ones the users have to the server. If users are only making single connections then it will mostly work, but if they are logging on through multiple locations (and through different servers) the state will be wrong.
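The failure mode can be seen in a small plain-JS illustration (names are assumptions): each server derives status only from its own in-memory connections, so one server can write "offline" to the shared users collection while another server still holds a live connection for the same user.

```javascript
// A server's view of a user's status, based only on the connections
// it holds in its own memory.
function statusSeenBy(serverConnections, userId) {
  return serverConnections.some((c) => c.userId === userId)
    ? 'online'
    : 'offline';
}

// Alice has one connection on each of two servers.
let serverA = [{ connectionId: 'c1', userId: 'alice' }];
const serverB = [{ connectionId: 'c2', userId: 'alice' }];

// Alice's server-A connection closes. Server A, seeing no local
// connections, would write 'offline' to the shared user document,
// even though server B still holds c2.
serverA = serverA.filter((c) => c.connectionId !== 'c1');
const written = statusSeenBy(serverA, 'alice'); // 'offline' — wrong globally
```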

Also, note that the server clears all user statuses when it starts up (see original post). This will clearly break things if servers are being added and removed with load balancing.

mizzao avatar Oct 10 '16 19:10 mizzao

Ahh, I see the issue now.

sferoze avatar Oct 10 '16 20:10 sferoze

@mizzao what about one server running an instance on each core using meteorhacks:cluster?

Since it's just one server, clearing everything upon restart is ok.

But I guess the issue comes up if one user is connected to the instance on the first core, and on another device the user is connected to an instance on the second core?

sferoze avatar Oct 11 '16 21:10 sferoze

Yes. It's unclear how often that situation would arise, but that would definitely break things.

I think the aforementioned https://github.com/Konecty/meteor-multiple-instances-status is supposed to handle this, but I've never used it. You may want to try that.

mizzao avatar Oct 11 '16 21:10 mizzao