meteor-presence

Multi-server instance filtering and no remove on startup

Open aaronthorp opened this issue 11 years ago • 23 comments

This enables two behaviours, selected by environment variables read from process.env.*:

#export PRESENCE_DONT_CLEAR=true
export PRESENCE_INSTANCE='testing-server-001'
  • PRESENCE_DONT_CLEAR - the Presence collection is not cleared on startup; you need to clean it up manually in your own code.
  • PRESENCE_INSTANCE - setting this attaches a server instance identifier to each Presence document for a connection to that server. On restart the server generates a new serverInstanceId and removes the records left by old instances of the specified server only.
  • If neither environment variable is set, it runs as it currently does.
{
  instance: {
    name: "testing-server-001",
    id: "33erfljrkrj3hdkjhd3"
  }
}
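
A minimal sketch of the startup decision described above, not the code from this pull request. The helper name `startupCleanup`, the `'true'` string check, the precedence of PRESENCE_DONT_CLEAR over PRESENCE_INSTANCE, and the Mongo-style selector are all assumptions made for illustration.

```javascript
// Hypothetical helper: decide what to remove from the Presence collection
// when a server starts. `env` stands in for process.env.
function startupCleanup(env, newInstanceId) {
  // Assumption: the flag is considered set when it equals the string 'true',
  // and it takes precedence over PRESENCE_INSTANCE.
  if (env.PRESENCE_DONT_CLEAR === 'true') {
    return { action: 'none' };
  }
  if (env.PRESENCE_INSTANCE) {
    // Remove only records written by earlier instances of this named server.
    return {
      action: 'remove',
      selector: {
        'instance.name': env.PRESENCE_INSTANCE,
        'instance.id': { $ne: newInstanceId }
      }
    };
  }
  // No configuration: the existing blanket remove.
  return { action: 'remove', selector: {} };
}
```

Calling `startupCleanup(process.env, someNewId)` would return a selector that a server could pass to its collection's remove call.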

aaronthorp avatar Apr 08 '14 06:04 aaronthorp

hey thanks @aaronthorp this looks mostly pretty good, i just wonder though if we can handle multi-server setups transparently? without requiring any configuration on behalf of the user, so it will just work.

dburles avatar Apr 08 '14 06:04 dburles

yeah, i was trying to work out how to do that. my thought was that we may be able to do it via the IP and port of the server, MD5-hashing that into the document along with the instance id, but i haven't yet looked at where to source those from in Meteor
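
A rough sketch of the hashing idea, assuming the host and port are already known (where to obtain them in Meteor is left open, as in the comment). The function name `serverKey` and the `host:port` input format are hypothetical; MD5 serves here only as a compact identifier, not for security.

```javascript
const crypto = require('crypto');

// Hypothetical helper: derive a stable per-server key from its address.
// The same host and port always give the same 32-character hex string.
function serverKey(host, port) {
  return crypto.createHash('md5').update(`${host}:${port}`).digest('hex');
}
```

A key like this stays the same across restarts of one server, so it would have to be stored alongside the per-start instance id rather than replace it.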

aaronthorp avatar Apr 08 '14 06:04 aaronthorp

I might ping @tmeasday in on this too

dburles avatar Apr 08 '14 06:04 dburles

Some other things that might be worth considering:

  • Just letting the heartbeat time it out
  • Running a regular job to inspect Meteor's internal sessions and remove any presences that aren't claimed by a server (not sure exactly how to coordinate this one across servers)
  • Killing presences on HCR, before the user re-connects (this is the main issue we are worried about right?)

tmeasday avatar Apr 08 '14 06:04 tmeasday

The main issue we had is the blanket remove that runs on server startup: if this were, say, a secondary server, starting it would remove all the current presences

dburles avatar Apr 08 '14 06:04 dburles

Oh, no I get that. My question is why the blanket remove is there in the first place...

tmeasday avatar Apr 08 '14 06:04 tmeasday

Oh because if you shut down the server, all the presences would remain and would never be removed, keep in mind that I want to remove the heartbeat at some point

dburles avatar Apr 08 '14 06:04 dburles

Hmm.. Thinking about it more, maybe the correct architecture is this:

  1. Each server maintains a map of connection ids to states, using a heartbeat or otherwise to know which connection ids are valid.
  2. Each server regularly writes its own set of states to the Presences collection, with a timestamp.
  3. If a server goes down, its presences will become "out of date"
  4. Each server regularly deletes "out of date" presences.

So in a sense each server is doing a heartbeat to the presence collection. It's a pain to be doing all those writes though...
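
The four steps above can be sketched with an in-memory Map standing in for the Presences collection. This is only an illustration under assumptions: the function names, the document shape, and the `maxAgeMs` threshold are all hypothetical.

```javascript
// presences: Map of connectionId -> { serverId, state, updatedAt }
// localStates: Map of connectionId -> state, held by one server (step 1).

// Step 2: a server writes its own states with a timestamp.
function writeStates(presences, serverId, localStates, now) {
  for (const [connectionId, state] of localStates) {
    presences.set(connectionId, { serverId, state, updatedAt: now });
  }
}

// Steps 3 and 4: any server deletes presences that were not refreshed
// within maxAgeMs, which is what happens to a downed server's records.
function removeOutOfDate(presences, now, maxAgeMs) {
  const removed = [];
  for (const [connectionId, doc] of presences) {
    if (now - doc.updatedAt > maxAgeMs) {
      presences.delete(connectionId);
      removed.push(connectionId);
    }
  }
  return removed;
}
```

The cost mentioned above is visible here: `writeStates` touches every presence on every tick, even when nothing has changed.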

tmeasday avatar Apr 08 '14 06:04 tmeasday

Could there be a separate collection of server instances, for background use? Each server would write its generated instanceId there on startup and keep a heartbeat updated in it. We would then remove all server instances whose heartbeat is more than 30 seconds old, along with the corresponding items in the Presence collection relating to those instances.

That would avoid the constant heartbeat updates to the Presence collection being pushed to clients?

aaronthorp avatar Apr 08 '14 07:04 aaronthorp

Just as a side note: the current heartbeat setup I plan on removing, but it's there as we've had some issues with ghost users, however I'm hoping that later versions of Meteor have corrected that issue

dburles avatar Apr 08 '14 07:04 dburles

@aaronthorp - Good idea! I was thinking something similar.

tmeasday avatar Apr 08 '14 07:04 tmeasday

hey @mizzao I believe you may have had some ideas on handling multi-server setups

dburles avatar Apr 08 '14 07:04 dburles

So the current "working" idea is this (probably pretty similar to what @philcockfield described here: https://groups.google.com/d/msg/meteor-talk/_T0BRGhdrgE/LhTeJ2IbY3kJ)

  1. Set up a ServerPresence collection which servers write a unique id and timestamp to
    • [note we'd need to synchronise clocks between servers too.. not sure how to do this, maybe a first version can assume they all run ntp]
  2. Augment the current presence with a serverId.
  3. Each server watches for out-of-date records in the ServerPresence collection and deletes them along with the corresponding presences.
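
A small sketch of these three steps, with plain JavaScript structures standing in for the two collections. The names `heartbeat` and `sweep`, the data shapes, and the `timeoutMs` parameter are assumptions for illustration; the sketch also assumes server clocks agree, as noted in step 1.

```javascript
// serverPresence: Map of serverId -> last heartbeat timestamp (step 1).
// presences: array of { connectionId, serverId } documents (step 2).

function heartbeat(serverPresence, serverId, now) {
  serverPresence.set(serverId, now);
}

// Step 3: drop servers whose record is out of date, plus their presences.
function sweep(serverPresence, presences, now, timeoutMs) {
  const dead = new Set();
  for (const [serverId, lastSeen] of serverPresence) {
    if (now - lastSeen > timeoutMs) dead.add(serverId);
  }
  for (const serverId of dead) serverPresence.delete(serverId);
  return {
    deadServers: [...dead],
    presences: presences.filter((p) => !dead.has(p.serverId))
  };
}
```

Compared with stamping every presence, only one small record per server is rewritten on each heartbeat.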

tmeasday avatar Apr 08 '14 07:04 tmeasday

Hey guys - user-status currently only works on one server, but I've considered how to extend it to multiple servers. I think @dburles is using the main idea from user-status right now.

The collection that keeps track of user connections is currently in-memory; we'd have to move that to the database. However, a bigger issue (as you have been discussing) is to keep track of which server is responsible for which connection. Moreover, we'd want this to work as servers are added and removed, so that connections to downed servers are pruned and connections from new servers are properly recorded.

The way that the connections work across HCR right now is a special case of this - because we know we are the only server handling connections, we remove all connections that we had before because all the clients are going to reconnect. However, in a multi-server case, we'd want to remove connections that only we were handling, and also remove connections from servers that have gone down.

I don't think any heartbeats are necessary (especially given that they are already implemented over DDP); however an issue is how to clear connections from servers that go down, as we can now only run code when a server starts up. For time syncing from client-server (and server-server), we can extend something like timesync which implements basic client-server NTP at the moment.
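
For reference, the standard NTP-style offset estimate from a single request/response exchange can be sketched as below. This illustrates the general technique only and is not taken from the timesync package; the function name and field names are hypothetical.

```javascript
// t0: client send time, t3: client receive time (both on the client clock)
// t1: server receive time, t2: server send time (both on the server clock)
function estimateOffset(t0, t1, t2, t3) {
  return {
    // Estimated amount by which the server clock is ahead of the client clock.
    offset: ((t1 - t0) + (t2 - t3)) / 2,
    // Time spent on the network, excluding server processing time.
    roundTrip: (t3 - t0) - (t2 - t1)
  };
}
```

The estimate is exact only when the outbound and return latencies are equal; asymmetric paths introduce an error of up to half the round trip.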

Overall, I'm not a huge fan of having the community maintain two libraries that do almost the same thing, so I would be in favor of merging the capabilities at some point.

mizzao avatar Apr 08 '14 15:04 mizzao

DDP heartbeats in Meteor 0.8.1 should make this library and mine more robust.

mizzao avatar Apr 30 '14 18:04 mizzao

@mizzao yes indeed!

dburles avatar May 01 '14 00:05 dburles

Hi guys, any updates on this? Why was this PR not accepted? I'd be happy to help get this done, as it is very important for our meteor chat package to work under heavy load :) :+1:

engelgabriel avatar Jan 13 '15 23:01 engelgabriel

@engelgabriel I believe it's because there are some technical decisions to be made regarding the proper implementation

dburles avatar Jan 13 '15 23:01 dburles

there are probably better ways to do some of this in the current version of Meteor, such as connection.onClose to remove connections. there will have been a few updates since this was done a while ago (around 0.8ish if i remember correctly)

aaronthorp avatar Jan 13 '15 23:01 aaronthorp

I haven't had the resources to really look into multi server support at this stage. @engelgabriel are you running into performance issues with the package currently?

dburles avatar Jan 13 '15 23:01 dburles

Yes, we are running into performance issues on Meteor in general, so we are running multiple instances of our meteor apps... and to complicate things even further, we have different apps for different purposes using the same users collection, so they need to play nicely together with regard to setting users online/offline.

I am planning to improve the current user presence packages to work properly with multiple meteor app instances. The first issue I need to resolve is tracking all the connections each user has to the app, and which instance of the app each connection belongs to, so that if an instance goes down or crashes, I can remove all orphaned connections.

The initial idea is for each instance to auto-generate an InstanceID on startup and save it in an "instances" collection, with a "createdAt", plus some more useful info, like app version, hostname, etc. and a "heartbeat".

We will create a timer to set all instances with an expired "heartbeat" to {active: false} and subsequently pull all connections $nin instances.find({active: true}).
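
The timer's work could look roughly like the following sketch, which uses plain arrays in place of the "instances" and users collections. The function names, the document fields other than `active` and `heartbeat`, and the `timeoutMs` parameter are assumptions for illustration.

```javascript
// instances: array of { _id, heartbeat, active }
// Marks instances with an expired heartbeat as inactive and
// returns the ids of the instances that are still active.
function expireInstances(instances, now, timeoutMs) {
  for (const inst of instances) {
    if (now - inst.heartbeat > timeoutMs) inst.active = false;
  }
  return instances.filter((i) => i.active).map((i) => i._id);
}

// users: array of { _id, connections: [{ id, instanceId }] }
// In-memory equivalent of pulling connections whose instanceId is
// not in ($nin) the list of active instance ids.
function pullOrphans(users, activeIds) {
  const active = new Set(activeIds);
  return users.map((u) => ({
    ...u,
    connections: u.connections.filter((c) => active.has(c.instanceId))
  }));
}
```

A user whose connections array ends up empty after `pullOrphans` could then be marked offline.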

Does anyone know of any existing packages that may help or be used for that? Or have any suggestions at all?

engelgabriel avatar Jan 19 '15 18:01 engelgabriel

Keep up the great work, it's a very important module to have for the community.

Thank you!

snshn avatar Jan 19 '15 18:01 snshn

Hi all, with the following goals:

  • Support presence in projects with multiple instances.
  • Handle server crash/restart gracefully.
  • Allow users to set their preferred status: Online, Away, Busy, Offline
  • Allow users to have the same presence across tabs, browsers or devices.

We created 2 packages:

The first: if you have multiple instances of your app running (on the same server or not), we need to keep track of them all so that orphaned sessions can be handled. https://atmospherejs.com/konecty/multiple-instances-status

The second is the user-presence package itself: https://atmospherejs.com/konecty/user-presence

Check out the source code of the chat example at: https://github.com/Konecty/meteor-user-presence-example-chat

And finally, you can see it all working together at: http://user-presence-example.meteor.com/

We will add a full manual to the readme.md tomorrow, but initial comments are welcome.

engelgabriel avatar Jan 20 '15 17:01 engelgabriel