phoenix_pubsub

Remote node doesn't receive last update right before process terminates

Open tverlaan opened this issue 6 years ago • 6 comments

We use Phoenix.Tracker for keeping the state of currently active calls. Each call has a certain state in the metadata of the tracker. When a process updates the state in the terminate callback, the update of the metadata is only published locally. The remote node doesn't see this updated metadata but only sees the process leave since it terminated within the broadcast_window.

I made an example project that demonstrates the above scenario. You can find it here: https://github.com/tverlaan/presence_multinode

We discussed possible improvements and alternative solutions during ElixirConfEU, but we didn't come to a conclusion just yet.

tverlaan, Apr 20 '18 09:04

Thanks for the report! To add more info, @chrismccord believes this happens somewhere in presence, when we squeeze the joins and leaves together to push to the client (or in the client itself). We currently treat a change in state as a leave+join, so anything that happens in between is condensed.

josevalim, Apr 20 '18 10:04

Would it be possible to update meta when leave+join is condensed? Or will it break when another node has a presence update for the same key?

Can you point me to some documentation or a paper where I can read more about Phoenix's implementation?

tverlaan, May 15 '18 07:05

Another node cannot update the same key because the keys are always per node. It’s hard to think about updates because they do not exist at the low level right now: an update is represented as a leave followed by a join.

It is also very important to note that there is no delivery guarantee, so relying on any update change to affect other state can be misleading. For example, imagine this scenario:

  1. Process is present
  2. Process updates meta
  3. Node crashes or process dies

Local clients will see the meta update but remote ones will not. Can you please remind us what state you wanted to see in an update?

Paper: We implement a CRDT right now, but we can very likely simplify the abstraction. So reading the CRDT paper will give you the right ideas, but the implementation is ultimately simpler.


josevalim, May 15 '18 08:05

We were "updating" meta to reflect whether the process is about to exit cleanly or fail.

I'll try to get a better understanding of the implementation to see if and how we can improve it. The outcome could be documentation as well, I think.

tverlaan, May 16 '18 14:05

One way to solve this would be to differentiate (I'm not really sure how) in handle_diff between a process explicitly calling untrack and a process being removed after it died.

michalmuskala, May 16 '18 14:05

@tverlaan actually, I think when you leave, you should have the latest meta:

https://github.com/phoenixframework/phoenix_pubsub/blob/1185fdb258fcba8acd9d66054e3b1ba9a055c83d/lib/phoenix/tracker.ex#L64-L78

This latest meta should have your latest changes. Are you using this information in the client?

In case the leave information does not have the latest meta, then that would be a bug. @michalmuskala, this should also provide you a way to pass the exit information, we could add it to the meta under a reserved key.
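
A minimal sketch of reading that last meta from the leaves, assuming the diff has the shape Phoenix.Tracker passes to handle_diff/2 (a map of topic => {joins, leaves}, where each entry is a {key, meta} tuple). The module name, the topic and key values, and the :exit_status meta key are all hypothetical and chosen only for illustration; they are not a reserved key or an API in phoenix_pubsub.

```elixir
defmodule LeaveMetaSketch do
  # Hypothetical helper, not part of phoenix_pubsub.
  # Expects a diff shaped like the one given to handle_diff/2:
  #   %{topic => {joins, leaves}}, with joins and leaves as lists of {key, meta}.
  # Returns one {topic, key, status} tuple per leave.
  # :exit_status is an assumed meta key; leaves without it map to :unknown.
  def exit_statuses(diff) do
    for {topic, {_joins, leaves}} <- diff,
        {key, meta} <- leaves do
      {topic, key, Map.get(meta, :exit_status, :unknown)}
    end
  end
end

# Example diff with two leaves on a single topic.
diff = %{
  "calls" => {[], [{"call-1", %{exit_status: :failed}}, {"call-2", %{}}]}
}

IO.inspect(LeaveMetaSketch.exit_statuses(diff))
```

If the process wrote its status into the meta before terminating and the leave carries the latest meta, the status should show up here; a leave that reports :unknown on a remote node would point at the behaviour described in this issue.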

josevalim, May 17 '18 08:05