
Users whose servers were unreachable when you logged in will send you undecryptable messages

Open richvdh opened this issue 1 year ago • 4 comments

Migrating from https://github.com/element-hq/synapse/issues/2165:


  • Alice logs in on a new device
  • Alice's server tries to tell everyone about her new device
  • Bob's server is unreachable at that moment
  • Later, Bob sends a message. He doesn't know about Alice's new device. Alice sees a UISI.

richvdh, Feb 28 '24 16:02

A solution to this might go something along the lines of:

  • We change the send-to-device API to group messages for each target user together.
  • Now, when Bob tries to send the key for a megolm session to Alice, he includes a hash of Alice's device list.
  • When Alice's server receives that batch of to-device messages, it can tell if the list is outdated, and send an indication back to Bob via Bob's server
  • Bob's client updates its copy of Alice's device list and tries again.
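As a rough illustration of the hash check in the steps above, here is a minimal Python sketch. The issue does not specify a hashing scheme, so the choice of SHA-256 over sorted, length-prefixed device IDs, and the names `device_list_hash` and `is_sender_view_stale`, are assumptions made purely for illustration.

```python
import hashlib


def device_list_hash(device_ids):
    """Hash a user's device IDs in a canonical (sorted) order.

    Hypothetical scheme: not taken from the Matrix spec or this issue.
    """
    h = hashlib.sha256()
    for device_id in sorted(device_ids):
        encoded = device_id.encode("utf-8")
        # Length-prefix each ID so ["ab", "c"] and ["a", "bc"] hash differently.
        h.update(len(encoded).to_bytes(4, "big"))
        h.update(encoded)
    return h.hexdigest()


def is_sender_view_stale(claimed_hash, current_device_ids):
    """Receiving server's check: does the sender's hash differ from the current list?"""
    return claimed_hash != device_list_hash(current_device_ids)


# Bob's client only knows about Alice's old device.
bobs_view = ["ALICEPHONE"]
# Alice's server also knows about her new login.
alices_devices = ["ALICEPHONE", "ALICELAPTOP"]

stale = is_sender_view_stale(device_list_hash(bobs_view), alices_devices)
```

Here `stale` is true, which is the case where Alice's server would signal Bob's server that Bob's copy of the device list needs refreshing.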

This would require Bob's client to keep a journal of which users it tried to send a given megolm key to (which might also be useful for dealing with wedged olm sessions more promptly: https://github.com/element-hq/element-meta/issues/1992). (That might well be better done after MegolmV2?)
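One possible shape for such a journal, sketched in Python. The class name `KeyShareJournal` and its methods are hypothetical, and tracking individual devices per user (rather than only users) is an assumption that goes slightly beyond what is described above.

```python
from collections import defaultdict


class KeyShareJournal:
    """Hypothetical record of who a given megolm session key was sent to."""

    def __init__(self):
        # session_id -> {user_id: set of device IDs the key was sent to}
        self._sent = defaultdict(dict)

    def record(self, session_id, user_id, device_ids):
        """Note that the key for session_id was sent to these devices."""
        self._sent[session_id].setdefault(user_id, set()).update(device_ids)

    def missing_devices(self, session_id, user_id, current_device_ids):
        """Devices in the refreshed list that have not yet been sent the key."""
        already_sent = self._sent[session_id].get(user_id, set())
        return set(current_device_ids) - already_sent


journal = KeyShareJournal()
journal.record("session1", "@alice:example.org", ["ALICEPHONE"])

# After Bob's client refreshes Alice's device list, it can work out
# which devices still need the key.
to_retry = journal.missing_devices(
    "session1", "@alice:example.org", ["ALICEPHONE", "ALICELAPTOP"]
)
```

With a record like this, the retry only needs to target the devices in `to_retry` instead of re-sending the key to every device.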

richvdh, Feb 28 '24 17:02

When Bob's server sends the to-device messages to Alice's server, it could also include the stream_id of the latest m.device_list_update that it received for that user, and Alice's server could give an indication as to whether Bob's server is up-to-date or not.
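A minimal Python sketch of that server-side comparison follows. The `ToDeviceBatch` type, its `last_seen_stream_id` field, and the shape of the response are all hypothetical. The sketch also assumes that stream IDs for a given user increase over time, so a simple numeric comparison is enough.

```python
from dataclasses import dataclass, field


@dataclass
class ToDeviceBatch:
    """Hypothetical batch of to-device messages for a single target user."""

    target_user: str
    # stream_id of the latest m.device_list_update the sending server
    # received for target_user (assumed field, not part of any spec).
    last_seen_stream_id: int
    messages: list = field(default_factory=list)


def check_batch(batch, latest_stream_ids):
    """Receiving server: report whether the sender's device list is current."""
    latest = latest_stream_ids.get(batch.target_user, 0)
    return {
        "up_to_date": batch.last_seen_stream_id >= latest,
        "latest_stream_id": latest,
    }


# Alice's server has issued update 7, but Bob's server last saw update 5.
latest_stream_ids = {"@alice:example.org": 7}
batch = ToDeviceBatch(target_user="@alice:example.org", last_seen_stream_id=5)
result = check_batch(batch, latest_stream_ids)
```

In this example `result["up_to_date"]` is false, so Bob's server would know to resync Alice's device list before the keys are considered delivered.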

uhoreg, Mar 06 '24 22:03

> it can tell if the list is outdated, and send an indication back to Bob via Bob's server

There is a race condition here where this check returns false (up-to-date list) and before the event is delivered the client logs in on a new device.

kegsay, Mar 21 '24 14:03

> There is a race condition here where this check returns false (up-to-date list) and before the event is delivered the client logs in on a new device.

I'd argue that's less a race, and more that, objectively, Alice logged in after the message was sent, and therefore shouldn't expect to decrypt the event any more than she would if she logged in 3 weeks later. In other words, it's #2313 rather than this issue, which is really specific to the parallelism due to federation.

richvdh, Mar 21 '24 16:03