Lost OTK, leading to "OneTime key already exists" error and later UTDs

Open richvdh opened this issue 5 months ago • 6 comments

This is a recurrence of an old issue (https://github.com/matrix-org/matrix-rust-sdk/issues/1415) which we thought we'd fixed, but which seems to be back.

Server-side and client-side logs suggest that Element X iOS is creating one-time keys, uploading them to the server, and then forgetting about them.

This essentially guarantees that, at some point down the line, the user is going to receive an undecryptable message.

The symptom is that both server-side and client-side logs are full of errors about "One time key signed_curve25519:AAAAAAAAAUM already exists."

The client retries every few seconds, so it's also a waste of bandwidth on both sides.
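
To illustrate the failure mode described above, here is a toy model. It is a sketch only, not the server's or matrix-rust-sdk's actual code. `ToyServer`, `ToyAccount` and `simulate_lost_otk` are hypothetical names, and it assumes two things: that key IDs come from a counter stored in the client's account state, and that the server rejects an upload reusing a key ID with different key material.

```python
import secrets
from dataclasses import dataclass, field


class KeyAlreadyExists(Exception):
    """Raised by the toy server when a key ID is reused with a different key."""


@dataclass
class ToyServer:
    # Maps key ID -> public key, as published to other users.
    keys: dict = field(default_factory=dict)

    def upload(self, key_id, public_key):
        existing = self.keys.get(key_id)
        # Assumption: re-uploading an identical key is harmless, but a
        # different key under the same ID is rejected.
        if existing is not None and existing != public_key:
            raise KeyAlreadyExists(f"One time key {key_id} already exists.")
        self.keys[key_id] = public_key


@dataclass
class ToyAccount:
    # Assumption: key IDs are derived from a counter in the account state.
    next_id: int = 0
    known_keys: dict = field(default_factory=dict)

    def generate(self):
        key_id = f"signed_curve25519:{self.next_id}"
        self.next_id += 1
        public_key = secrets.token_hex(16)  # stand-in for real key material
        self.known_keys[key_id] = public_key
        return key_id, public_key


def simulate_lost_otk(retries=3):
    server = ToyServer()
    account = ToyAccount()

    # A stale copy of the account state, taken before the key is generated.
    stale = ToyAccount(next_id=account.next_id, known_keys=dict(account.known_keys))

    key_id, public_key = account.generate()
    server.upload(key_id, public_key)

    # The stale state overwrites the current one: the uploaded key is forgotten.
    account = stale

    # The client now generates a "new" key that reuses the same ID.
    new_id, new_public_key = account.generate()
    errors = 0
    for _ in range(retries):
        try:
            server.upload(new_id, new_public_key)
        except KeyAlreadyExists:
            errors += 1  # every retry fails in exactly the same way

    # The server still advertises the original key, which the client no longer has.
    server_has_original = server.keys[key_id] == public_key
    return key_id, new_id, errors, server_has_original
```

In this model retrying can never succeed, and anyone who claims the key the server still advertises will encrypt to a key the client has forgotten, which is where the later UTD comes from.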

It's also problematic that there is no indication in the UI that there is any problem, so the first we know of it is when the user receives a UTD several weeks later.

richvdh avatar Jul 31 '25 15:07 richvdh

The linked rageshake from Amandine suggests that the cross-process lock isn't doing what it's supposed to.

richvdh avatar Jul 31 '25 17:07 richvdh

I've previously questioned whether the cross-process lock actually works. This looks like more evidence that it does not.

richvdh avatar Jul 31 '25 17:07 richvdh

@poljar will make sure this is reported to Sentry so we can see how many people are affected.

andybalaam avatar Aug 04 '25 13:08 andybalaam

> @poljar will make sure this is reported to Sentry so we can see how many people are affected.

PR is here: https://github.com/matrix-org/matrix-rust-sdk/pull/5496

poljar avatar Aug 06 '25 15:08 poljar

The PR to report this only once per Client is here: https://github.com/matrix-org/matrix-rust-sdk/pull/5588. I forgot about this despite @richvdh's warnings that it would be a problem. 🤦
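
As a rough illustration of the "once per Client" idea, the sketch below latches a flag on the client object, so that a retry loop hitting the same error every few seconds produces a single report. This is not necessarily what the linked PR does: `ToyClient`, `on_otk_upload_conflict` and the `report` callback are hypothetical names.

```python
class ToyClient:
    """Hypothetical client that reports an OTK upload conflict at most once."""

    def __init__(self, report):
        # `report` is any callable taking a message, e.g. an error-tracker hook.
        self._report = report
        self._otk_conflict_reported = False

    def on_otk_upload_conflict(self, message):
        """Report the conflict the first time only; return True if reported."""
        if self._otk_conflict_reported:
            return False
        self._otk_conflict_reported = True
        self._report(message)
        return True
```

With `reports = []` and `client = ToyClient(reports.append)`, calling `client.on_otk_upload_conflict(...)` repeatedly appends only the first message to `reports`.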

poljar avatar Aug 28 '25 07:08 poljar

Now that we have some metrics to report on this, it appears that there are very few affected users. Accordingly, we're going to deprioritise it.

(Aside: there may also be an issue in EW that causes lost OTKs, but we haven't received any rageshakes from affected users, and in any case that is likely a separate problem.)

richvdh avatar Sep 29 '25 13:09 richvdh