matrix-rust-sdk icon indicating copy to clipboard operation
matrix-rust-sdk copied to clipboard

Allow room keys for inbound group sessions to be updated when we receive a "better" copy

Open BillCarsonFr opened this issue 1 year ago • 5 comments

When we receive a Megolm key (session) with a particular session ID from somewhere, we need to store it into the local store. The key is flagged with imported if it was not received directly from a m.room_key (file import, backup, ..)

Messages decrypted with an imported key should have a warning (authenticity cannot be guaranteed on this device a.k.a gray shield)

It is possible to receive the same key from different sources (and at different ratchet index) at different times. Some scenario where it creates unwanted shields:

Scenario 1: ratcheted key after root downloaded from backup

Alice create a new login, get access to key backup. The key s0 (owned by bob) is in backup at index 0. There are 3 messages encrypted with that index. If bob sends now a new message, the ratcheted key s0 will be distributed to Alice's new login via a m.room_key As per the current algorithm https://github.com/matrix-org/matrix-rust-sdk/blob/62137e5a3ef032a5f4c3ff77f98f5d0051bf55ad/crates/matrix-sdk-crypto/src/machine.rs#L839-L846

=> the ratcheted key will be discarded and we keep the better one, but it is still imported=true (the ratcheted key is imported=false)

Scenario 2: Better key downloaded from backup replaces a ratcheted key received as a m.room_key

https://github.com/matrix-org/matrix-rust-sdk/blob/62137e5a3ef032a5f4c3ff77f98f5d0051bf55ad/crates/matrix-sdk-crypto/src/store/mod.rs#L1702-L1704

The backup key will replace the existing non imported key

See https://github.com/element-hq/element-web/issues/26526

Solution

Implement proper megolm key updating. When receiving a m.room_key, If we have an existing better key that is mark as imported, we could keep the existing better key and update the imported flag (if the key connects properly)

See legacy implementation https://github.com/element-hq/element-android/blob/7073b1647c3897b5a30c4886db5975a26f16c6a1/matrix-sdk-android/src/kotlinCrypto/java/org/matrix/android/sdk/internal/crypto/MXOlmDevice.kt#L667

See full algorithm details here https://docs.google.com/document/d/1IJkeUn5rNehDapKY75xQ9rutoElVjQHeSdw8dQpb_O0/edit

BillCarsonFr avatar Jul 16 '24 09:07 BillCarsonFr

https://github.com/element-hq/element-x-ios-rageshakes/issues/2216

BillCarsonFr avatar Sep 17 '24 16:09 BillCarsonFr

Other instance https://github.com/element-hq/element-x-ios-rageshakes/issues/2260

BillCarsonFr avatar Sep 25 '24 11:09 BillCarsonFr

Scenario 3: Backup faster than key

Bob's receives a message before the keys, then queries the backup. The key was already uploaded to backup (by a session that was online at that time). Bob get's the key from backup first, and the to_device carrying the room_key arrives after. The to_device key is not better so the one from backup is kept and will generate authenticity warnings

This scenario seems to be more common with simplified sliding sync.

BillCarsonFr avatar Sep 25 '24 11:09 BillCarsonFr

Now that we have the SenderData and it has a compare method, something like this might help:

diff --git a/crates/matrix-sdk-crypto/src/olm/group_sessions/inbound.rs b/crates/matrix-sdk-crypto/src/olm/group_sessions/inbound.rs
index f43e013e4..58c0579c7 100644
--- a/crates/matrix-sdk-crypto/src/olm/group_sessions/inbound.rs
+++ b/crates/matrix-sdk-crypto/src/olm/group_sessions/inbound.rs
@@ -13,6 +13,7 @@
 // limitations under the License.
 
 use std::{
+    cmp::Ordering,
     fmt,
     ops::Deref,
     sync::{
@@ -389,8 +390,18 @@ impl InboundGroupSession {
         {
             SessionOrdering::Unconnected
         } else {
-            let mut other = other.inner.lock().await;
-            self.inner.lock().await.compare(&mut other)
+            let mut other_inner = other.inner.lock().await;
+
+            match self.inner.lock().await.compare(&mut other_inner) {
+                SessionOrdering::Equal => {
+                    match self.sender_data.compare_trust_level(&other.sender_data) {
+                        Ordering::Less => SessionOrdering::Worse,
+                        Ordering::Equal => SessionOrdering::Equal,
+                        Ordering::Greater => SessionOrdering::Better,
+                    }
+                }
+                result => result,
+            }
         }
     }

poljar avatar Sep 25 '24 12:09 poljar

filed at app layer in https://github.com/element-hq/element-x-ios/issues/3352

ara4n avatar Sep 30 '24 14:09 ara4n

There is another scenario were this occurs, and counter intuitively the late key is received before getting the response from the backup: https://rageshakes.element.io/api/listing/2024-11-06/130841-YOSPUGGB/console.2024-11-06-12.log.gz

First the fail to decrypt: 2024-11-06T12:16:29.134312Z WARN matrix_sdk_crypto::machine: Failed to decrypt a room event: session_id="PmaDYpHYvSkoDVJgCAV6y9NddsRqr1bd5gGm665AyOs" message_index=0

Then query to key storage: 2024-11-06T12:16:44.354410Z DEBUG method=GET uri="https://matrix-client.matrix.org/_matrix/client/v3/room_keys/keys/!XXXXXXXXX:matrix.org/PmaDYpHYvSkoDVJgCAV6y9NddsRqr1bd5gGm665AyOs"}

Just after the late key arrives: 2024-11-06T12:16:44.387778Z INFO matrix_sdk_crypto::machine: Received a new megolm room key | handle_decrypted_to_device_event ... event_type="m.room_key" .. > add_room_key{...} > handle_key { ...session_id="PmaDYpHYvSkoDVJgCAV6y9NddsRqr1bd5gGm665AyOs"}

Then got the backup response: 2024-11-06T12:16:44.402430Z DEBUG matrix_sdk::http_client: Got response request_id="REQ-14" method=GET uri=".../room_keys/keys/!XXXXXXXX:matrix.org/PmaDYpHYvSkoDVJgCAV6y9NddsRqr1bd5gGm665AyOs" status=200...}

And that key is imported also via import_room_keys 2024-11-06T12:16:44.457412Z INFO matrix_sdk_crypto::store: Successfully imported room keys total_count=1 imported_count=1

And for some reason this last import is the one persisted (has gray shield because of warning)

Possible scenario:

The handle_key and import_room_keys both tries to check if there is an existing key. They found that there is none. So they both want to save it to the store.

The room key from sync calls save_pending_changes that has a save_changes_lock. But import_room_keys directly save the key to the store, there are no lock. So we miss some synchronisation here, and eventually reconcillation?

BillCarsonFr avatar Nov 06 '24 15:11 BillCarsonFr

Created a specific issue for the race that allows to by-pass the check https://github.com/matrix-org/matrix-rust-sdk/issues/4227

BillCarsonFr avatar Nov 07 '24 08:11 BillCarsonFr

@poljar I'm struggling to understand the state of play here. I see https://github.com/matrix-org/matrix-rust-sdk/pull/4040, but that is apparently only a partial fix, so could you clarify what remains to be done?

I think it might be helpful to close this issue and replace it with one that makes the current situation clear. WDYT?

richvdh avatar Feb 19 '25 16:02 richvdh

The solution from OP was never implemented. But sure, some noise has accumulated in this issue.

Opened https://github.com/matrix-org/matrix-rust-sdk/issues/4698 as a replacement.

poljar avatar Feb 20 '25 10:02 poljar