synapse icon indicating copy to clipboard operation
synapse copied to clipboard

Log when a key backup replaces an existing key

Open andybalaam opened this issue 1 year ago • 2 comments

This issue was created by the Crypto team because we believe it will help us to identify the source of broken key backups. We expect to work on it and hope the Synapse team will assist us in deciding what exactly to do, and helping us with review and merge assuming we can agree on a sensible approach.

Recently the Crypto team has identified broken key backups as a common source of "unable to decrypt" errors (UTDs). Some clients appear to be uploading invalid data, and other clients have sometimes been more strict than the spec in validating key backups. Both of these problems result in UTDs, and it can be difficult when debugging to identify which client uploaded the key, because it may have been replaced after the UTD happened.

So, we propose to write a log message whenever a key is replaced in backup, allowing us to line up UTD reports (rageshakes) with server logs to figure out what happened.

Outcome of this task

  • Synapse writes a log message whenever a key is replaced in key backup, with enough information to align it with the relevant rageshake

andybalaam avatar Mar 20 '24 16:03 andybalaam

Related: https://github.com/matrix-org/synapse/issues/13704

kegsay avatar Mar 20 '24 17:03 kegsay

In handlers/e2e_room_keys.py

            for room_id, room in room_keys["rooms"].items():
                for session_id, room_key in room["sessions"].items():
                    if not isinstance(room_key["is_verified"], bool):
                        msg = (
                            "is_verified must be a boolean in keys for session %s in"
                            "room %s" % (session_id, room_id)
                        )
                        raise SynapseError(400, msg, Codes.INVALID_PARAM)

                    log_kv(
                        {
                            "message": "Trying to upload room key",
                            "room_id": room_id,
                            "session_id": session_id,
                            "user_id": user_id,
                        }
                    )
                    current_room_key = existing_keys.get(room_id, {}).get(session_id)
                    if current_room_key:
                        if self._should_replace_room_key(current_room_key, room_key):
                            log_kv({"message": "Replacing room key."})
                            # updates are done one at a time in the DB, so send
                            # updates right away rather than batching them up,
                            # like we do with the inserts
                            await self.store.update_e2e_room_key(
                                user_id, version, room_id, session_id, room_key
                            )
                            changed = True
                        else:
                            log_kv({"message": "Not replacing room_key."})
                    else:
                        log_kv(
                            {
                                "message": "Room key not found.",
                                "room_id": room_id,
                                "user_id": user_id,
                            }
                        )
                        log_kv({"message": "Replacing room key."})
                        to_insert.append((room_id, session_id, room_key))
                        changed = True

We seem to log to the opentracing, but do you just want regular log lines for these things as well?

(I think that second 'Replacing room key' line might be a mistake as well, actually..)

reivilibre avatar May 16 '24 10:05 reivilibre

We seem to log to the opentracing, but do you just want regular log lines for these things as well?

Yes please. The trouble with opentracing is that it's disabled for most users.

richvdh avatar May 28 '24 08:05 richvdh