deltachat-core-rust icon indicating copy to clipboard operation
deltachat-core-rust copied to clipboard

fix: Also migrate messages in PGP-contacts migration

Open Hocuri opened this issue 6 months ago • 8 comments

Previously, messages were not rewritten. This meant that all messages stayed with the old email-identified contact. #6916 made it very obvious that all messages sent into a group before the PGP-contacts migration got the email avatar.

With this PR, all encrypted messages are rewritten to the PGP-contact identified by the current autocrypt key. It is not possible to find out which key was actually used to sign the message.

For me, this prints:

06-16 16:08:32.444 19307 19345 I DeltaChat: [accId=1] src/sql/migrations.rs:1820: Rewriting msgs for PGP contacts took 3.768890467s
06-16 16:08:41.825 19307 19345 I DeltaChat: [accId=1] src/sql/migrations.rs:1242: PGP contacts migration took 13.516607026s in total

which is slow, but the messages aren't actually the slowest part of it. We may want to try and show a spinner in the UI while the migration is being done, but let's first hear what others report as to how long this takes.

Hocuri avatar Jun 16 '25 13:06 Hocuri

Actually, the messages migration probably is the largest part of it - what took long is committing the SQL transaction. I need to do this in the background.

Hocuri avatar Jun 16 '25 14:06 Hocuri

I'm not exactly sure how to do this migration in the background. One way would be:

  • In the pgp-contacts migration, write a Json of autocrypt_pgp_contacts and autocrypt_pgp_contacts_with_reset_peerstate somewhere into the database (alternatively, create a SQL database with two columns old_contact and pgp_contact out of it). And set a config Config::MigrateMessagesToPgpContacts.
  • In run_migrations(), if Config::MigrateMessagesToPgpContacts is set, spawn a tokio task:
    • start a SQLite transaction
    • check again that Config::MigrateMessagesToPgpContacts is set
    • load the pgp-contacts-map from the database
    • execute the migration which I already implemented in the PR here.
    • clear Config::MigrateMessagesToPgpContacts
    • end of SQLite transaction

Hocuri avatar Jun 16 '25 14:06 Hocuri

With only rewriting the last 10.000 messages, it still took almost 5s, I will try 5.000:

06-16 22:05:31.616  6767  6988 I DeltaChat: [accId=1] src/sql/migrations.rs:1831: Rewriting msgs for PGP contacts took 45.399062ms
06-16 22:05:35.808  6767  6988 I DeltaChat: [accId=1] src/sql/migrations.rs:1243: PGP contacts migration took 4.772880362s in total

Edit: This measurement is not reliable, I lowered to 5000 msgs, and the migration took longer this time:

06-16 22:33:48.914 10194 10225 I DeltaChat: [accId=1] src/sql/migrations.rs:1831: Rewriting msgs for PGP contacts took 21.90323ms
06-16 22:33:54.657 10194 10225 I DeltaChat: [accId=1] src/sql/migrations.rs:1243: PGP contacts migration took 6.132597029s in total

Edit 2: Another run with 5000 msgs:

06-16 23:22:04.710 14394 14561 I DeltaChat: [accId=1] src/sql/migrations.rs:1831: Rewriting msgs for PGP contacts took 21.105573ms
06-16 23:22:09.819 14394 14561 I DeltaChat: [accId=1] src/sql/migrations.rs:1243: PGP contacts migration took 5.619887341s in total

Edit3:

Actually, even without rewriting message-ids, the migration takes multiple seconds:

06-17 13:08:26.312 17893 18094 I DeltaChat: [accId=1] src/sql/migrations.rs:1242: PGP contacts migration took 5.363832446s

So, probably the PR here is fine, the problem is that the PGP-contacts migration is slow in general.

Edit 4: Note that all of these were debug builds, and while we do optimize Sqlite in debg builds, maybe the LTO of release builds improves things a bit

Hocuri avatar Jun 16 '25 20:06 Hocuri

With this PR, all encrypted messages are rewritten to the PGP-contact identified by the current autocrypt key. It is not possible to find out which key was actually used to sign the message.

For 1:1 protected chats this can be solved by assigning to the pgp contact all recent messages until a ChatProtectionEnabled system message is encountered. Earlier messages it's better to assing to the email contact to be on the safe side.

For unprotected (even encrypted) chats this doesn't matter that much, preserving their encryption status (padlock) would be fine i think.

iequidoo avatar Jun 16 '25 20:06 iequidoo

Nothing bad can happen if a few old messages are assigned to a slightly-wrong PGP contact, so I think it's fine. And, keeping the complexity low and moving forward.

Hocuri avatar Jun 18 '25 11:06 Hocuri

Nothing bad can happen if a few old messages are assigned to a slightly-wrong PGP contact, so I think it's fine. And, keeping the complexity low and moving forward.

It's fine if ChatProtectionEnabled and ChatProtectionDisabled system messages are also moved to the PGP chat and preserve their position so that the user can be sure which messages were signed with the actual contact's key and which weren't.

iequidoo avatar Jun 20 '25 16:06 iequidoo

It's fine if ChatProtectionEnabled and ChatProtectionDisabled system messages are also moved to the PGP chat and preserve their position so that the user can be sure which messages were signed with the actual contact's key and which weren't.

Yes, all messages that were in a chat together will still be in a chat together, in the order they had before. Only contact ids will be changed.

Hocuri avatar Jun 20 '25 18:06 Hocuri

Python lint is failing because this is based on the PGP-contacts PR, which wasn't yet rebased to include #6923

Hocuri avatar Jun 21 '25 16:06 Hocuri

CI now passes on PGP-contacts branch, this can be rebased now.

link2xt avatar Jun 22 '25 12:06 link2xt