proton-bridge icon indicating copy to clipboard operation
proton-bridge copied to clipboard

Searching older email under Thunderbird mangles messages.

Open petrmanek opened this issue 2 years ago • 18 comments

The main symptom is that when searching for older email, I run across perfectly good messages interleaved with corrupted ones. A corrupted message looks like any other message (e.g. it has a perfectly sane header), however when I open it, its body is completely unreadable. More often than not, the corrupted body resembles subset of a body of some other, seemingly arbitrarily selected message -- this usually appears as some HTML excerpt without beginning and end. More often than not, this happens to older messages which have attachments.

Expected Behavior

When opening an older message, see its body as it appears in ProtonMail web interface.

Current Behavior

Instead I see a seemingly randomly chosen subset of some other message. This is what it looks like:

20230919_10h36m06s_grim

Possible Solution

Uncertain. I would try finding out something that all corrupted messages have in common. That would probably provide a hint about how this corruption happens and why.

Steps to Reproduce

  1. Set up ProtonMail Bridge and Thunderbird with a mailbox containing at least several months' worth of email.
  2. Open Thunderbird and search for some messages (e.g. using Ctrl+K hotkey).
  3. Find some older messages with attachments in the listing.
  4. Double-click messages to see their contents.

Version Information

  • ArchLinux 6.5.3-arch1-1
  • ProtonMail-Bridge (protonmail-bridge-bin) 3.4.2-1
  • Thunderbird (thunderbird) 115.2.2-1
  • Pass (pass-git) 1:1.7.4.r1.g3dd1469-1

Context (Environment)

Here are only relevant variables:

LANG=cs_CZ.UTF-8
LC_MESSAGES=C
LC_TIME=cs_CZ.UTF-8

Detailed Description

This bug is erratic, and I do not know how easy it will be to reproduce for anybody else. However, I have been running ProtonMail-Bridge with Thunderbird for several years now, and it happens in 80% of all my email searches, which is why I am severely tempted not to search for anything in my email history unless I am logged in via the web interface. I am uncertain whether the error is on the side of the Bridge or Thunderbird, but I figure that the developers should at least know about this behavior. Perhaps there are other users who experience this as well.

As far as I understand the messages are not corrupted online, just inside Thunderbird.

UPDATE: I just looked in the body of a corrupted message that I exported as .eml file. And it appears that there are actually several messages in thee, concatenated including headers and bodies. Interestingly, all messages appear to have been received around the same date. For instance, grep ^Date: corrupted_message.eml yields:

Date: Fri, 08 Sep 2023 21:44:04 +0000
Date: Wed, 13 Sep 2023 16:01:52 +0200
Date: Tue, 12 Sep 2023 22:04:22 +0200
Date: Wed, 13 Sep 2023 10:20:17 +0000
Date: Wed, 13 Sep 2023 09:58:28 +0000
Date: Tue, 12 Sep 2023 13:49:01 -0700
Date: Wed, 13 Sep 2023 14:13:37 +0200
Date: Wed, 13 Sep 2023 15:00:37 +0200
Date: Wed, 13 Sep 2023 03:49:44 -0700
Date: Wed, 13 Sep 2023 03:23:57 -0700
Date: Wed, 13 Sep 2023 02:18:34 -0700
Date: Tue, 12 Sep 2023 22:37:32 -0700
Date: Tue, 12 Sep 2023 11:00:00 +0200
Date: Tue, 12 Sep 2023 11:33:34 +0200
Date: Tue, 12 Sep 2023 15:35:14 +0200
Date: Tue, 12 Sep 2023 03:40:24 -0700
Date: Tue, 12 Sep 2023 14:40:38 +0200
Date: Tue, 12 Sep 2023 06:00:01 +0200
Date: Tue, 12 Sep 2023 07:30:37 +0200
Date: Mon, 11 Sep 2023 21:50:18 +0100
Date: Mon, 11 Sep 2023 16:00:42 -0700

Possible Implementation

No idea.

petrmanek avatar Sep 19 '23 08:09 petrmanek

Hey @petrmanek thanks for the report.

Did you continuously use the same bridge set up in a sequence of updates? E.g: start with v3.0 and update all the way through v3.4.x?

Have you tried removing and adding your account and see if the issue is still there? Before you do you can backup the ~/config/protonmail and ~/.local/share/protonmail folders if you wish to restore this state later.

LBeernaertProton avatar Sep 19 '23 08:09 LBeernaertProton

Did you continuously use the same bridge set up in a sequence of updates? E.g: start with v3.0 and update all the way through v3.4.x?

Yes, I am running a bleeding-edge system, so whenever a new Bridge is released I usually update to it within a day or two.

Have you tried removing and adding your account and see if the issue is still there? Before you do you can backup the ~/config/protonmail and ~/.local/share/protonmail folders if you wish to restore this state later.

I have done that once or twice in the past, but I can try that again at some point this week. The only drawback of that is time -- my inbox is huuuge and takes several hours to sync. I will let you know how it went.

petrmanek avatar Sep 19 '23 09:09 petrmanek

Side note:

The reason I filed this bug report today is that I was looking earlier for a specific message with an attachment in Thunderbird. I correctly found the header of said message, but when I opened it I was presented with all-too-familiar mangled body. Having encountered the bug before, I logged into ProtonMail web interface, found the same message, and retrieved its attachment through my browser.

Here comes the interesting bit. While I was writing this bug report, I left the ProtonMail web interface running in a background tab with the message opened. When I came back to it, the previously opened message was nowhere to be found. It was as if it did not exist in my inbox in the first place. I tried repeating my search several times, and finally found the message again -- but this time it only had its header and body, and was completely stripped of all its attachments. It used to have 2 attached PDF files, now it does not have any. Those 2 files appear to be lost now -- fortunately I backed them up some 30 minutes ago, but this is suspiciously weird behavior that could potentially cost someone their data.

I am a software engineer but I have no idea how to make sense of this. Perhaps the Bridge on my PC somehow tried to resolve a sync conflict by deleting the upstream message, but why would it then copy it back without attachments?

If desirable, I can try fetch some logs from Bridge while they are there. I am afraid that not running any particular debugging/verbose mode, so we may be limited to just conventional log messages. Please let me know if you would find that useful.

petrmanek avatar Sep 19 '23 09:09 petrmanek

We only delete messages if:

  • IMAP client performs an expunge from the trash folder
  • Proton Servers instruct us to delete a message (e.g.: other client triggered the delete)

We are aware that thunderbird has a some delays with the expunge instructions. In the latest thunderbird, we often find ourselves using the compact operation to flush pending expunges. Maybe that's what you saw?

I would recommend, when you have time, to repeat this action on cleanly synced account. It's possible that some old bug managed to stick around during upgrades.

Unfortunately, without knowing the proton message id for this message, looking at the logs would be like finding a needle in haystack :|.

LBeernaertProton avatar Sep 19 '23 09:09 LBeernaertProton

We are aware that thunderbird has a some delays with the expunge instructions. In the latest thunderbird, we often find ourselves using the compact operation to flush pending expunges. Maybe that's what you saw?

It is possible. The saga continues: Now, about 20 minutes after my last comment, the message appears in the web interface again with both PDF attachments intact. I think I am going to start making screenshots of this to prove that I am not crazy. :sweat_smile:

I would recommend, when you have time, to repeat this action on cleanly synced account. It's possible that some old bug managed to stick around during upgrades.

Will do. I am currently at work, so I cannot jeopardize my production environment. However, perhaps later in the week/weekend I can try it out.

Unfortunately, without knowing the proton message id for this message, looking at the logs would be like finding a needle in haystack :|.

Is that identical to the Message-Id header? Can I retrieve this somehow using the web interface?

petrmanek avatar Sep 19 '23 09:09 petrmanek

Is that identical to the Message-Id header? Can I retrieve this somehow using the web interface?

Yes. It should be visible in the url bar https://mail.proton.me/.../<id>. But it is best to disable conversation mode to get the correct id.

LBeernaertProton avatar Sep 19 '23 09:09 LBeernaertProton

Yes. It should be visible in the url bar https://mail.proton.me/.../<id>. But it is best to disable conversation mode to get the correct id.

I think I have it. It looks like a base64-encoded string. What can I do next? Should I grep in ~/.local/state/protonmail/bridge/*/*.log for that ID?

petrmanek avatar Sep 19 '23 09:09 petrmanek

If anything is in the logs, you can try a grep by the first 10-12 characters of the ID.

LBeernaertProton avatar Sep 19 '23 10:09 LBeernaertProton

If anything is in the logs, you can try a grep by the first 10-12 characters of the ID.

Good, I got something. Don't know if it is useful, though. Note that I had to look in ~/.local/share/protonmail/bridge-v3/logs rather than ~/.local/state/protonmail/bridge.

filtered.log

petrmanek avatar Sep 19 '23 10:09 petrmanek

Nothing out of the ordinary I'm afraid. The message was created and received some updates (Flags, label/mbox operations).

LBeernaertProton avatar Sep 19 '23 11:09 LBeernaertProton

So far we have not received any reports about corrupted email messages. The only time we experience something similar was due to a problem in an IMAP client that failed to rebuild it's internal cache after UIDValidity change.

LBeernaertProton avatar Sep 19 '23 11:09 LBeernaertProton

Since upgrading to v3.6.1 I have not seen symptoms of this issue for a while, so I am closing it for now. If I ever encounter it again, I will reopen it and post more concrete logs.

petrmanek avatar Nov 08 '23 10:11 petrmanek

@LBeernaertProton Not sure how to approach this, but I have just observed the issue again. Searching for a USPS email with v3.6.1 and Thunderbird v115.5.0 brought up the following result:

20231203_19h27m56s_grim

As you can clearly see, this email is a weird mix of two messages, one from the US Postal Service (USPS), and other from Fermilab (fnal.gov). While the body appears to belong to the former, the Sender, From, To and Subject seem to belong to the latter. Furthermore, when opened the message body is complete gibberish, possibly part of some HTML message:

20231203_19h33m31s_grim

Further inspection reveals that the body actually appears to belong to a third email, which was an automated message originated from my self-hosted GitLab instance. Interestingly, revealing source code in Thunderbird yields the same view. Where normally the message would begin with email headers, I see the same HTML from the middle of a table (as in the previous screenshot). That being said, scrolling further down reveals headers of an email from USPS.

20231203_19h38m29s_grim

These appear valid and relevant, and are followed by message body, which is then again cut off at a seemingly arbitrary location in the middle of the HTML. This, however, appears only in Thunderbird. All three messages can be found uncorrupted in the webmail app.

I would be happy to provide further details (versions, logs), or run tests upon request. Please advise.

petrmanek avatar Dec 03 '23 19:12 petrmanek

@petrmanek Could you repeat the following steps:

  • Stop bridge
  • Start bridge without GUI with trace logs bridge -n -l=trace --log-imap=all > bridge.log
  • Repeat search operations

Note that this log will contain all the data send between the IMAP client and bridge. If you feel comfortable forwarding this, please send it to my email address.

Alternatively, I just need to inspect the the communication for the SEARCH and FETCH commands and their responses.

LBeernaertProton avatar Dec 04 '23 11:12 LBeernaertProton

@LBeernaertProton I have sent you the log file via email.

A couple of observations I made:

  • There are instances of FETCH but no SEARCH.
  • I looked at the tail of the log while I was searching. Nothing was emitted when I hit enter on mailbox-wide search in Thunderbird.

This leads me to believe that Thunderbird's search feature is somehow only inquiring its local database, not the Bridge. Would that be your conclusion as well? Can I somehow force Thunderbird to search the server rather than its local cache?

petrmanek avatar Dec 04 '23 14:12 petrmanek

In this case, it seems that thunderbird is performing the search.

What I would like you to check is if the fetch response have the garbled emails you see.

EDIT: you also need to set GLUON_LOG_IMAP_LINE_LIMIT=5000 to see the full imap responses.

LBeernaertProton avatar Dec 04 '23 14:12 LBeernaertProton

Right, here is what I tried:

  • I repeated the experiment with GLUON_LOG_IMAP_LINE_LIMIT=5000. This did not produce any new meaningful results, only more output. Specifically, message bodies were dumped into the log.
  • I tried more searches, including words that Thunderbird previously had not search for, to force a cache miss. This did not produce an extra FETCH or SEARCH calls in the log. It appears that Thunderbird is operating entirely on a local mailbox, and does not interact with the Bridge during search.

I am running out of ideas. Short of configuring Thunderbird to re-download my entire 15 GB inbox, I do not see a viable way to force this behavior to happen again. If only there was a way to force Thunderbird to clear specific email out local cache! Another option would be to run the Bridge continuously in the tracing mode for the foreseeable future and harvest the logs after the next incident.

What do you think?

petrmanek avatar Dec 05 '23 09:12 petrmanek

The thing I'm looking for is if the message bodies returned by the FETCH command after the search are not malformed.

Either one of these might be happening:

  • The data returned to Thunderbird is correct, but Thunderbird is mangling the Data
  • The data returned to Thunderbird is not correct, but is correctly stored on disk (transmission bug)
  • The data returned to Thunderbird is not correct, because it is not stored on disk properly

All my existing tests of my mailbox are working. The only way to determine what is happening on your machine is:

  • Find a message with a broken content
  • Locate the response in the logs which has that message to see what the message UID is and if the returned body is also malformed
  • Finally telnet into IMAP and manually request only that message to see if is also corrupted.

LBeernaertProton avatar Dec 05 '23 10:12 LBeernaertProton