mailcow-dockerized
Incorrect encoding in non-latin quarantined mails, also after release
Prior to placing the issue, please check following: (fill out each checkbox with an X
once done)
- [X] I understand that not following or deleting the below instructions will result in immediate closure and/or deletion of my issue.
- [X] I have understood that this bug report is dedicated for bugs, and not for support-related inquiries.
- [X] I have understood that answers are voluntary and community-driven, and not commercial support.
- [X] I have verified that my issue has not been already answered in the past. I also checked previous issues.
Summary
Mailcow commit a832becbd530603710a823be526a9ec4d9f1f89d. If an email in Windows-1251 encoding (others may be affected as well) gets quarantined, its text is not displayed correctly in the quarantine web interface, and the email remains unreadable after release.
Logs
brokenmails.zip contains two copies of the same email: one in the correct encoding, exported from the junk folder, and one as delivered to the inbox by the quarantine release.
Reproduction
- Get quarantined email in Russian, with Windows-1251 encoding
- Try to release the email
- Receive unreadable email in inbox
Unfortunately, I can no longer show you a screenshot of the quarantine web interface, because I trained similar emails as ham and they no longer go to quarantine.
System information
Question | Answer |
---|---|
My operating system | Linux Ubuntu 20.04 |
Is Apparmor, SELinux or similar active? | Yes, AppArmor. No issues with it in audit logs. |
Virtualization technology (KVM, VMware, Xen, etc - LXC and OpenVZ are not supported) | Bare metal |
Server/VM specifications (Memory, CPU Cores) | 4 cores, 16 GB RAM |
Docker Version (docker version) | 20.10.1 |
Docker-Compose Version (docker-compose version) | 1.27.4, build 40524192 |
Reverse proxy (custom solution) | Custom configuration, did not touch Mailcow configs, irrelevant |
- Output of `git diff origin/master`, any other changes to the code? No.
- All third-party firewalls and custom iptables rules are unsupported. Please check the Docker docs about how to use Docker with your own ruleset. Nevertheless, iptables output can help us to help you: `iptables -L -vn`, `ip6tables -L -vn`, `iptables -L -vn -t nat` and `ip6tables -L -vn -t nat`.
- DNS problems? Please run `docker exec -it $(docker ps -qf name=acme-mailcow) dig +short stackoverflow.com @172.22.1.254` (set the IP accordingly, if you changed the internal mailcow network) and post the output.
You don't have a db dump anymore, right?
Or any other mail with that problem currently in your quarantine?
@andryyy I've used password recovery and got the message in quarantine, it's broken. How should I proceed?
Dunno, the mails seem to have encoding problems in general. :/
Try to receive a new post notification. It seems that registration/password reminder emails have no space between some header names and values, but post notifications do.
Please give me more time for this.
I think the mail encoding is a bit messed up, but I'm not sure yet...
The subject seems to be read as UTF-8 (perhaps?). Not sure.
Here's the original email, a notification of a new forum message. The one in the first post went ruboard → [email protected] (mail.ru) → [email protected] (mailcow). This one is from the [email protected] mailbox.
As you can see, the subject is encoded and is in Windows-1251, but the Content-type header has no space between its name and value: Content-type:text/plain;charset=Windows-1251. Maybe that's an issue.
Message16103458220307880921.zip
The password reminder message, on the contrary, has a proper Content-Type: text/plain; charset=Windows-1251 (with a space), but its Subject is not encoded: Subject: Забыли пароль?
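To check whether the missing space after the header name matters on its own, here is a minimal sketch using Python's standard `email` package. This is not the parser mailcow uses; the sample message is made up for illustration, with only the Content-type line copied from the mail above.

```python
from email import message_from_bytes

# Hypothetical sample: a header without a space after the colon,
# followed by a body encoded in Windows-1251 (cp1251).
raw = (
    b"Subject: test\r\n"
    b"Content-type:text/plain;charset=Windows-1251\r\n"
    b"\r\n"
    + "Забыли пароль?".encode("cp1251")
    + b"\r\n"
)

msg = message_from_bytes(raw)

# The parser accepts the header without a space after the colon.
content_type = msg.get_content_type()
charset = msg.get_content_charset()

# decode=True returns the body as bytes; decode it with the declared charset.
body = msg.get_payload(decode=True).decode(charset)

print(content_type, charset, body.strip())
```

If a parser handles this sample, the missing space is probably not the cause by itself, and the unencoded 8-bit Subject is the more likely suspect.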
That's a good catch. :) I will check that.
Here's another broken message, this time from Google Groups.
message.zip
This message contains strange ОÑ symbols in the header, near the To field. This is what Google sends for some reason (it persists in older messages as well).
X-BeenThere: [email protected]
Received: by 2002:a1c:2e50:: with SMTP id u77ls1076220wmu.2.canary-gmail; Tue,
08 Dec 2020 05:02:21 -0800 (PST)
X-Received: by 2002:a05:600c:268b:: with SMTP id 11mr3827005wmt.78.1607432541168;
Tue, 08 Dec 2020 05:02:21 -0800 (PST)
MIME-Version: 1.0
To: =?UTF-8?B?0JzQvtC00LXRgNCw0YLQvtGA0Ysg0YHQv9Cw0LzQsA==?= <[email protected]>
От: [email protected]
Subject: =?UTF-8?B?W2FjXSDQntGC0YfQtdGCINC80L7QtNC10YDQsNGC0L7RgNCwINC+INGB0L/QsNC80LUg?=
=?UTF-8?B?0LIg0LPRgNGD0L/Qv9C1IGFudGljZW5zb3JpdHlAZ29vZ2xlZ3JvdXBzLmNvbQ==?=
Message-ID: <[email protected]>
Date: Tue, 08 Dec 2020 13:02:21 +0000
Content-Type: text/plain; charset="UTF-8"
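The To and Subject values above are RFC 2047 encoded-words, which are valid. A small sketch of decoding one with Python's standard library (the helper name `decode_mime_words` is made up for this example):

```python
from email.header import decode_header


def decode_mime_words(value):
    """Decode RFC 2047 encoded-words in a header value into a plain string."""
    parts = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, bytes):
            # Unlabelled byte chunks are treated as ASCII here (an assumption).
            parts.append(chunk.decode(charset or "ascii", errors="replace"))
        else:
            parts.append(chunk)
    return "".join(parts)


# Display name from the To header above.
display_name = decode_mime_words(
    "=?UTF-8?B?0JzQvtC00LXRgNCw0YLQvtGA0Ysg0YHQv9Cw0LzQsA==?="
)
print(display_name)
```

The header values decode cleanly; the part that does not fit the format is the raw, unencoded header name on the line after To.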
I only see those content type failures with Russian mail, and not even with all of it. One needs to check whether the mails are correctly encoded/formatted, and whether we really want to work around it if they are not.
The previous "originals" also messed up my local mail client.
I work with Cyrillic every day, and postfix handles it all correctly. This issue is on the sender side only, and I don't think there must be, or even can be, any fix for a sender who sends mail with an incorrect MIME type/encoding.
And when you create contacts in Russian, are they displayed correctly? I get question marks instead of Russian letters. I am looking for an answer to this problem myself. In the demo on the SOGo website and in the mailcow demo everything is OK, but in my installation I get ????? characters like these.
Do you use an external SQL?
no, I have an official docker compose. 19 containers.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
The issue is still not fixed, please reopen. I can provide fresh .eml files.
@andryyy, I also can provide database dumps. Not removing the quarantine data yet.
That would be great. Can you mail to @.*** ?
I will need some time though as I’m currently in hospital.
On 01.01.2022 at 19:57, ValdikSS @.***> wrote:
@andryyy, I also can provide database dumps. Not removing the quarantine data yet.
The email is not shown. Please mail me at [email protected], I'll mail you back.
If these errors only happen with the same wrongly encoded mails from your previously sent items I will not work on it. The sender will need to fix their issues then as it was stated before.
I don't think we are responsible to fix that. :/
Drago works with Russian mail all the time. It is fine for him. Your example mail was totally broken.
No, this time it's a Google Groups email. And others, I need to check.
Please guide me on what I should do. Right now the email looks like this:
Haha, this email has Russian "От:" in the email header instead of "From:".
(facepalm) omg 😱
So right now there are two issues with Mailcow:
- The quarantine system cuts the headers off at the first non-7-bit-ASCII symbol and not at \r\n\r\n
- The message is re-encoded when entering quarantine and again when released, which is why broken mails are still broken after being released from quarantine.
For 1), mailcow should split the headers from the body by searching for \r\n\r\n, and for 2), mailcow should not assume an encoding and should treat emails as a sequence of bytes, at least for releasing.
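The two suggestions can be sketched roughly as follows, in Python rather than in mailcow's own code. The function name `split_headers_body` and the sample message are made up for illustration; the bare `\n\n` fallback is an extra assumption for messages stored with Unix line endings.

```python
from typing import Tuple


def split_headers_body(raw: bytes) -> Tuple[bytes, bytes, bytes]:
    """Split a raw message at the first blank line, without decoding anything.

    Returns (headers, separator, body). If no blank line is found, the whole
    input is returned as headers.
    """
    found = []
    for sep in (b"\r\n\r\n", b"\n\n"):
        pos = raw.find(sep)
        if pos != -1:
            found.append((pos, sep))
    if not found:
        return raw, b"", b""
    pos, sep = min(found)  # earliest separator wins
    return raw[:pos], sep, raw[pos + len(sep):]


# Hypothetical broken message: a non-ASCII header name and a cp1251 body.
raw = (
    b"To: someone@example.org\r\n"
    + "От: sender@example.org".encode("utf-8")
    + b"\r\n"
    + b"Content-Type: text/plain; charset=Windows-1251\r\n"
    + b"\r\n"
    + "Привет".encode("cp1251")
)

headers, sep, body = split_headers_body(raw)

# Storing and releasing the three parts as bytes reproduces the original exactly.
released = headers + sep + body
print(released == raw)
```

Because nothing is decoded, a message that arrives broken is released exactly as broken as it arrived, and a message that arrives intact cannot be damaged by a wrong charset guess.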
We use a very popular mail parser. I think your mails are a bit off.
Best regards, André Peters
On 02.01.2022 at 11:45, ValdikSS @.***> wrote:
So right now there are two issues with Mailcow:
- The quarantine system breaks the headers on the first non-8-bit-ascii symbol and not on \r\n\r\n
- The message is re-encoded when entering quarantine and when released, that's why broken mails are released broken after quarantine.
For 1) mailcow should split headers from the body by searching \r\n\r\n, and for the 2) mailcow should not assume encoding and treat emails as a sequence of bytes, at least for releasing.
Sure they are, but this shows that even giants such as Google can make a programming error and put a translated string into the headers.
It would be handy to have a more lenient parser, for better compatibility with broken emails.
I'm not sure a mail header name can be written in non-Latin characters at all. I really don't see how it should be parsed, by rspamd as well. If you make an SQL dump of this email, can you send it to me? In Telegram, for example. I can't fix it, but I want to take a look. I have never come across such emails.
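For reference, RFC 5322 restricts header field names to printable US-ASCII characters (codes 33 to 126) other than the colon, so a name like "От" is not valid. A minimal check, with a made-up function name:

```python
def is_valid_field_name(name: bytes) -> bool:
    """Check a header field name against the RFC 5322 field-name rule:
    one or more printable US-ASCII bytes (33-126), excluding ':' (58)."""
    return len(name) > 0 and all(33 <= b <= 126 and b != 58 for b in name)


from_ok = is_valid_field_name(b"From")
ot_ok = is_valid_field_name("От".encode("utf-8"))
print(from_ok, ot_ok)
```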
Here's the original .eml [ac] Отчет модератора о спаме в группе [email protected] - [email protected] - 2020-12-08 1602.zip
Quite an old email, and they even signed "От" in DKIM... Rspamd can't parse it either.