Mail body search doesn't work with non-latin characters
Steps to reproduce
- Make sure there's at least one message with non-Latin text in its body (e.g. Hangul or Cyrillic) in your mailbox
- Try to find this message by its body text using simple or advanced search
Expected behavior
Search query returns email with corresponding text
Actual behavior
Search query returns no results
Mail app version
3.6.0-alpha.3
Mailserver or service
Zimbra
Operating system
Windows 10
PHP engine version
PHP 8.2
Web server
Nginx
Database
PostgreSQL
Additional info
Thanks to https://github.com/nextcloud/mail/pull/8897 we now have mail body search, but it seems to work correctly only with Latin characters.
You must run a developer setup. Try again with debug mode enabled. Inspect data/horde_imap.log and try to locate the body search. Check if the non-Latin character is sent correctly and processed by the IMAP server.
@ChristophWurst it seems that on the backend side NC doesn't allow non-ASCII characters:
For instance, here's a search query link for the word "тест":
http://localhost/index.php/apps/mail/api/messages?mailboxId=1&filter=body:%25D1%2582%25D0%25B5%25D1%2581%25D1%2582&limit=20
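To check what the client actually sent, the `filter` value from that link can be decoded by hand. The sketch below is only an illustration, not code from the Mail app. It assumes the value was percent-encoded twice (the `%25` prefixes suggest this), and the variable names are made up for this example.

```php
<?php
// Filter value copied from the request URL above, without the "body:" prefix.
$raw = '%25D1%2582%25D0%25B5%25D1%2581%25D1%2582';

// First pass turns each "%25" back into "%".
$once = rawurldecode($raw);

// Second pass yields the UTF-8 bytes of the search term.
$term = rawurldecode($once);

// Any byte outside the 7-bit range means the term is not plain ASCII.
$isAscii = preg_match('/[^\x00-\x7F]/', $term) === 0;

echo $once, PHP_EOL;
echo $term, PHP_EOL;
echo $isAscii ? 'ASCII' : 'non-ASCII', PHP_EOL;
```

If the decoded term is valid UTF-8 here, the request itself is probably fine and the place to look is how the term is handed to the IMAP server.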
And here comes the problem:
Did a quick PoC and this appears to be fixable in \OCA\Mail\IMAP\Search\Provider::convertMailQueryToHordeQuery if we tell Horde to search with charset UTF-8.
diff --git a/lib/IMAP/Search/Provider.php b/lib/IMAP/Search/Provider.php
index bd4fd9247..a882095ae 100644
--- a/lib/IMAP/Search/Provider.php
+++ b/lib/IMAP/Search/Provider.php
@@ -76,6 +76,7 @@ class Provider {
             $searchQuery->getBodies(),
             static function (Horde_Imap_Client_Search_Query $query, string $textToken) {
                 $query->text($textToken, true);
+                $query->charset('UTF-8');
                 return $query;
             },
             new Horde_Imap_Client_Search_Query()
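For context on why the charset matters: in IMAP (RFC 3501), a `SEARCH` command can carry an optional `CHARSET` argument, and a non-ASCII search string is typically sent as a literal prefixed with its byte length. The sketch below builds such a command string by hand to show roughly what one would hope to find in data/horde_imap.log once the patch is applied. It is an illustration only: `buildBodySearch` and the tag `A1` are hypothetical, and the exact command Horde emits may differ.

```php
<?php
/**
 * Build a raw IMAP SEARCH BODY command for a UTF-8 search term.
 * Hypothetical helper for illustration; not part of Horde or the Mail app.
 * Simplification: a real client waits for the server's continuation
 * response before sending the literal data.
 */
function buildBodySearch(string $tag, string $term): string {
    // IMAP literals are prefixed with the length in bytes, not characters.
    $length = strlen($term);
    return sprintf("%s SEARCH CHARSET UTF-8 BODY {%d}\r\n%s\r\n", $tag, $length, $term);
}

// "тест" written with escapes so the result does not depend on file encoding.
$command = buildBodySearch('A1', "\u{0442}\u{0435}\u{0441}\u{0442}");
echo $command;
```

Note that the literal length is 8, not 4, because each Cyrillic letter in the term takes two bytes in UTF-8.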