php-imap
Problem with subject =?UTF-8?Q? (2)
Describe the bug: When parsing an e-mail message, I run into this issue when calling getSubject().
Used config: I'm using the default package config.
Code to Reproduce: The code section that produces the reported bug.
$client = Client::account('default');
$client->connect();
$inbox = $client->getFolderByPath('INBOX');
$query = $inbox->messages();
$messages = $query->where(['UNSEEN'])->from($email)->get();
foreach ($messages as $message) {
echo $message->getSubject();
}
Expected behavior: In 99% of cases this script works, but for one specific message it returns the subject as:
=?utf-8?Q?Confirmaci=C3=B3n_reserva_Free_Tour_Flo?= =?utf-8?Q?rencia_Esencial_-_Buendiatours.com?=
Desktop / Server (please complete the following information):
- OS: Windows 11 Enterprise
- PHP: XAMPP / PHP 8.2.7
- Version v5.3
- Provider Hetzner Online GmbH
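For reference, the subject in this report consists of two RFC 2047 encoded-words (UTF-8, Q encoding), split in the middle of a word at "Flo" / "rencia". Below is a minimal sketch of decoding such a header by hand, without relying on iconv or the imap extension. `decodeEncodedWords()` is a hypothetical helper, not part of php-imap, and it assumes mbstring is available and recognizes any non-UTF-8 charset it meets.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of php-imap): decode RFC 2047 encoded-words.
 * Assumes mbstring can convert any non-UTF-8 charset named in the header.
 */
function decodeEncodedWords(string $header): string
{
    // Whitespace between two adjacent encoded-words is not part of the text.
    $header = preg_replace('/(\?=)\s+(=\?)/', '$1$2', $header) ?? $header;

    $word = '/=\?([^?\s]+)\?([BQ])\?([^?\s]*)\?=/i';

    return preg_replace_callback($word, static function (array $m): string {
        [, $charset, $encoding, $payload] = $m;

        // B = base64; Q = quoted-printable variant where "_" stands for a space.
        $bytes = strtoupper($encoding) === 'B'
            ? (string) base64_decode($payload)
            : quoted_printable_decode(str_replace('_', ' ', $payload));

        return strcasecmp($charset, 'UTF-8') === 0
            ? $bytes
            : mb_convert_encoding($bytes, 'UTF-8', $charset);
    }, $header) ?? $header;
}

$subject = '=?utf-8?Q?Confirmaci=C3=B3n_reserva_Free_Tour_Flo?= '
    . '=?utf-8?Q?rencia_Esencial_-_Buendiatours.com?=';

echo decodeEncodedWords($subject), PHP_EOL;
// Confirmación reserva Free Tour Florencia Esencial - Buendiatours.com
```

This is only meant to show what a correctly decoded subject looks like for this message, so that the output of getSubject() can be compared against it on each machine.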
@Webklex I tried what you suggested but found no solution. I'm available to explore the issue, but nothing I tried worked.
Hi @paulocardozo, thanks a lot for reporting this issue here. I really appreciate it!
Since you are using a Windows environment, I'm intrigued whether it's related to #413. If you try to access the text body of a message, do you actually receive it, or are the headers included as well?
Additionally, please donate an anonymized version of the troubling mail. This will allow me to create a dedicated test case for this issue. anonymized = remove all personal information you don't want to share with the world :)
Once again, thanks for taking the time and effort to make this library better!
Best regards and happy coding,
@Webklex I can provide it to you in private.
@Webklex It seems to be a local problem. I'm building an application from scratch that constantly checks a mailbox. In the older version of the application your package works fine, but in the newer one it doesn't. I'll investigate further and post as soon as I have news.
Thanks!
@Webklex I just noticed that the same code works on a Unix server (Hostinger) but doesn't work locally on Windows. It seems very strange to me, but I'm still investigating.
Hi @paulocardozo , thanks for the followups. This sounds indeed interesting - but what could it possibly be? Perhaps different default php mods?
@Webklex I really don't know. I've checked the php.ini of every version installed here and nothing seems wrong.
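One way to compare the two machines beyond reading php.ini is to dump the decoding-related extensions and the iconv implementation on each and diff the output. The sketch below is only a suggestion; `decodingEnvironment()` is a hypothetical helper name, and the choice of what to report is an assumption about what might matter here.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: collect settings that can influence header decoding.
 * Run it on both the Windows and the Unix machine and compare the results.
 */
function decodingEnvironment(): array
{
    return [
        'php'             => PHP_VERSION,
        'os_family'       => PHP_OS_FAMILY,
        'iconv_loaded'    => extension_loaded('iconv'),
        // ICONV_IMPL / ICONV_VERSION only exist when the iconv extension is loaded.
        'iconv_impl'      => defined('ICONV_IMPL') ? ICONV_IMPL : null,
        'iconv_version'   => defined('ICONV_VERSION') ? ICONV_VERSION : null,
        'mbstring_loaded' => extension_loaded('mbstring'),
        'imap_loaded'     => extension_loaded('imap'),
        'default_charset' => ini_get('default_charset'),
    ];
}

print_r(decodingEnvironment());
```

If the iconv implementation or the set of loaded extensions differs between the two systems, that would be a plausible place to look first.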
This is the old version, which works:
Working (Old) - ExtractBookingEmailsJob.zip
This is the newest version, which does not work.
Honestly, I have no idea right now.. If anything regarding this pops in my mind I'll let you know for sure. Thanks again for your help!
@Webklex Well, since I'm far behind on a project, I tested the solution from https://github.com/Webklex/php-imap/issues/410#issuecomment-1608876508 and it worked:
`private static function decodeSubject($subject) {
$parts = preg_match_all("/(=\?[^\?]+\?[BQ]\?)([^\?]+)(\?=)[\r\n\t ]*/i", $subject, $m);
$joined_parts = '';
if (count($m[1]) > 1 && !empty($m[2])) {
// Example: GyRCQGlNVTtZRTkhIT4uTlMbKEI=
$joined_parts = $m[1][0].implode('', $m[2]).$m[3][0];
$subject_decoded = iconv_mime_decode($joined_parts, ICONV_MIME_DECODE_CONTINUE_ON_ERROR, "UTF-8");
if ($subject_decoded && trim($subject_decoded) != trim(rtrim($joined_parts, '='))) {
return $subject_decoded;
}
}
// iconv_mime_decode() can't decode:
// =?iso-2022-jp?B?IBskQiFaSEcyPDpuQC4wTU1qIVs3Mkp2JSIlLyU3JSItahsoQg==?=
$subject_decoded = iconv_mime_decode($subject, ICONV_MIME_DECODE_CONTINUE_ON_ERROR, "UTF-8");
// Sometimes iconv_mime_decode() can't decode some parts of the subject:
// =?iso-2022-jp?B?IBskQiFaSEcyPDpuQC4wTU1qIVs3Mkp2JSIlLyU3JSItahsoQg==?=
// =?iso-2022-jp?B?GyRCQGlNVTtZRTkhIT4uTlMbKEI=?=
if (preg_match_all("/=\?[^\?]+\?[BQ]\?/i", $subject_decoded)) {
$subject_decoded = \imap_utf8($subject);
}
if (!$subject_decoded) {
$subject_decoded = $subject;
}
return $subject_decoded;
}`
FYI. In our project we've completely replaced the $this->decode($header->subject) function with the one we've developed (see https://github.com/Webklex/php-imap/issues/410), because the current solution ($this->decode()) is often unable to decode the subject properly:
https://github.com/freescout-helpdesk/freescout/blob/dist/overrides/webklex/php-imap/src/Header.php#L208
And it works like a charm now. So we have not seen any subject which this function could not decode.
I have version 5.5 and the MailHelper library does not exist. Why? I have the same problem: some email subjects are not decoded correctly.
I have created a function to parse the subject:
`private static function decodeSubject($subject) {
$parts = preg_match_all("/(=\?[^\?]+\?[BQ]\?)([^\?]+)(\?=)[\r\n\t ]*/i", $subject, $m);
$joined_parts = '';
if (count($m[1]) > 1 && !empty($m[2])) {
// Example: GyRCQGlNVTtZRTkhIT4uTlMbKEI=
$joined_parts = $m[1][0] . implode('', $m[2]) . $m[3][0];
$subject_decoded = iconv_mime_decode($joined_parts, ICONV_MIME_DECODE_CONTINUE_ON_ERROR, "UTF-8");
if ($subject_decoded && trim($subject_decoded) != trim(rtrim($joined_parts, '='))) {
return $subject_decoded;
}
}
// iconv_mime_decode() can't decode:
// =?iso-2022-jp?B?IBskQiFaSEcyPDpuQC4wTU1qIVs3Mkp2JSIlLyU3JSItahsoQg==?=
$subject_decoded = iconv_mime_decode($subject, ICONV_MIME_DECODE_CONTINUE_ON_ERROR, "UTF-8");
// Sometimes iconv_mime_decode() can't decode some parts of the subject:
// =?iso-2022-jp?B?IBskQiFaSEcyPDpuQC4wTU1qIVs3Mkp2JSIlLyU3JSItahsoQg==?=
// =?iso-2022-jp?B?GyRCQGlNVTtZRTkhIT4uTlMbKEI=?=
if (preg_match_all("/=\?[^\?]+\?[BQ]\?/i", $subject_decoded)) {
$subject_decoded = \imap_utf8($subject);
}
if (!$subject_decoded) {
$subject_decoded = $subject;
}
return $subject_decoded;
}`
Then I do that.
Wonderful, works perfectly, thank you very much. Let's hope the php-imap library fixes this in the future.
Hi guys, sometimes the $message->getSubject() method returns text in quoted-printable. How can I detect the format of the email subject?
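One possible way to detect this is to check whether the returned string still contains RFC 2047 encoded-words and, if so, which charset and encoding each one declares. The following is a minimal sketch; `findEncodedWords()` is a hypothetical helper, not something php-imap provides.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: list the RFC 2047 encoded-words found in a header value.
 * An empty result means the string contains no encoded-words, i.e. it is
 * plain text or has already been decoded.
 */
function findEncodedWords(string $subject): array
{
    preg_match_all(
        '/=\?([^?\s]+)\?([BQ])\?[^?\s]*\?=/i',
        $subject,
        $matches,
        PREG_SET_ORDER
    );

    return array_map(static fn (array $m): array => [
        'charset'  => strtolower($m[1]),
        'encoding' => strtoupper($m[2]), // "Q" (quoted-printable style) or "B" (base64)
    ], $matches);
}

print_r(findEncodedWords('=?utf-8?Q?Confirmaci=C3=B3n_reserva?='));
print_r(findEncodedWords('Already decoded subject'));
```

A non-empty result for the value returned by getSubject() indicates that the subject was left undecoded and still needs to go through a decoder such as the decodeSubject() function posted above.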
The decodeSubject function from paulocardozo does a pretty good job with my test set of subjects, but it cannot decode this one:
=?UTF-8?B?VGlja2V0IE5vOiBb7aC97bOpMTddIE1haWxib3ggSW5ib3ggLSAoMTcpIEluY29taW5nIGZhaWxlZCBtZXNzYWdlcw==?=
I had an issue with it when using some modified Roundcubemail methods (my 10-year-old solution), and I had to make some modifications for this subject. I finally got a solution that decodes it to a valid UTF-8 string, but it's complicated and I need a better one for Webklex/php-imap. Can anybody modify the paulocardozo function above to work with this subject?
We've just checked this subject with the latest version of decodeSubject() function, it was decoded into:
Ticket No: [??????17] Mailbox Inbox - (17) Incoming failed messages
Great. That's the correct subject. The original encoded subject contains a couple of invalid UTF-8 characters, and the function replaces them with question marks.
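For anyone curious about what those invalid characters are: after base64-decoding the payload, the six bytes at offset 12 are ED A0 BD ED B3 A9. That is a UTF-16 surrogate pair (U+D83D U+DCE9) written as two 3-byte sequences, which strict UTF-8 does not allow; presumably the sending software encoded a character outside the Basic Multilingual Plane this way. If the question marks are not acceptable, a sketch like the following could rewrite such pairs into valid 4-byte UTF-8. `repairSurrogatePairs()` is a hypothetical helper and only covers this specific kind of damage.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: rewrite surrogate pairs that were encoded as two
 * 3-byte sequences (CESU-8 style) into the proper 4-byte UTF-8 form.
 */
function repairSurrogatePairs(string $bytes): string
{
    // High surrogate: ED A0..AF xx; low surrogate: ED B0..BF xx.
    // The pattern works on raw bytes, so it deliberately has no /u modifier.
    $pattern = '/\xED([\xA0-\xAF])([\x80-\xBF])\xED([\xB0-\xBF])([\x80-\xBF])/';

    return preg_replace_callback($pattern, static function (array $m): string {
        $hi = 0xD000 | ((ord($m[1]) & 0x3F) << 6) | (ord($m[2]) & 0x3F);
        $lo = 0xD000 | ((ord($m[3]) & 0x3F) << 6) | (ord($m[4]) & 0x3F);
        $cp = 0x10000 + (($hi - 0xD800) << 10) + ($lo - 0xDC00);

        // Encode the code point as a 4-byte UTF-8 sequence.
        return chr(0xF0 | ($cp >> 18))
            . chr(0x80 | (($cp >> 12) & 0x3F))
            . chr(0x80 | (($cp >> 6) & 0x3F))
            . chr(0x80 | ($cp & 0x3F));
    }, $bytes) ?? $bytes;
}

$payload = 'VGlja2V0IE5vOiBb7aC97bOpMTddIE1haWxib3ggSW5ib3ggLSAoMTcpIEluY29taW5nIGZhaWxlZCBtZXNzYWdlcw==';
$raw = (string) base64_decode($payload);

echo bin2hex(substr($raw, 12, 6)), PHP_EOL;   // eda0bdedb3a9
var_dump(preg_match('//u', $raw) === 1);      // bool(false): not valid UTF-8

$fixed = repairSurrogatePairs($raw);
var_dump(preg_match('//u', $fixed) === 1);    // bool(true)
echo bin2hex(substr($fixed, 12, 4)), PHP_EOL; // f09f93a9, i.e. U+1F4E9
```

Whether to repair or simply replace the invalid bytes is a design choice; replacing them with question marks, as decodeSubject() does, is the more conservative option.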
Thanks!