php-imap

Issue when parsing header if the from field contains a semicolon

Open nilshellerhoff opened this issue 2 years ago • 3 comments

Describe the bug

When parsing the header of an email whose From field contains a semicolon (";"), the From field is not parsed correctly. Minimal example of such an email:

To: [email protected]
Subject: Test of a semicolon in from-header
Date: Wed, 12 Oct 2022 16:31:06 +0000
From: "Foo; Bar" <foobar@domain.tld>
Message-ID: <[email protected]>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

<< email body >>

The raw email is attached to avoid issues with line breaks (I changed the extension to .txt because GitHub doesn't support .eml): semicolon_test.txt

Used config

Default.

Code to Reproduce

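<?php
require __DIR__ . '/vendor/autoload.php'; // assumes Composer's autoloader sits next to test.php

$raw_mail = file_get_contents('semicolon_test.txt');
$header = new \Webklex\PHPIMAP\Header($raw_mail);
var_dump($header->get('from'));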

Output when running this via php test.php:

PHP Warning:  Trying to access array offset on value of type null in ***/vendor/webklex/php-imap/src/Header.php on line 457
PHP Warning:  Trying to access array offset on value of type null in ***/vendor/webklex/php-imap/src/Header.php on line 457
object(Webklex\PHPIMAP\Attribute)#18 (2) {
  ["name":protected]=>
  string(4) "from"
  ["values":protected]=>
  array(1) {
    [0]=>
    string(4) ""Foo"
  }
}

Expected behavior

When we remove the semicolon from the From header, we get the expected result:

PHP Warning:  Trying to access array offset on value of type null in ***/vendor/webklex/php-imap/src/Header.php on line 457
PHP Warning:  Trying to access array offset on value of type null in ***/vendor/webklex/php-imap/src/Header.php on line 457
object(Webklex\PHPIMAP\Attribute)#6 (2) {
  ["name":protected]=>
  string(4) "from"
  ["values":protected]=>
  array(1) {
    [0]=>
    object(Webklex\PHPIMAP\Address)#7 (5) {
      ["personal"]=>
      string(9) ""Foo Bar""
      ["mailbox"]=>
      string(6) "foobar"
      ["host"]=>
      string(10) "domain.tld"
      ["mail"]=>
      string(17) "foobar@domain.tld"
      ["full"]=>
      string(29) ""Foo Bar" <foobar@domain.tld>"
    }
  }
}

Desktop / Server (please complete the following information):

  • OS: Linux Mint 21 (= Ubuntu 22.04)

  • PHP: 7.4 and 8.1

  • Version 4.0.2

  • Provider: The mail that triggered the issue was sent through the "Contact Form 7" plugin on a WordPress instance.

EDIT: I am not actually sure that a semicolon in the header fields conforms to the spec, but Gmail, Thunderbird and also PHPMailer handle these mails correctly.
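
On the spec question: as far as I can tell, RFC 5322 permits a semicolon inside a quoted-string, so a display name like "Foo; Bar" should be valid. A parser could therefore ignore semicolons that sit inside double quotes. Below is a minimal sketch of such a check. The function name `hasUnquotedSemicolon` is hypothetical and not part of php-imap, and the sketch does not handle RFC 5322 comments in parentheses.

```php
<?php
// Hypothetical helper, not part of php-imap: reports whether a header value
// contains a semicolon outside of double-quoted strings.
function hasUnquotedSemicolon(string $value): bool
{
    $inQuotes = false;
    $length = strlen($value);
    for ($i = 0; $i < $length; $i++) {
        $char = $value[$i];
        if ($inQuotes && $char === '\\') {
            $i++; // skip the escaped character inside a quoted string
            continue;
        }
        if ($char === '"') {
            $inQuotes = !$inQuotes;
        } elseif ($char === ';' && !$inQuotes) {
            return true;
        }
    }
    return false;
}

var_dump(hasUnquotedSemicolon('"Foo; Bar" <foobar@domain.tld>')); // bool(false)
var_dump(hasUnquotedSemicolon('text/plain; charset=UTF-8'));      // bool(true)
```

With a check like this, the From header above would not be treated as carrying an extension, while the Content-Type header still would.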

nilshellerhoff avatar Oct 14 '22 16:10 nilshellerhoff

I have the same problem with subjects containing a semicolon. The subject content is removed after the semicolon.

Sample:

Subject: This is an example; for example

Results in:

echo $message->subject;
This is an example

ojgarciab avatar Oct 28 '22 10:10 ojgarciab

The problem originates here:

https://github.com/Webklex/php-imap/blob/45843e1554cc280c738278b9e3b6af35a91f8b1f/src/Header.php#L654-L686

I'm not very well versed in email handling, but from reading this (although it is of course not an authoritative source), it seems to me that certain fields, including subject ..., should probably be excluded from extension parsing. In addition to checking for semicolons, the parser should only treat the text as an extension if it finds a key=value pair after the semicolon.
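
To make the idea concrete, here is a rough sketch of what I have in mind. It is not php-imap's actual implementation: the function name `splitHeaderParameters` and the list of parameter-carrying fields are assumptions for illustration only. It also splits naively on every semicolon, so a quoted parameter value that itself contains a semicolon would not be handled.

```php
<?php
// Hypothetical sketch, not php-imap's actual code: split a header value into
// its main value and key=value parameters, but only for fields expected to
// carry parameters and only if every part after the first semicolon looks
// like key=value.
function splitHeaderParameters(string $field, string $value): array
{
    // Assumption: only these fields carry parameters; the list is illustrative.
    $parameterFields = ['content-type', 'content-disposition'];
    if (!in_array(strtolower($field), $parameterFields, true)) {
        return ['value' => $value, 'params' => []];
    }

    $parts = explode(';', $value);
    $main = trim(array_shift($parts));
    $params = [];
    foreach ($parts as $part) {
        if (trim($part) === '') {
            continue; // tolerate a trailing semicolon
        }
        // Require a key, an equals sign, and a non-empty value.
        if (!preg_match('/^\s*([^\s=;]+)\s*=\s*(.+?)\s*$/', $part, $match)) {
            // Not a key=value pair: keep the whole value as plain text.
            return ['value' => $value, 'params' => []];
        }
        $params[strtolower($match[1])] = trim($match[2], '"');
    }
    return ['value' => $main, 'params' => $params];
}

print_r(splitHeaderParameters('Content-Type', 'text/plain; charset=UTF-8'));
print_r(splitHeaderParameters('Subject', 'This is an example; for example'));
```

The first call yields `text/plain` with a `charset` parameter; the second returns the subject unchanged, including the text after the semicolon.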

Could you comment on this, @Webklex? Otherwise I can also open a PR over the weekend.

nilshellerhoff avatar Nov 02 '22 17:11 nilshellerhoff

I have the same problem with subjects containing a semicolon. The subject content is removed after the semicolon.

Subject : Test; ticket <<support-id=744482>>

abhilashpa39 avatar May 30 '24 10:05 abhilashpa39