
Request to improve the comment retrieval feature from Received Header

Open mariuszkrzaczkowski opened this issue 5 years ago • 11 comments

Referring to issue https://github.com/zbateson/mail-mime-parser/issues/152: it would be very helpful to be able to tell which part of the Received header each comment comes from.

An example of how it is now:

Array
(
    [0] => [104.206.174.27] helo=mail.integralstock.com
    [1] => Exim 4.94
    [2] => envelope-from
)

An example of how it could be more useful:

Array
(
    'from' => [104.206.174.27] helo=mail.integralstock.com
    'by' => Array
        (
            [1] => Exim 4.94
            [2] => envelope-from
        )
)

https://github.com/zbateson/mail-mime-parser/blob/4ebaf86ac247571d85797c5b7caf670c6a359142/src/Header/ReceivedHeader.php#L112-L113

mariuszkrzaczkowski avatar Dec 11 '20 11:12 mariuszkrzaczkowski

I think I found a place where it is created. https://github.com/zbateson/mail-mime-parser/blob/4ebaf86ac247571d85797c5b7caf670c6a359142/src/Header/Consumer/CommentConsumer.php#L106-L121

mariuszkrzaczkowski avatar Dec 11 '20 11:12 mariuszkrzaczkowski

Hi @mariuszkrzaczkowski --

In your case it would make more sense to look at the parts using 'getParts'.

The comment parts aren't necessarily 'part' of a received part (I don't think that's guaranteed), but you could keep track of which 'received part' came last, for example:

$parts = $receivedHeader->getParts();
$last = null;
foreach ($parts as $p) {
  if ($p instanceof \ZBateson\MailMimeParser\Header\Part\ReceivedPart) {
    $last = $p;
  } elseif ($p instanceof \ZBateson\MailMimeParser\Header\Part\CommentPart) {
    // use $last to figure out what the last ReceivedPart was
  }
}

zbateson avatar Dec 11 '20 18:12 zbateson

This example seems to be incorrect:

Fatal error: Uncaught Error: Call to undefined method ZBateson\MailMimeParser\Header\ReceivedHeader::getAllParts() in

mariuszkrzaczkowski avatar Dec 11 '20 22:12 mariuszkrzaczkowski

My bad, it's just getParts. I've updated the previous comment/example as well.

zbateson avatar Dec 11 '20 22:12 zbateson

The example works, but I am missing information about which part (e.g. from, by) each comment relates to.

mariuszkrzaczkowski avatar Dec 11 '20 22:12 mariuszkrzaczkowski

You are right; thanks to your example I can get that information.

mariuszkrzaczkowski avatar Dec 11 '20 22:12 mariuszkrzaczkowski

Maybe someone will need the full code. Do you think such a function would be useful in the library?

$parts = $received->getParts();
$comment = [];
$lastReceivedPart = null;
foreach ($parts as $p) {
	if ($p instanceof \ZBateson\MailMimeParser\Header\Part\ReceivedPart) {
		$lastReceivedPart = $p->getName();
	} elseif ($p instanceof \ZBateson\MailMimeParser\Header\Part\CommentPart) {
		$comment[$lastReceivedPart][] = $p->getComment();
	}
}
print_r($comment);

Result:

Array
(
    [id] => Array
        (
            [0] => envelope-from 
        )

)
Array
(
    [from] => Array
        (
            [0] => [104.206.174.27] helo=mail.integralstock.com
        )

    [with] => Array
        (
            [0] => Exim 4.94
            [1] => envelope-from 
        )

)

mariuszkrzaczkowski avatar Dec 11 '20 22:12 mariuszkrzaczkowski

Maybe it's worth adding a method like this to ZBateson\MailMimeParser\Header\ReceivedHeader:

public function getCommentsByType():array
{
	$comment = [];
	$last = null;
	foreach ($this->getParts() as $p) {
		if ($p instanceof \ZBateson\MailMimeParser\Header\Part\ReceivedPart) {
			$last = $p->getName();
		} elseif ($p instanceof \ZBateson\MailMimeParser\Header\Part\CommentPart) {
			$comment[$last][] = $p->getComment();
		}
	}
	return $comment;
}
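
For anyone who wants to try the grouping logic without patching the library, here is a minimal standalone sketch of the same idea. StubReceivedPart, StubCommentPart, groupCommentsByPart and the '_unassigned' key are all made-up names for illustration, not part of mail-mime-parser; the stubs only mimic the getName() and getComment() getters used above. It assumes PHP 8.0+ and differs from the method above in two small ways: comments that appear before any named part are collected under a fallback key instead of a null key, and each comment is trimmed.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-ins for the library's ReceivedPart and CommentPart,
// so this sketch runs on its own without Composer.
final class StubReceivedPart
{
    public function __construct(private string $name) {}
    public function getName(): string { return $this->name; }
}

final class StubCommentPart
{
    public function __construct(private string $comment) {}
    public function getComment(): string { return $this->comment; }
}

/**
 * Groups comments under the name of the received part that precedes them.
 * Comments seen before any named part go under $orphanKey (an assumption
 * of this sketch, not library behaviour).
 */
function groupCommentsByPart(
    iterable $parts,
    string $receivedClass,
    string $commentClass,
    string $orphanKey = '_unassigned'
): array {
    $grouped = [];
    $last = $orphanKey;
    foreach ($parts as $p) {
        if ($p instanceof $receivedClass) {
            // Remember the most recent named part (from, by, with, ...).
            $last = $p->getName();
        } elseif ($p instanceof $commentClass) {
            // Attach the comment to whichever named part came last.
            $grouped[$last][] = trim($p->getComment());
        }
    }
    return $grouped;
}

// Sample data mirroring the header from this issue.
$parts = [
    new StubReceivedPart('from'),
    new StubCommentPart('[104.206.174.27] helo=mail.integralstock.com'),
    new StubReceivedPart('with'),
    new StubCommentPart('Exim 4.94'),
    new StubCommentPart('envelope-from '),
];

$result = groupCommentsByPart($parts, StubReceivedPart::class, StubCommentPart::class);
print_r($result);
```

With the real library, the same function should work by passing the parts from getParts() together with \ZBateson\MailMimeParser\Header\Part\ReceivedPart::class and \ZBateson\MailMimeParser\Header\Part\CommentPart::class, as long as those classes expose the getters used in this thread.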

mariuszkrzaczkowski avatar Dec 12 '20 21:12 mariuszkrzaczkowski

Actually, it's already kind of separated and parsed in the various 'Consumer' classes. Currently the comment portion is returned separately if it doesn't match what the consumer expects to parse into its various parts. You can see that here:

https://github.com/zbateson/mail-mime-parser/blob/674795deb8c8043746a69885b75475ce94f20925/src/Header/Consumer/Received/DomainConsumer.php#L110-L123

It returns separate parts for the comment and domain part (if the comment wasn't matched to what's expected).

zbateson avatar Dec 13 '20 22:12 zbateson

That could maybe be changed so the comment is both returned separately and made part of the 'ReceivedDomainPart', so that if it's not matched at least the raw value is available. I'm not sure; I'll have to think about it a little and make sure that doesn't break something else.

zbateson avatar Dec 13 '20 22:12 zbateson

It would be nice if you could get to it from the ReceivedHeader object, i.e. reach the values that DomainConsumer found not compliant with the rule.

mariuszkrzaczkowski avatar Dec 14 '20 07:12 mariuszkrzaczkowski

I'm not sure what the status of this one is... with #153 merged and #152 fixed is this still an issue? Feel free to reopen.

zbateson avatar Feb 14 '23 22:02 zbateson