Mail Collector: Issue with Parsing Emails Containing Multiple <body> Tags

Open sothmani opened this issue 9 months ago • 3 comments

Code of Conduct

  • [x] I agree to follow this project's Code of Conduct

Is there an existing issue for this?

  • [x] I have searched the existing issues

Version

10.0.18

Bug description

Title: Mail Collector: Issue with Parsing Emails Containing Multiple <body> Tags

Description:

Hello,

We're facing an issue with the Mail Collector component during the processing of incoming emails for ticket creation and responses. Over the past two months, users have started receiving empty tickets or replies.

Root Cause: We traced the problem to a change in the structure of incoming emails. Our organization uses PhishAlarm, which recently started injecting additional HTML content, including its own <body> tag. This results in emails containing multiple <body> tags.

The current implementation of Mail Collector expects a single <body> tag, leading to incorrect parsing and loss of content.

Current Code (Before):

$body_matches = [];
if (preg_match('/<body[^>]*>\s*(?<body>.+?)\s*<\/body>/is', $content, $body_matches) === 1) {
    $content = $body_matches['body'];
}

This code only matches one <body> tag and assumes it's the right one.
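The first-match behavior can be reproduced in isolation. The snippet below is a minimal sketch, not GLPI code: the sample HTML is a made-up stand-in for a message where a tool has injected its own body ahead of the real one, and only the regular expression is taken from the report.

```php
<?php
// Hypothetical message: an injected banner body followed by the real one.
$content = '<html><body><p>Injected banner</p></body>'
         . '<body><p>Actual message</p></body></html>';

$body_matches = [];
// The lazy quantifier stops at the first closing </body>, so only the
// first body element is ever captured.
if (preg_match('/<body[^>]*>\s*(?<body>.+?)\s*<\/body>/is', $content, $body_matches) === 1) {
    $content = $body_matches['body'];
}

echo $content, PHP_EOL; // <p>Injected banner</p>
```

With this input the actual message is dropped, which is consistent with the empty or truncated tickets described in the report.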

Temporary Fix (Now):

$body_matches = [];
if (preg_match('/<body[^>]*>\s*(?<body>.+?)\s*<\/body>/is', $content, $body_matches) === 2) {
    $content = $body_matches['body'];
}

By changing the expected match count from 1 to 2, we can capture the second <body> tag, which in our case contains the actual message content. This workaround has restored correct behavior for now.

Suggestion: This fix is brittle and environment-specific. Instead, we think it would be better for GLPI to support multiple <body> tags.

A more resilient solution would help support email structures from various third-party tools without requiring manual code adjustments.

Thank you

Relevant log output


Page URL

No response

Steps To reproduce

No response

Your GLPI setup information

No response

Anything else?

No response

sothmani avatar May 23 '25 14:05 sothmani

HTML with multiple body tags isn't valid HTML. If GLPI changes to guess how a malformed HTML document is intended to function, I think this will just lead to more issues and complexity.

This sounds more like an issue with the browser/email client extension.

cconard96 avatar May 25 '25 13:05 cconard96

if (preg_match('/<body[^>]*>\s*(?<body>.+?)\s*<\/body>/is', $content, $body_matches) === 2) { will never be true, because preg_match() only returns 1, 0, or false. It means that the content will not be filtered at all. I do not really know how we should handle the presence of multiple body tags. Maybe we should concatenate the contents of all the <body> tags into a single one.
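As a rough illustration of the concatenation idea, here is a minimal sketch using preg_match_all(). The function name concat_body_contents() is hypothetical and not part of GLPI. The pattern also uses .*? instead of the original .+? so that an empty body is matched on its own, and joining with a newline is an arbitrary choice.

```php
<?php
/**
 * Hypothetical helper (not part of GLPI): joins the contents of every
 * <body> element found in $content. Returns $content unchanged when no
 * <body> element matches.
 */
function concat_body_contents(string $content): string
{
    $matches = [];
    // preg_match_all() returns the number of full matches, or false on failure.
    $count = preg_match_all('/<body[^>]*>\s*(?<body>.*?)\s*<\/body>/is', $content, $matches);
    if ($count === false || $count === 0) {
        return $content;
    }

    // Drop empty bodies, then join what is left.
    $bodies = array_filter($matches['body'], static function (string $body): bool {
        return $body !== '';
    });

    return implode("\n", $bodies);
}

// Made-up sample input with two body elements.
$html = '<body><p>Injected banner</p></body><body><p>Actual message</p></body>';

// preg_match() stops after the first match, so it reports 1 even here.
var_dump(preg_match('/<body[^>]*>/i', $html)); // int(1)

echo concat_body_contents($html), PHP_EOL;
```

This keeps the injected banner in the ticket content. Whether that is acceptable, or whether only one of the bodies should be kept, is a design decision this sketch leaves open.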

cedric-anne avatar May 26 '25 07:05 cedric-anne

This sounds more like an issue with the browser/email client extension.

I totally agree; a bug report should be sent to "PhishAlarm".

trasher avatar Jun 02 '25 08:06 trasher