zend-mail

Content-Type header handling

Open hugeval opened this issue 9 years ago • 7 comments

Hi

There are cases where emails carry the Content-Type header under the name "ContentType", for example:

ContentType: text/html; charset="UTF-8"

In this case the following error appears:

Invalid header line for Content-Type string

Looking at the header field-name normalization, I see that Content-Type, Content_Type, Content Type and ContentType are all normalized to contenttype. So contenttype is stored in Zend\Mail\Headers::$headersKeys and the corresponding header in Zend\Mail\Headers::$headers. Then, when the header is loaded, the pluginClassLoader loads Zend\Mail\Header\ContentType and Zend\Mail\Header\ContentType::fromString() is called. But that method has a condition that does not accept the 'contenttype' value.

This looks like inconsistent behavior. I understand that RFC 2045 accepts only the Content-Type keyword, but in real cases other keywords can appear.
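The mismatch described above can be illustrated with a minimal, self-contained sketch. It does not use Zend\Mail: `normalizeFieldName()` and `strictContentTypeFromString()` are hypothetical stand-ins, written on the assumption that the header bag lowercases the name and strips `-`, `_` and spaces, while `fromString()` accepts only the literal `Content-Type` name.

```php
<?php
// Hypothetical stand-ins for the behaviour described in this issue;
// these are NOT the actual Zend\Mail implementations.

// Assumed header-bag normalization: lowercase, then drop '-', '_' and spaces.
function normalizeFieldName(string $fieldName): string
{
    return str_replace(['-', '_', ' '], '', strtolower($fieldName));
}

// Assumed strict check: only the "Content-Type" field name is accepted.
function strictContentTypeFromString(string $headerLine): string
{
    [$name, $value] = explode(':', $headerLine, 2);
    if (strtolower(trim($name)) !== 'content-type') {
        throw new InvalidArgumentException('Invalid header line for Content-Type string');
    }
    return trim($value);
}

$line = 'ContentType: text/html; charset="UTF-8"';
$name = explode(':', $line, 2)[0];

// Both spellings collapse to the same key, so the bag picks the ContentType class...
$sameKey = normalizeFieldName($name) === normalizeFieldName('Content-Type'); // true

// ...but the strict check then rejects the non-hyphenated spelling.
try {
    strictContentTypeFromString($line);
    $rejected = false;
} catch (InvalidArgumentException $e) {
    $rejected = true; // this branch is taken
}
```

In this sketch the two steps disagree about which field names are acceptable, which is the inconsistency reported here.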

hugeval Aug 12 '16 01:08

Good catch :+1: Definitely looks to me like a strictness mismatch between the two classes: ContentType::fromString does not normalize the keyword while the Headers bag does. I'm not familiar enough with Zend\Mail to know which is the intended behaviour, but it should definitely be consistent. Technically, only Content-Type is valid and the others should be ignored, but our world is not that perfect :stuck_out_tongue:

I spot-checked a few other Headers classes for headers with hyphenated names and those also do not do normalization and so would have the same problem.

adamlundrigan Aug 12 '16 12:08

I'm not sure we should support the contentType header. I just searched through a corpus of about a million emails sent over the past 5 days and found only 5 emails containing the 'contentType' header. All were from the same server and domain, and every mail with the contentType header also had a Content-Type header (with an identical value).

I could search in a much larger corpus as well, but this server is handling email for about 100K domains, so I'd say it's fairly representative.

Freeaqingme Aug 12 '16 17:08

@Freeaqingme, I'm not sure, but it looks like mails sent using Microsoft Outlook have both Content-Type and ContentType headers.

hugeval Aug 12 '16 18:08

@Freeaqingme how about the other headers whose field names this class normalizes? Could we drop that normalization for 3.0? Right now it's just shifting the problem from the header bag into the individual header classes (which reject the invalid field names).

adamlundrigan Aug 19 '16 22:08

We are also experiencing this issue. Mail sent by Wrike also contains both Content-Type and Contenttype.

fasterforward Nov 08 '16 09:11

This is happening for headers other than Content-Type, FYI. I ran into this problem with Mime-Version coming in as MimeVersion. I implemented a fix in my fork; here's an example: https://github.com/ParticleBits/zend-mail/commit/96a8a4a3b0803a05ba1665e87e5a3a4a255ffcff

I don't know if this would be the preferred fix for this codebase, but if it is, it's a minor PR.
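One possible shape for such a fix is a lenient field-name comparison inside each header class. The sketch below is an assumption for illustration only: `fieldNameMatches()` and `lenientMimeVersionFromString()` are hypothetical names, and the code is not taken from the linked commit or from Zend\Mail.

```php
<?php
// Hypothetical helper, not part of Zend\Mail or the linked fork.
// Compares two field names case-insensitively, ignoring '-', '_' and spaces.
function fieldNameMatches(string $actual, string $expected): bool
{
    $normalize = static function (string $name): string {
        return str_replace(['-', '_', ' '], '', strtolower(trim($name)));
    };
    return $normalize($actual) === $normalize($expected);
}

// Lenient counterpart of a strict fromString()-style check for Mime-Version.
function lenientMimeVersionFromString(string $headerLine): string
{
    $parts = explode(':', $headerLine, 2);
    if (count($parts) !== 2 || ! fieldNameMatches($parts[0], 'Mime-Version')) {
        throw new InvalidArgumentException('Invalid header line for MIME-Version string');
    }
    return trim($parts[1]);
}

$version = lenientMimeVersionFromString('MimeVersion: 1.0'); // '1.0'
```

The trade-off is that a check like this accepts spellings the RFCs do not define, so whether to loosen the header classes or tighten the header bag remains the open question in this thread.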

mikegioia Feb 25 '17 18:02

This repository has been closed and moved to laminas/laminas-mail; a new issue has been opened at https://github.com/laminas/laminas-mail/issues/51.

michalbundyra Jan 15 '20 19:01