mail-parser
RFC8621 nonconformance
When parsing the example legacy/034.eml, the returned parts are:
html_body: [3, 4, 5], text_body: [2, 4, 5], attachments: [4, 5]
This seems at odds with RFC 8621:
o attachments: "EmailBodyPart[]" (immutable)
A list, traversing depth-first, of all parts in "bodyStructure"
that satisfy either of the following conditions:
* not of type "multipart/*" and not included in "textBody" or
"htmlBody"
* of type "image/*", "audio/*", or "video/*" and not in both
"textBody" and "htmlBody"
None of these parts include subParts, including "message/*" types.
Attached messages may be fetched using the "Email/parse" method
and the "blobId".
Note that a "text/html" body part [[HTML](https://datatracker.ietf.org/doc/html/rfc8621#ref-HTML)] may reference image parts
in attachments by using "cid:" links to reference the Content-Id,
as defined in [[RFC2392](https://datatracker.ietf.org/doc/html/rfc2392)], or by referencing the Content-Location.
Parts 4 and 5 are image/png and are listed in both text_body and html_body, so they fit neither of the criteria listed and should not appear in attachments.
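To make the mismatch concrete, here is a minimal Rust sketch of the two conditions quoted above, applied to the indices reported for legacy/034.eml. The `Part` struct and the function names are hypothetical and are not part of mail-parser's API. Only parts 4 and 5 are modelled, since those are the only ones whose type is stated here.

```rust
/// Hypothetical, simplified view of a body part: its index and MIME type.
struct Part {
    id: usize,
    content_type: &'static str,
}

/// True for "image/*", "audio/*", or "video/*".
fn is_media(content_type: &str) -> bool {
    content_type.starts_with("image/")
        || content_type.starts_with("audio/")
        || content_type.starts_with("video/")
}

/// The two RFC 8621 conditions for membership in `attachments`, as quoted above.
fn is_attachment(part: &Part, text_body: &[usize], html_body: &[usize]) -> bool {
    let in_text = text_body.contains(&part.id);
    let in_html = html_body.contains(&part.id);
    // Condition 1: not multipart/* and in neither textBody nor htmlBody.
    let cond1 = !part.content_type.starts_with("multipart/") && !in_text && !in_html;
    // Condition 2: image/audio/video and not in both textBody and htmlBody.
    let cond2 = is_media(part.content_type) && !(in_text && in_html);
    cond1 || cond2
}

/// Indices of the parts that the quoted rule would place in `attachments`.
fn expected_attachments(parts: &[Part], text_body: &[usize], html_body: &[usize]) -> Vec<usize> {
    parts
        .iter()
        .filter(|p| is_attachment(p, text_body, html_body))
        .map(|p| p.id)
        .collect()
}

fn main() {
    let parts = [
        Part { id: 4, content_type: "image/png" },
        Part { id: 5, content_type: "image/png" },
    ];
    let text_body = [2, 4, 5];
    let html_body = [3, 4, 5];
    // Both images are in both body lists, so neither condition holds.
    println!("{:?}", expected_attachments(&parts, &text_body, &html_body)); // prints []
}
```

Under this reading the rule yields no attachments among parts 4 and 5, whereas the parser returned [4, 5].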
I'm not sure why the RFC includes the second criterion, though. Prior to that, the criteria for `textBody` and `htmlBody` are:
o textBody: "EmailBodyPart[]" (immutable)
A list of "text/plain", "text/html", "image/*", "audio/*", and/or
"video/*" parts to display (sequentially) as the message body,
with a preference for "text/plain" when alternative versions are
available.
o htmlBody: "EmailBodyPart[]" (immutable)
A list of "text/plain", "text/html", "image/*", "audio/*", and/or
"video/*" parts to display (sequentially) as the message body,
with a preference for "text/html" when alternative versions are
available.
This seems to suggest that `image/*`, `audio/*` and `video/*` parts are always in both `textBody` and `htmlBody`, so the second condition for something being an attachment would always be false.