pyzmail

issue extracting attachments with improper boundary

Open • mlaferrera opened this issue 9 years ago • 1 comment

I recently ran across an email whose attachment could not be extracted using pyzmail. The issue appears to be with the parsing of boundaries and over-relying on them to extract content. Below is an example message from which the attachment is not extracted (note that the header declares the boundary b1_000001, while the attachment part is delimited by b1_000002).

To: [email protected]
Subject: Testing
Message-ID: <[email protected]>
Return-Path: [email protected]
Date: Tue, 06 Oct 2015 11:25:00 +0000
From: "testing" <[email protected]>
MIME-Version: 1.0
Content-Type: multipart/mixed; charset="UTF-8"; boundary="b1_000001"
Content-Transfer-Encoding: 8bit
Content-Disposition: inline

--b1_000001
Content-Type: multipart/alternative;
    boundary="b3_000001"

--b3_000001
Content-Type: text/plain; format=flowed; charset="UTF-8"
Content-Transfer-Encoding: 8bit

testing

--b3_000001
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: 8bit

<html>
<head>
</head>
<body>
testing
</body>
</html>

--b3_000001--
--b1_000002
Content-Type: application/octet-stream;
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
 filename="file.txt"

VGhpcyBpcyBhIHRlc3QgZmlsZS4K

--b1_000002--
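
To make the mismatch in the message above easier to spot, here is a minimal sketch that lists delimiter-looking lines whose token was never declared in a `boundary=` parameter. It is only a diagnostic heuristic, not pyzmail code: the function name `undeclared_delimiters`, both regular expressions, and the trimmed-down `SAMPLE` are assumptions made for illustration, and the patterns do not handle every legal boundary (for example, quoted boundaries containing spaces).

```python
import re

# Heuristic: matches boundary="token" or boundary=token in header text.
BOUNDARY_PARAM = re.compile(r'boundary="?([^";\s]+)"?', re.IGNORECASE)
# Heuristic: matches lines such as "--token" or "--token--".
DELIMITER_LINE = re.compile(r'^--(\S+?)(?:--)?\s*$')


def undeclared_delimiters(raw_message):
    """Return delimiter tokens that appear in the text but were never declared."""
    declared = set(BOUNDARY_PARAM.findall(raw_message))
    found = []
    for line in raw_message.splitlines():
        match = DELIMITER_LINE.match(line)
        if not match:
            continue
        token = match.group(1)
        if token not in declared and token not in found:
            found.append(token)
    return found


# Trimmed-down version of the message above (hypothetical test input).
SAMPLE = "\n".join([
    'MIME-Version: 1.0',
    'Content-Type: multipart/mixed; charset="UTF-8"; boundary="b1_000001"',
    '',
    '--b1_000001',
    'Content-Type: multipart/alternative;',
    '    boundary="b3_000001"',
    '',
    '--b3_000001',
    'Content-Type: text/plain; charset="UTF-8"',
    '',
    'testing',
    '',
    '--b3_000001--',
    '--b1_000002',
    'Content-Type: application/octet-stream;',
    'Content-Transfer-Encoding: base64',
    'Content-Disposition: attachment;',
    ' filename="file.txt"',
    '',
    'VGhpcyBpcyBhIHRlc3QgZmlsZS4K',
    '',
    '--b1_000002--',
])

print(undeclared_delimiters(SAMPLE))  # ['b1_000002']
```

A non-empty result suggests the sender used a delimiter that does not match the declared boundary, which is the situation in this message.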

mlaferrera, Oct 06 '15 19:10

I typically use Thunderbird as my baseline for answering the question "should this incorrectly formatted email parse correctly?" In this case, Thunderbird does not extract the file.txt attachment either.

Of course, we could test a dozen other email clients, but this is at least a single data point.
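
As one more possible data point, the same check can be run against Python's standard-library `email` parser. The sketch below is an assumption-laden illustration rather than a statement about pyzmail internals: `RAW` is a trimmed-down copy of the message from the report, and the variable names are made up for the example. It walks the parsed tree, collects any attachment filenames, and prints whatever defects the parser recorded on the top-level message.

```python
import email

# Trimmed-down copy of the reported message (hypothetical test input).
RAW = "\n".join([
    'MIME-Version: 1.0',
    'Content-Type: multipart/mixed; charset="UTF-8"; boundary="b1_000001"',
    '',
    '--b1_000001',
    'Content-Type: multipart/alternative;',
    '    boundary="b3_000001"',
    '',
    '--b3_000001',
    'Content-Type: text/plain; charset="UTF-8"',
    '',
    'testing',
    '',
    '--b3_000001--',
    '--b1_000002',
    'Content-Type: application/octet-stream;',
    'Content-Transfer-Encoding: base64',
    'Content-Disposition: attachment;',
    ' filename="file.txt"',
    '',
    'VGhpcyBpcyBhIHRlc3QgZmlsZS4K',
    '',
    '--b1_000002--',
])

msg = email.message_from_string(RAW)

# Content types of every part the parser recognised.
content_types = [part.get_content_type() for part in msg.walk()]
# Filenames of any parts that carry one.
filenames = [part.get_filename() for part in msg.walk() if part.get_filename()]
# Names of the defects recorded on the top-level message, if any.
defects = [type(defect).__name__ for defect in msg.defects]

print(content_types)
print(filenames)
print(defects)
```

Because `--b1_000002` never matches the declared boundary, I would expect `file.txt` to be missing from `filenames` here as well, which would line up with the Thunderbird result.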

giovino, Apr 06 '16 15:04