django-mailbox icon indicating copy to clipboard operation
django-mailbox copied to clipboard

MessageAttachment not created for message/rfc822 part despite of Content-Disposition: attachment

Open asitrax opened this issue 9 years ago • 2 comments

Incoming multipart email containing text/html, text/plain and message/rfc822 (with Content-Disposition set to attachment). I would expect to get the message/rfc822 part saved as a MessageAttachment and get an .eml file for it. What happens is instead that the message/rfc822 part gets parsed recursively and it's "merged" into the containing message.

I'm not sure this is the intended behavior or just a corner case. Would it be possible to have this behavior adjusted or configurable?

Below an attempt i made to fix it while trying to understand where the problem was. I'm not very comfortable with email standards so I would really be grateful if someone with more experience could review it and possibly include a fix in one of the next releases.

I pointed out with "EDIT" the changes I made (the first is just one line the second is an elif branch I added).

    def _get_dehydrated_message(self, msg, record):
        settings = utils.get_settings()

        new = EmailMessage()
        # EDIT: I added the Content-Disposition check to avoid the recursive parse of attached message/rfc822
        if msg.is_multipart() and not 'attachment' in msg.get('Content-Disposition', ''):
            for header, value in msg.items():
                new[header] = value
            for part in msg.get_payload():
                new.attach(
                    self._get_dehydrated_message(part, record)
                )
        elif (
            settings['strip_unallowed_mimetypes']
            and not msg.get_content_type() in settings['allowed_mimetypes']
        ):
            for header, value in msg.items():
                new[header] = value
            # Delete header, otherwise when attempting to  deserialize the
            # payload, it will be expecting a body for this.
            del new['Content-Transfer-Encoding']
            new[settings['altered_message_header']] = (
                'Stripped; Content type %s not allowed' % (
                    msg.get_content_type()
                )
            )
            new.set_payload('')
        # EDIT: added this elif branch (which is very similar to the attachment and could be refactored)
        elif msg.get_content_type() == 'message/rfc822' and 'attachment' in msg.get('Content-Disposition', ''):
            extension = ".eml"
            attachment = MessageAttachment()
            # can 'message/rfc822' have anything else than one single part? if so this is not safe and needs to be handled properly
            msg_part = msg.get_payload(0)

            attachment.document.save(
                uuid.uuid4().hex + extension,
                ContentFile(
                    six.BytesIO(
                        msg_part.as_string(unixfrom=False) # see django-mailbox issue #100
                    ).getvalue()
                )
            )
            attachment.message = record
            for key, value in msg.items():
                attachment[key] = value
            attachment.save()

            placeholder = EmailMessage()
            placeholder[
                settings['attachment_interpolation_header']
            ] = str(attachment.pk)
            new = placeholder
        elif (
            (
                msg.get_content_type() not in settings['text_stored_mimetypes']
            ) or
            ('attachment' in msg.get('Content-Disposition', ''))
        ):
            filename = None
            # ... original code ...

asitrax avatar Jun 17 '16 16:06 asitrax

Ooo, this is definitely a bug; could you post this as a pull request so I can more-easily review it? If you're not super familiar with how that works, I can help walk you through the process if you hop into the gitter.im channel.

Cheers!

coddingtonbear avatar Jun 17 '16 16:06 coddingtonbear

Thanks for the quick reply! Pull request submitted.

asitrax avatar Jun 20 '16 09:06 asitrax