mail.attachment

Open vmalguy opened this issue 7 years ago • 12 comments

I think there is an issue with mail.attachments: it doesn't return any attachment.

This is my test case in ipython3:

import mailparser

a="""Received: from EX3.local (172.16.2.3) by EX3.local (172.16.2.3) with
 Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.1415.2 via Mailbox
 Transport; Fri, 12 Jan 2018 12:48:13 +0100
Received: from [10.42.106.119] (10.10.254.33) by EX3.local (172.16.2.3)
 with Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.1415.2; Fri, 12
 Jan 2018 12:48:13 +0100
To: Vincent  <[email protected]>
From: Simon  <[email protected]>
Subject: test eml attachment
Message-ID: <[email protected]>
Date: Fri, 12 Jan 2018 12:46:42 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
 Thunderbird/52.4.0
Content-Type: multipart/mixed;
    boundary="------------55D5C868D71CE57DC948563F"
Content-Language: en-US
Return-Path: [email protected]
X-MS-Exchange-Organization-Network-Message-Id: 6a810f66-4b65-4a5f-da82-08d559b25c95
X-MS-Exchange-Organization-AuthSource: EX3.local
X-MS-Exchange-Organization-AuthAs: Internal
X-MS-Exchange-Organization-AuthMechanism: 07
X-Originating-IP: [10.10.254.33]
X-ClientProxiedBy: ex2.local (172.16.2.2) To EX3.local (172.16.2.3)
X-MS-Exchange-Transport-EndToEndLatency: 00:00:00.2112587
X-MS-Exchange-Processed-By-BccFoldering: 15.01.1415.002
MIME-Version: 1.0

--------------55D5C868D71CE57DC948563F
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

test


--------------55D5C868D71CE57DC948563F
Content-Type: message/rfc822; name="Attached Message"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="Attached Message"

Received: from ex2.local (172.16.2.2) by EX3.local (172.16.2.3) with
 Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.1415.2 via Mailbox
 Transport; Fri, 12 Jan 2018 12:47:07 +0100
Received: from EX3.local (172.16.2.3) by EX2.local (172.16.2.2) with
 Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.1415.2; Fri, 12
 Jan 2018 12:47:07 +0100
Received: from EX3.local ([fe80::7806:9f98:c0b4:db47]) by EX3.local
 ([fe80::7806:9f98:c0b4:db47%13]) with mapi id 15.01.1415.002; Fri, 12 Jan
 2018 12:47:07 +0100
From: Vincent  <[email protected]>
To: Simon  <[email protected]>
Subject: testing attachement
Thread-Topic: testing attachement
Thread-Index: AQHTi5sSajMVfq8xMUuQxsxmTSCh5A==
Date: Fri, 12 Jan 2018 12:47:07 +0100
Message-ID: <[email protected]>
Accept-Language: fr-FR, en-US
Content-Language: en-US
X-MS-Exchange-Organization-AuthAs: Internal
X-MS-Exchange-Organization-AuthMechanism: 04
X-MS-Exchange-Organization-AuthSource: EX3.local
X-MS-Has-Attach:
X-MS-Exchange-Organization-Network-Message-Id: 77b61742-2589-475b-1765-08d559b2355c
X-MS-Exchange-Organization-SCL: -1
X-MS-TNEF-Correlator:
X-MS-Exchange-Organization-RecordReviewCfmType: 0
x-mailer: Apple Mail (2.3445.5.20)
Content-Type: text/plain; charset="us-ascii"
Content-ID: <[email protected]>
MIME-Version: 1.0

testing attachement

--------------55D5C868D71CE57DC948563F--
"""
mail = mailparser.parse_from_string(a)
mail.message_as_string
mail.attachments

vmalguy avatar Jan 12 '18 13:01 vmalguy

Maybe because there isn't any attachment in your mail. Can you send me your raw mail?

fedelemantuano avatar Jan 12 '18 13:01 fedelemantuano

Hmm? What about these headers:

Content-Type: message/rfc822; name="Attached Message"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="Attached Message"

Isn't this considered an attachment?

vmalguy avatar Jan 12 '18 13:01 vmalguy

It's not an issue. Your mail contains a part with Content-Type: message/rfc822:

The Message/rfc822 (primary) subtype: a Content-Type of "message/rfc822" indicates that the body contains an encapsulated message, with the syntax of an RFC 822 message.

mailparser puts the text of the forwarded message in the body, separated from the rest by --- mail_boundary ---:

{
  "body": "test\n--- mail_boundary ---\ntesting attachement",
  "received": [
    {
      "from": "10.42.106.119 10.10.254.33",
      "delay": 0,
      "date_utc": "2018-01-12T11:48:13",
      "hop": 1,
      "date": "Fri, 12 Jan 2018 12:48:13 +0100",
      "with": "Microsoft SMTP Server version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256 id 15.1.1415.2",
      "by": "EX3.local 172.16.2.3"
    },
    {
      "from": "EX3.local 172.16.2.3",
      "delay": 0.0,
      "date_utc": "2018-01-12T11:48:13",
      "hop": 2,
      "date": "Fri, 12 Jan 2018 12:48:13 +0100",
      "with": "Microsoft SMTP Server version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256 id 15.1.1415.2 via Mailbox Transport",
      "by": "EX3.local 172.16.2.3"
    }
  ],
  "from": [
    [
      "Simon",
      "[email protected]"
    ]
  ],
  "to": [
    [
      "Vincent",
      "[email protected]"
    ]
  ],
  "date": "2018-01-12T11:46:42",
  "has_defects": false,
  "subject": "test eml attachment"
}
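As a small illustration of the output above, the combined body can be split back into its pieces on the separator string. This is only a sketch: the separator value is copied from the JSON shown here, and `split_body` is a hypothetical helper, not part of mailparser.

```python
# Separator as it appears in the "body" field above (taken from the output,
# not from the mailparser API).
MAIL_BOUNDARY = "--- mail_boundary ---"

# The "body" value from the JSON above.
body = "test\n--- mail_boundary ---\ntesting attachement"


def split_body(text, separator=MAIL_BOUNDARY):
    """Hypothetical helper: split a combined body into its individual texts."""
    return [chunk.strip() for chunk in text.split(separator)]


parts = split_body(body)
print(parts)
```

Note that this only recovers the text of the forwarded message, not its headers.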

If you change the Content-Type to image/jpeg, you will get:

"attachments": [
   {
     "binary": false,
     "mail_content_type": "image/jpeg",
     "content_transfer_encoding": "7bit",
     "payload": "Received: from ex2.local (172.16.2.2) by EX3.local (172.16.2.3) with\n Microsoft SMTP Server (version=TLS1_2,\n cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.1415.2 via Mailbox\n Transport; Fri, 12 Jan 2018 12:47:07 +0100\nReceived: from EX3.local (172.16.2.3) by EX2.local (172.16.2.2) with\n Microsoft SMTP Server (version=TLS1_2,\n cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.1415.2; Fri, 12\n Jan 2018 12:47:07 +0100\nReceived: from EX3.local ([fe80::7806:9f98:c0b4:db47]) by EX3.local\n ([fe80::7806:9f98:c0b4:db47%13]) with mapi id 15.01.1415.002; Fri, 12 Jan\n 2018 12:47:07 +0100\nFrom: Vincent  <[email protected]>\nTo: Simon  <[email protected]>\nSubject: testing attachement\nThread-Topic: testing attachement\nThread-Index: AQHTi5sSajMVfq8xMUuQxsxmTSCh5A==\nDate: Fri, 12 Jan 2018 12:47:07 +0100\nMessage-ID: <[email protected]>\nAccept-Language: fr-FR, en-US\nContent-Language: en-US\nX-MS-Exchange-Organization-AuthAs: Internal\nX-MS-Exchange-Organization-AuthMechanism: 04\nX-MS-Exchange-Organization-AuthSource: EX3.local\nX-MS-Has-Attach:\nX-MS-Exchange-Organization-Network-Message-Id: 77b61742-2589-475b-1765-08d559b2355c\nX-MS-Exchange-Organization-SCL: -1\nX-MS-TNEF-Correlator:\nX-MS-Exchange-Organization-RecordReviewCfmType: 0\nx-mailer: Apple Mail (2.3445.5.20)\nContent-Type: text/plain; charset=\"us-ascii\"\nContent-ID: <[email protected]>\nMIME-Version: 1.0\n\ntesting attachement",
     "filename": "Attached Message"
   }
 ],

fedelemantuano avatar Jan 12 '18 14:01 fedelemantuano

OK, I will use mail.message_as_string.find(what_to_find). It seems to work, thanks.

vmalguy avatar Jan 13 '18 08:01 vmalguy

Can this be made optional? We receive forwarded "suspicious" mail from users, and I need to keep the attached email as an attachment. I don't want the forwarded data to be mixed with the original message, and I need to process each of them separately. At the moment I've created a simple fork (back in 3.11, but fortunately easy to keep in sync), but having this as an option would be easier to manage in the future.
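For reference, the separation described here can be sketched with only the standard library `email` package, independently of mailparser. This is not how the fork works; `extract_attached_messages`, the `RAW` sample, and the example.com addresses are all made up for illustration.

```python
import email
from email import policy

# Hypothetical, shortened sample: an outer mail with one message/rfc822 part.
RAW = """\
From: simon@example.com
To: vincent@example.com
Subject: test eml attachment
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="BOUND"

--BOUND
Content-Type: text/plain; charset="utf-8"

test

--BOUND
Content-Type: message/rfc822; name="Attached Message"
Content-Disposition: attachment; filename="Attached Message"

From: vincent@example.com
To: simon@example.com
Subject: testing attachement
Content-Type: text/plain; charset="us-ascii"

testing attachement

--BOUND--
"""


def extract_attached_messages(raw):
    """Hypothetical helper: return the messages encapsulated in message/rfc822 parts."""
    msg = email.message_from_string(raw, policy=policy.default)
    attached = []
    for part in msg.walk():
        if part.get_content_type() == "message/rfc822":
            # The payload of a message/rfc822 part is a one-element list
            # holding the encapsulated message object.
            attached.append(part.get_payload(0))
    return attached


attached = extract_attached_messages(RAW)
for inner in attached:
    print(inner["Subject"])
```

Each returned object is a full message with its own headers, so it can be serialized with `as_string()` and parsed again on its own.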

dadokkio avatar Mar 01 '21 08:03 dadokkio

Hi @dadokkio, can you submit a PR?

fedelemantuano avatar Mar 01 '21 09:03 fedelemantuano

At the moment there is no option in my code: all rfc822 parts are stored as attachments. Do you also want the option added to my code, or is just the pull request OK?

dadokkio avatar Mar 01 '21 09:03 dadokkio

Try to make a PR, so we can work on it.

fedelemantuano avatar May 24 '21 09:05 fedelemantuano

Pull request #89 is done. In our use case we extract the attached emails with write_attachments and then start a new parsing process on them.

dadokkio avatar May 24 '21 13:05 dadokkio

I had similar issues parsing emails that contain a multipart attachment (.eml): the attachments were omitted. My fix is in #102.

wszostak avatar Nov 29 '21 16:11 wszostak

What is the status of the two PRs, #89 and #102?

sgeulette avatar Jun 16 '22 08:06 sgeulette

Hi, sorry for the late reply, I haven't had any time. I expect to work on it in the next few weeks.

fedelemantuano avatar Jun 19 '22 19:06 fedelemantuano