enmime
Should enmime detect HTML format and convert to text in this case?
Dear all,
I recently received an email in the following format: the main content type is `multipart/mixed`, and one of the body parts is `text/plain` with `Content-Transfer-Encoding: quoted-printable`, yet it contains HTML code and a base64-encoded image. enmime doesn't convert the HTML to text, and the encoded image is included in the result too.
Question: should enmime detect the HTML format and convert it to text in such a case?
```
MIME-Version: 1.0
... [omit other normal email headers here] ...
Content-Type: multipart/mixed;
boundary="----=_Part_148993_809028477.1658014818211"
------=_Part_148992_763979638.1658014818211
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset=UTF-8
<div style=3D"caret-color: rgba(0, 0, 0, 0.847); color: rg=
ba(0, 0, 0, 0.847); font-size: 12px;"><img src=3D"data:image/jpeg;base64,/9=
j/4AAQSkZJRgABAQAASABIAAD/4QBYRXhpZgAATU0AKgAAAAgAAgESAAMAAAABAAEAAIdpAAQAA=
... [omit long base64 lines] ...
...>...
```
Currently enmime trusts the Content-Type when it comes to text/plain vs text/html, so this is working as expected. I'm not necessarily opposed to trying to detect and fix the content type, although it should be guarded behind an option. I suspect spammers/phishers may try to bypass filters with this method, so it could pose some danger to the end user.
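For illustration only, here is a rough sketch of what opt-in detection could look like on the caller's side, using just the Go standard library. The `Options` struct, its `SniffHTMLInPlainText` field, and the `decodeQP` and `effectiveType` helpers are hypothetical names made up for this example; they are not part of enmime's API. The sniffing relies on `http.DetectContentType`, which only inspects the first 512 bytes, so a body that starts with ordinary text and switches to HTML later would still be treated as plain text.

```go
package main

import (
	"fmt"
	"io"
	"mime/quotedprintable"
	"net/http"
	"strings"
)

// Options is a hypothetical settings struct (not an enmime type).
// Sniffing is off by default, so the declared Content-Type is trusted.
type Options struct {
	SniffHTMLInPlainText bool
}

// decodeQP decodes a quoted-printable body into a string.
func decodeQP(body string) (string, error) {
	b, err := io.ReadAll(quotedprintable.NewReader(strings.NewReader(body)))
	return string(b), err
}

// effectiveType returns the media type the part should be handled as.
// It overrides text/plain only when sniffing is enabled and the decoded
// content starts with something that looks like HTML.
func effectiveType(declared, decoded string, opts Options) string {
	if !opts.SniffHTMLInPlainText || declared != "text/plain" {
		return declared
	}
	if strings.HasPrefix(http.DetectContentType([]byte(decoded)), "text/html") {
		return "text/html"
	}
	return declared
}

func main() {
	// Shortened stand-in for the part body in the report above.
	raw := "<div style=3D\"color: rg=\r\nba(0, 0, 0, 0.847); font-size: 12px;\">Hello</div>"

	decoded, err := decodeQP(raw)
	if err != nil {
		panic(err)
	}

	fmt.Println(effectiveType("text/plain", decoded, Options{}))
	fmt.Println(effectiveType("text/plain", decoded, Options{SniffHTMLInPlainText: true}))
}
```

With the option left at its zero value the declared `text/plain` is kept; with it enabled, the decoded body is reported as `text/html`, and the caller can then decide whether to run it through an HTML-to-text step.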
hi @jhillyerd
Thanks for the reply.
Since email is so easy to forge, it might be reasonable for a mail "client" library to do some extra work beyond what the RFCs require. :)