outlook-message-parser

Incorrect MimeType application/excel

Open programmiererei opened this issue 1 year ago • 5 comments

If an e-mail contains an XLSX Excel file as an attachment, this is returned with the mimetype “application/excel”. However, the correct mimetype would be “application/vnd.openxmlformats-officedocument.spreadsheetml.sheet”. “application/excel” is also not a valid mimetype: see https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types/Common_types

I use omsg.fetchTrueAttachments(), and the attachment object contains: [image]

It is strange that the longFilename is correct, but the filename has the wrong extension 'xls'.

programmiererei avatar Sep 24 '24 11:09 programmiererei

Hmm, I'll have to dive into this. There is a mimetype mapping in the project, so it's probably mapped wrong.

bbottema avatar Sep 24 '24 14:09 bbottema

Thank you for your answer. I could not find the problem in your code. I have implemented a workaround in our code: a simple search and replace.

Best regards,

Ralf

programmiererei avatar Sep 24 '24 18:09 programmiererei

I think the problem is the old 8.3 short filename stored in the filename property, where xlsx is truncated to xls. In OutlookFileAttachment.java, the method checkMimeTag checks filename first and only then longFilename.
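To illustrate the idea, here is a minimal sketch of picking the extension from the long filename first and falling back to the 8.3 name only when needed. It is not the library's actual checkMimeTag code: the class name, the method names, and the sample file names are hypothetical.

```java
import java.util.Locale;

// Hypothetical helper, not part of outlook-message-parser.
final class AttachmentExtensionSketch {

    // Returns the lower-cased extension without the dot, or null if there is none.
    static String extensionOf(String name) {
        if (name == null) {
            return null;
        }
        int dot = name.lastIndexOf('.');
        if (dot < 0 || dot == name.length() - 1) {
            return null;
        }
        return name.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    // Prefers the long filename, because the 8.3 short name truncates
    // extensions to three characters (xlsx becomes xls).
    static String preferredExtension(String filename, String longFilename) {
        String extension = extensionOf(longFilename);
        return extension != null ? extension : extensionOf(filename);
    }

    public static void main(String[] args) {
        // Sample names are made up for illustration.
        System.out.println(preferredExtension("REPORT~1.XLS", "report 2024.xlsx")); // prints "xlsx"
        System.out.println(preferredExtension("REPORT~1.XLS", null));               // prints "xls"
    }
}
```

With this order, an XLSX attachment would be looked up under "xlsx" regardless of what the short name says.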

programmiererei avatar Sep 25 '24 12:09 programmiererei

I'm not sure why xlsx would be abbreviated to xls (I don't think this library itself does that), but the extension is mapped to a mimetype as follows:

application/x-msexcel xlw xla xls
application/x-excel xld xlt xlw xlv xlk xlm xll xla xlc xls xlb
application/vnd.ms-excel xlsx xlw xlsm xlm xll xlc xls XLS xlb
application/excel xld xlt xlw xlv xl xlk xlm xll xla xlc xls xlb

Looking deeper into the Angus Activation package, the mapping is backed by a hashtable, so I can only assume the key-value combinations above are inverted (extensions as keys) in top-down order, meaning the last mimetype to declare a given extension wins. That would explain the mimetype you saw for the extension "xls". Why this doesn't cover extensions longer than three characters I have no idea, but apparently that's not the issue here, since you mention the extension gets abbreviated before the mapping happens.
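The assumed "last declaration wins" behaviour can be sketched as follows. This is not Angus Activation's code, only a plain HashMap inversion of the four mapping lines quoted above; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the assumed inversion, not Angus Activation's implementation.
final class MimeMapInversionSketch {

    // Turns "mimetype ext ext ..." lines into extension -> mimetype.
    // A later line overwrites an earlier one that declared the same extension.
    static Map<String, String> invert(String mimeTypesText) {
        Map<String, String> byExtension = new HashMap<>();
        for (String line : mimeTypesText.split("\n")) {
            String[] tokens = line.trim().split("\\s+");
            if (tokens.length < 2) {
                continue; // blank line or a mimetype without extensions
            }
            for (int i = 1; i < tokens.length; i++) {
                byExtension.put(tokens[i], tokens[0]);
            }
        }
        return byExtension;
    }

    public static void main(String[] args) {
        String mapping = String.join("\n",
                "application/x-msexcel xlw xla xls",
                "application/x-excel xld xlt xlw xlv xlk xlm xll xla xlc xls xlb",
                "application/vnd.ms-excel xlsx xlw xlsm xlm xll xlc xls XLS xlb",
                "application/excel xld xlt xlw xlv xl xlk xlm xll xla xlc xls xlb");
        Map<String, String> byExtension = invert(mapping);
        System.out.println(byExtension.get("xls"));  // prints "application/excel"
        System.out.println(byExtension.get("xlsx")); // prints "application/vnd.ms-excel"
    }
}
```

Under this assumption "xls" ends up as application/excel simply because that line comes last, which matches the reported result.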

I suggest we examine the order of mimetypes, deduplicate extensions and apply the correct mimetypes. Care to propose one?

Here's my proposal:

application/vnd.openxmlformats-officedocument.spreadsheetml.sheet xlsx
application/vnd.ms-excel xls xlk xl
application/vnd.ms-excel.sheet.macroenabled.12 xlsm
application/vnd.ms-excel.template.macroenabled.12 xltm
application/vnd.ms-excel.addin.macroenabled.12 xlam
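A quick way to check a proposed mapping for duplicated extensions is sketched below. The class and method names are hypothetical and the check is independent of the library; it only parses "mimetype ext ext ..." lines and reports every extension claimed by more than one line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for reviewing a mime.types style mapping.
final class MimeMappingDuplicateCheck {

    // Returns extension -> list of mimetypes, only for extensions declared more than once.
    static Map<String, List<String>> duplicates(String mimeTypesText) {
        Map<String, List<String>> typesByExtension = new TreeMap<>();
        for (String line : mimeTypesText.split("\n")) {
            String[] tokens = line.trim().split("\\s+");
            for (int i = 1; i < tokens.length; i++) {
                typesByExtension.computeIfAbsent(tokens[i], k -> new ArrayList<>()).add(tokens[0]);
            }
        }
        // Keep only the extensions that appear on two or more lines.
        typesByExtension.values().removeIf(types -> types.size() < 2);
        return typesByExtension;
    }

    public static void main(String[] args) {
        String proposal = String.join("\n",
                "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet xlsx",
                "application/vnd.ms-excel xls xlk xl",
                "application/vnd.ms-excel.sheet.macroenabled.12 xlsm",
                "application/vnd.ms-excel.template.macroenabled.12 xltm",
                "application/vnd.ms-excel.addin.macroenabled.12 xlam");
        System.out.println(duplicates(proposal).isEmpty()); // prints "true"
    }
}
```

Because each extension in the proposal appears exactly once, the result no longer depends on the order of the lines.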

bbottema avatar Sep 25 '24 19:09 bbottema

Also, it would be helpful if you could provide a .msg file with a problematic Excel sheet, so I can include it in my JUnit tests.

bbottema avatar Sep 25 '24 19:09 bbottema