Incorrect MimeType application/excel
If an e-mail contains an XLSX Excel file as an attachment, the attachment is returned with the mimetype “application/excel”. The correct mimetype would be “application/vnd.openxmlformats-officedocument.spreadsheetml.sheet”. Moreover, “application/excel” is not a valid mimetype at all: see https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types/Common_types
I use omsg.fetchTrueAttachments(), and the attachment object contains:
It is strange that the longFilename is correct, but the filename has the wrong extension 'xls'.
Hmm, I'll have to dive into this. There is a mimetype mapping in the project, so it's probably mapped wrong.
Thank you for your answer. I could not find the problem in your code. I have implemented a workaround in our code: a simple search and replace.
Best regards,
Ralf
I think the problem is the old 8.3 filename stored in the filename property, where xlsx is truncated to xls. In OutlookFileAttachment.java, the method checkMimeTag checks filename first and only then longFilename.
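To illustrate the ordering problem described above, here is a minimal sketch that takes the extension from the long filename first and falls back to the 8.3 filename only when the long one has none. This is not the library's checkMimeTag implementation; the class name, method names and sample filenames are hypothetical.

```java
import java.util.Locale;

// Hypothetical helper, not part of outlook-message-parser.
class AttachmentExtension {

    // Returns the lower-cased extension without the dot, or null if there is none.
    static String extensionOf(String name) {
        if (name == null) {
            return null;
        }
        int dot = name.lastIndexOf('.');
        if (dot < 0 || dot == name.length() - 1) {
            return null;
        }
        return name.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    // Prefers the long filename; the 8.3 filename may carry a truncated extension.
    static String preferredExtension(String longFilename, String filename) {
        String ext = extensionOf(longFilename);
        return ext != null ? ext : extensionOf(filename);
    }
}
```

With a long filename such as "Report 2024.xlsx" and an 8.3 name such as "REPORT~1.xls", this sketch would pick "xlsx" rather than "xls".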
I'm not sure why xlsx would be abbreviated to xls (I don't think this library itself does that), but the extension is mapped to a mimetype as follows:
application/x-msexcel xlw xla xls
application/x-excel xld xlt xlw xlv xlk xlm xll xla xlc xls xlb
application/vnd.ms-excel xlsx xlw xlsm xlm xll xlc xls XLS xlb
application/excel xld xlt xlw xlv xl xlk xlm xll xla xlc xls xlb
Looking deeper into the Angus Activation package, I see that the mapping is based on a hashtable. I can only assume that the key-value combinations above are inverted (extensions become the keys) in top-down order, meaning the last mimetype to declare a given extension wins. That would explain the mimetype you saw for the extension "xls". Why this doesn't cover extensions longer than three characters I have no idea, but apparently that doesn't matter here, since the abbreviation you mentioned causes a mapping to happen anyway.
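The assumed "last declaration wins" behaviour can be sketched as follows. This is only an illustration of the assumption, not the actual Angus Activation code; the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of inverting "mimetype ext ext ..." lines.
class LastWinsMimeMap {

    // Builds extension -> mimetype; a later line overwrites an earlier one.
    static Map<String, String> invert(String... lines) {
        Map<String, String> byExtension = new LinkedHashMap<>();
        for (String line : lines) {
            String[] tokens = line.trim().split("\\s+");
            // tokens[0] is the mimetype, the rest are its extensions
            for (int i = 1; i < tokens.length; i++) {
                byExtension.put(tokens[i], tokens[0]);
            }
        }
        return byExtension;
    }
}
```

Fed the four mapping lines in their listed order, this sketch resolves "xls" to application/excel, because that line is the last one to declare the extension.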
I suggest we examine the order of mimetypes, deduplicate extensions and apply the correct mimetypes. Care to propose one?
Here's my proposal:
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet xlsx
application/vnd.ms-excel xls xlk xl
application/vnd.ms-excel.sheet.macroenabled.12 xlsm
application/vnd.ms-excel.template.macroenabled.12 xltm
application/vnd.ms-excel.addin.macroenabled.12 xlam
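A quick way to check that a mapping like this declares each extension only once is sketched below. The class name is hypothetical and the check is not part of the project; it simply reports every extension that appears under more than one mimetype.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical check for extensions declared by more than one mimetype.
class DuplicateExtensions {

    static Set<String> find(String... lines) {
        Map<String, Set<String>> typesByExtension = new TreeMap<>();
        for (String line : lines) {
            String[] tokens = line.trim().split("\\s+");
            for (int i = 1; i < tokens.length; i++) {
                typesByExtension
                    .computeIfAbsent(tokens[i], k -> new TreeSet<>())
                    .add(tokens[0]);
            }
        }
        Set<String> duplicates = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : typesByExtension.entrySet()) {
            if (e.getValue().size() > 1) {
                duplicates.add(e.getKey());
            }
        }
        return duplicates;
    }
}
```

Run against the five proposed lines, the result should be empty; run against the existing mapping, it would list "xls" among others.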
Also, it would be helpful if you could provide a .msg file with a problematic Excel sheet, so I can include it in my JUnit tests.