bug/partition_signed_emails

Open anotherthomas opened this issue 10 months ago • 0 comments

testcase.txt

Describe the bug
I'm trying to partition emails. In some cases, processing fails with KeyError: 'multipart/mixed'.

>>> from unstructured.partition.email import partition_email
>>> partition_email("testcase.txt")

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/app/unstructured/partition/email.py", line 73, in partition_email
    return list(_EmailPartitioner.iter_elements(ctx=ctx))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/unstructured/partition/email.py", line 333, in _iter_elements
    yield from _AttachmentPartitioner.iter_elements(attachment, self._ctx)
  File "/app/unstructured/partition/email.py", line 388, in _iter_elements
    file = io.BytesIO(self._file_bytes)
                      ^^^^^^^^^^^^^^^^
  File "/app/unstructured/utils.py", line 154, in __get__
    value = self._fget(obj)
            ^^^^^^^^^^^^^^^
  File "/app/unstructured/partition/email.py", line 423, in _file_bytes
    content = self._attachment.get_content()
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/email/message.py", line 1124, in get_content
    return content_manager.get_content(self, *args, **kw)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/email/contentmanager.py", line 25, in get_content
    raise KeyError(content_type)
KeyError: 'multipart/mixed'

To Reproduce
The easiest way for me to reproduce this is to partition an email that is PGP-signed but not encrypted. I have attached an anonymized example (testcase.txt).
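
For anyone without the attachment, the same KeyError can be triggered with the standard library alone. This is a minimal sketch, not the attached testcase.txt: the message below is a hypothetical hand-written stand-in, and it assumes the failing part is reached the way `EmailMessage.iter_attachments()` reaches it, which I have not confirmed against the unstructured source.

```python
from email import message_from_bytes, policy

# Hypothetical stand-in for the attached testcase.txt: a multipart/signed
# message whose signed payload is itself a multipart/mixed container.
RAW = b"""\
From: alice@example.com
To: bob@example.com
Subject: signed test
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256;
 protocol="application/pgp-signature"; boundary="sig"

--sig
Content-Type: multipart/mixed; boundary="mix"

--mix
Content-Type: text/plain; charset="utf-8"

Hello, this body is signed but not encrypted.

--mix--

--sig
Content-Type: application/pgp-signature; name="signature.asc"
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
(placeholder, not a real signature)
-----END PGP SIGNATURE-----

--sig--
"""

msg = message_from_bytes(RAW, policy=policy.default)


def try_get_content(part):
    """Return 'ok' if get_content() works, else a description of the KeyError."""
    try:
        part.get_content()
        return "ok"
    except KeyError as exc:
        return f"KeyError: {exc}"


# iter_attachments() yields the nested multipart/mixed container as if it
# were an attachment; the default content manager has no handler for it.
results = {
    part.get_content_type(): try_get_content(part)
    for part in msg.iter_attachments()
}
print(results)
```

The signature part is read without trouble; only the multipart/mixed container raises, which matches the last frames of the traceback above.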

Expected behavior
I expect the email to be parsed and partitioned into its parts.
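
To illustrate what I mean by "its parts", here is a rough standard-library sketch that descends into nested multipart containers and only reads the leaf parts. `iter_leaf_parts` is a hypothetical helper name of mine, not something from unstructured, and the sample message is made up; I am not suggesting this is how the fix should be implemented.

```python
from email import message_from_bytes, policy

# Hypothetical PGP-signed (not encrypted) message, hand-written for this sketch.
RAW = b"""\
From: alice@example.com
To: bob@example.com
Subject: signed test
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256;
 protocol="application/pgp-signature"; boundary="sig"

--sig
Content-Type: multipart/mixed; boundary="mix"

--mix
Content-Type: text/plain; charset="utf-8"

Hello, this body is signed but not encrypted.

--mix--

--sig
Content-Type: application/pgp-signature; name="signature.asc"
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
(placeholder, not a real signature)
-----END PGP SIGNATURE-----

--sig--
"""


def iter_leaf_parts(msg):
    """Yield only non-container parts, descending into nested multiparts."""
    for part in msg.walk():
        # Containers (multipart/*) have no content of their own to read.
        if not part.is_multipart():
            yield part


msg = message_from_bytes(RAW, policy=policy.default)

# get_content() is only called on leaves, so no multipart/* KeyError occurs.
leaves = [(p.get_content_type(), p.get_content()) for p in iter_leaf_parts(msg)]
for content_type, content in leaves:
    print(content_type, type(content).__name__)
```

With this message the walk reaches the text/plain body and the signature, and never calls `get_content()` on a container.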

Environment Info
Docker image downloads.unstructured.io/unstructured-io/unstructured:0.16.20; also reproduced with :0.15.14 and :latest.

anotherthomas, Feb 14 '25 16:02