
Not handling email messages as attachments

Open gwiedeman opened this issue 2 years ago • 2 comments

Describe the bug

Currently both the MSG and PST parsers fail if messages have other messages as attachments. Not yet sure how the EML and MBOX parsers handle this. This could be tricky since, in theory, message attachments can be nested to an arbitrary depth, right?
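For the nesting concern, here is a minimal sketch of walking attached messages with a depth cap. It uses only Python's standard `email` package, not mailbagit's own parsers, so it may only be relevant to the EML/MBOX side. `walk_attachments` and `max_depth` are hypothetical names made up for illustration.

```python
from email.message import EmailMessage


def walk_attachments(msg, depth=0, max_depth=10):
    """Yield (depth, filename, content_type) for every attachment.

    Descends into attached messages (message/rfc822) but stops
    recursing once max_depth is reached, so deeply nested input
    cannot recurse without bound.
    """
    for part in msg.iter_attachments():
        yield depth, part.get_filename(), part.get_content_type()
        if part.get_content_type() == "message/rfc822" and depth < max_depth:
            # For message/rfc822 parts, get_content() returns the nested message object.
            inner = part.get_content()
            yield from walk_attachments(inner, depth + 1, max_depth)


# Build a message that carries another message as an attachment.
inner = EmailMessage()
inner["Subject"] = "inner"
inner.set_content("inner body")
inner.add_attachment(
    b"data", maintype="application", subtype="octet-stream", filename="report.bin"
)

outer = EmailMessage()
outer["Subject"] = "outer"
outer.set_content("outer body")
outer.add_attachment(inner)  # stored as a message/rfc822 part

results = list(walk_attachments(outer))
for depth, filename, content_type in results:
    print("  " * depth, filename, content_type)
```

Note that the attached message itself has no filename here (`get_filename()` returns `None`), so any code that consumes these entries needs a fallback name.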

gwiedeman, Apr 28 '22 21:04

This also causes an error when writing WARC derivatives, since the WARC module expects a binary when adding attachments to WARC files. The WARC is still created with a body, but the error stops mailbagit from writing any further attachments to it. So if there are four attachments and the third is an attached email, the first two will be written to the WARC, but the fourth will not.

ERROR: No filename found for attachment, integer will be used instead.
**************************************************************************
ERROR: Error adding attachments to WARC derivative: TypeError('quote_from_bytes() expected bytes')
**************************************************************************
Traceback (most recent call last):
  File "c:\users\gw234478\projects\mailbagit\mailbagit\derivatives\warc.py", line 172, in do_task_per_message
    f"{warc_uri}/{quote_plus(attachment.Name)}",
  File "C:\Users\gw234478\AppData\Local\Programs\Python\Python39\lib\urllib\parse.py", line 887, in quote_plus
    string = quote(string, safe + space, encoding, errors)
  File "C:\Users\gw234478\AppData\Local\Programs\Python\Python39\lib\urllib\parse.py", line 871, in quote
    return quote_from_bytes(string, safe)
  File "C:\Users\gw234478\AppData\Local\Programs\Python\Python39\lib\urllib\parse.py", line 896, in quote_from_bytes
    raise TypeError("quote_from_bytes() expected bytes")
TypeError: quote_from_bytes() expected bytes
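The traceback shows `quote_plus` being handed something that is neither `str` nor `bytes`. Given the preceding "integer will be used instead" message, that may well be an integer, though this is a guess. The sketch below reproduces the `TypeError` and shows one possible guard. `safe_attachment_segment` and the `attachment-N` fallback name are hypothetical and not part of mailbagit.

```python
from urllib.parse import quote_plus


def safe_attachment_segment(name, index):
    """Return a URL-safe path segment for an attachment name.

    Falls back to a positional placeholder when the name is missing,
    and converts non-string names (e.g. integers) with str() before quoting.
    """
    if name is None or name == "" or name == b"":
        return f"attachment-{index}"
    if isinstance(name, (str, bytes)):
        return quote_plus(name)
    return quote_plus(str(name))


# Reproduce the failure: quote_plus only accepts str or bytes.
try:
    quote_plus(3)
    raised = False
except TypeError:
    raised = True

print(raised)
print(safe_attachment_segment(3, 0))
print(safe_attachment_segment("my file.msg", 1))
print(safe_attachment_segment(None, 2))
```

A guard like this would only stop the URI construction from raising. Writing the attached message's content into the WARC would presumably still require serializing it to bytes first.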

gwiedeman, Jun 15 '22 16:06

blocked by #198

gwiedeman, Jul 05 '22 21:07