mailbagit
Not handling email messages as attachments
Describe the bug
Currently both the MSG and PST parsers fail when a message has another message as an attachment. It is not yet clear how the EML and MBOX parsers handle this case. This could be tricky since, in theory, message attachments can be nested to any depth.
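To make the nesting concern concrete, here is a minimal sketch of a depth-limited walk over attached messages. It is not mailbagit's parser code: it uses only the standard-library `email` package, so at most it would apply to EML/MBOX-style input, and the function name `collect_attachments` and the `max_depth` parameter are hypothetical. It assumes the message was parsed with `policy.default`.

```python
from email import policy
from email.message import EmailMessage


def collect_attachments(msg, depth=0, max_depth=10):
    """Return (depth, filename, payload_bytes) for each attachment.

    Attached messages (message/rfc822) are recorded as bytes and then
    descended into, up to max_depth levels (hypothetical safety limit).
    """
    found = []
    for part in msg.iter_attachments():
        if part.get_content_type() == "message/rfc822":
            # Under policy.default this returns the nested EmailMessage
            nested = part.get_content()
            found.append((depth, part.get_filename(), nested.as_bytes()))
            if depth < max_depth:
                found.extend(collect_attachments(nested, depth + 1, max_depth))
        else:
            # decode=True undoes base64/quoted-printable and returns bytes
            found.append((depth, part.get_filename(), part.get_payload(decode=True)))
    return found
```

Capping the depth keeps a pathological chain of nested messages from recursing without bound, while the attached message itself is still returned as bytes so it can be stored like any other attachment.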
This also causes an error when writing WARC derivatives, since this module expects binary data when adding attachments to WARC files. The WARC is still created with a body, but the error stops mailbagit from writing any further attachments to it. So if there are four attachments and the third is an attached email, the first two are written to the WARC, but the fourth is not.
ERROR: No filename found for attachment, integer will be used instead.
**************************************************************************
ERROR: Error adding attachments to WARC derivative: TypeError('quote_from_bytes() expected bytes')
**************************************************************************
Traceback (most recent call last):
  File "c:\users\gw234478\projects\mailbagit\mailbagit\derivatives\warc.py", line 172, in do_task_per_message
    f"{warc_uri}/{quote_plus(attachment.Name)}",
  File "C:\Users\gw234478\AppData\Local\Programs\Python\Python39\lib\urllib\parse.py", line 887, in quote_plus
    string = quote(string, safe + space, encoding, errors)
  File "C:\Users\gw234478\AppData\Local\Programs\Python\Python39\lib\urllib\parse.py", line 871, in quote
    return quote_from_bytes(string, safe)
  File "C:\Users\gw234478\AppData\Local\Programs\Python\Python39\lib\urllib\parse.py", line 896, in quote_from_bytes
    raise TypeError("quote_from_bytes() expected bytes")
TypeError: quote_from_bytes() expected bytes
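The traceback is consistent with `attachment.Name` being neither `str` nor `bytes` when it reaches `quote_plus` (for example, the integer fallback mentioned in the first log line), since `urllib.parse.quote_plus` raises this `TypeError` for any other type. One possible workaround, sketched below, is to coerce the name to a string before quoting. The helper `attachment_uri` and its `index` fallback are hypothetical and not part of mailbagit.

```python
from urllib.parse import quote_plus


def attachment_uri(warc_uri, name, index):
    """Build a URI for an attachment record, tolerating non-string names.

    Hypothetical helper: falls back to the attachment's position when
    no usable name is available.
    """
    if isinstance(name, bytes):
        name = name.decode("utf-8", errors="replace")
    elif name is None or name == "":
        name = str(index)
    else:
        # Covers integers and any other non-string fallback value
        name = str(name)
    return f"{warc_uri}/{quote_plus(name)}"
```

This only addresses the URI-building step shown in the traceback; the attached message's content would still need to be converted to bytes before it could be written as a WARC record.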
blocked by #198