tnefparse

UnicodeDecodeError: 'gbk' codec can't decode byte 0xba in position 14: illegal multibyte sequence

Open aniude opened this issue 3 years ago • 2 comments

I ran tnefparse from the command line in debug mode: tnefparse winmail.dat -l DEBUG

INFO:tnefparse:Skipping checksum for performance
DEBUG:tnefparse:Attribute type: 0x001e
DEBUG:tnefparse:Attribute name: 0x1008 (MAPI_RTF_SYNC_BODY_TAG)
ERROR:tnefparse:decode_mapi exception: 'gbk' codec can't decode byte 0xba in position 14: illegal multibyte sequence
DEBUG:tnefparse:exception details:
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/tnefparse/mapi.py", line 99, in decode_mapi
    attr_data, offset = parse_property(data, offset, attr_name, attr_type, codepage, num_mv_properties)
  File "/usr/local/lib/python3.9/site-packages/tnefparse/mapi.py", line 155, in parse_property
    item = item.decode(codepage)
UnicodeDecodeError: 'gbk' codec can't decode byte 0xba in position 14: illegal multibyte sequence

It seems the wrong charset is being used for decoding. Is there any way to set the charset as a parameter?

aniude, Feb 17 '22 02:02

You can have different codecs nested in different parts of the document. We could probably add a universal override, but that would probably cause other problems.

What about returning the raw bytes rather than raising an exception when decoding fails?
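To make the suggestion concrete, here is a minimal sketch of that fallback. The helper name `decode_or_raw` is hypothetical and is not part of tnefparse; it only illustrates what the `item.decode(codepage)` call from the traceback could do instead of raising. The sample byte strings are made up for the demonstration.

```python
from typing import Union


def decode_or_raw(item: bytes, codepage: str) -> Union[str, bytes]:
    """Hypothetical helper (not tnefparse API): decode with the declared
    codepage, but hand back the untouched bytes if decoding fails."""
    try:
        return item.decode(codepage)
    except (UnicodeDecodeError, LookupError):
        # UnicodeDecodeError: the bytes are not valid in this codepage.
        # LookupError: Python does not know a codec by this name.
        return item


# Made-up sample data: one value that is valid GBK and one that is not.
good = "你好".encode("gbk")
bad = b"abc\xff\xff"

print(repr(decode_or_raw(good, "gbk")))
print(repr(decode_or_raw(bad, "gbk")))
```

The caller then has to handle a value that may be either `str` or `bytes`, which is the trade-off of this approach compared with a global charset override.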

jrideout, Mar 02 '22 21:03

@aniude are you able to share an example TNEF file that triggers this error?

jrideout, Mar 02 '22 21:03