imbox
message.get_payload(decode=True) is removing some special characters
In parser.py, line 125, the call content = message.get_payload(decode=True) is removing some special characters such as ç or é. It works fine with message.get_payload(decode=False), like this:
    content = message.get_payload(decode=False)
    charset = message.get_content_charset('utf-8')
    try:
        return content.decode(charset, 'ignore')
    except LookupError:
        # Unknown codec name: retry with the hyphens removed
        return content.decode(charset.replace("-", ""), 'ignore')
    except AttributeError:
        # content has no .decode method (e.g. it is already a str on Python 3)
        return content
Do you want a pull request, or would you prefer another solution?
What encoding does the e-mail you are having problems with use? Is it utf-8?
That code comes from:
https://github.com/martinrusev/imbox/commit/ba913fe31dd6146f9500583916d4332edce1c481
https://github.com/martinrusev/imbox/pull/78
Yes, I was receiving an e-mail in UTF-8. The e-mail was sent by another server (WooCommerce).
I have the same problem.
My e-mail declares charset=utf-8, but the body actually contains latin-1 encoded characters.
Example:
b'\xe4\xf6\xfc\xc4\xd6\xdc\xdf'
which should decode to this:
'äöüÄÖÜß'
Imbox reads the charset=utf-8 declaration from the raw message and uses it to decode the text. The latin-1 bytes are not valid UTF-8, so those characters are lost.
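The loss can be reproduced without imbox. This is only a minimal sketch using the bytes from the example above; the variable names are made up for illustration, and the 'ignore' error handler is assumed because that is what the snippet at the top of this thread passes to decode():

```python
# Bytes from the example above: latin-1 encoded text in a message labelled utf-8.
raw = b'\xe4\xf6\xfc\xc4\xd6\xdc\xdf'

# Decoding with the declared charset and 'ignore' silently drops
# every byte that is not valid UTF-8.
lossy = raw.decode('utf-8', 'ignore')

# Decoding with the charset the bytes were really written in keeps them.
correct = raw.decode('latin-1')

print(repr(lossy))    # ''
print(repr(correct))  # 'äöüÄÖÜß'
```

None of these seven bytes forms a valid UTF-8 sequence here, so the whole string disappears.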
As a hack, I changed line 129 in parser.py to the following code:
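    # Hack: if any latin-1 byte for äöüÄÖÜß appears, override the declared charset.
    # Caveat: some of these values (e.g. b'\xc4') also occur inside valid UTF-8 text.
    latinchars = [b'\xe4', b'\xf6', b'\xfc', b'\xc4', b'\xd6', b'\xdc', b'\xdf']
    if any(s in content for s in latinchars):
        charset = 'latin-1'
    else:
        charset = message.get_content_charset('utf-8')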
Other characters can be found here or in Python with 'ä'.encode('latin-1').
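A less hand-tuned alternative to the byte list might be to try the declared charset strictly and only fall back to latin-1 when that fails. This is just a sketch, not what imbox does: decode_with_fallback and its parameters are hypothetical names, and it assumes content is the bytes object returned by get_payload(decode=True).

```python
def decode_with_fallback(content, declared_charset='utf-8', fallback='latin-1'):
    """Hypothetical helper: trust the declared charset only if it decodes cleanly."""
    try:
        # Strict decoding raises instead of silently dropping bytes.
        return content.decode(declared_charset)
    except (UnicodeDecodeError, LookupError):
        # Wrong label or unknown codec name: latin-1 maps every byte
        # to a character, so this fallback cannot raise.
        return content.decode(fallback)


mislabelled = b'\xe4\xf6\xfc\xc4\xd6\xdc\xdf'   # latin-1 bytes labelled as utf-8
genuine = 'äöü'.encode('utf-8')                 # real UTF-8 bytes

print(decode_with_fallback(mislabelled, 'utf-8'))  # äöüÄÖÜß
print(decode_with_fallback(genuine, 'utf-8'))      # äöü
```

The trade-off is that the fallback is a guess: if the real encoding is neither the declared one nor latin-1, the result will be readable but wrong.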
Edit:
Just setting message.get_payload(decode=False) will lead to problems if the e-mail is actually encoded with utf-8.
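To illustrate one way this goes wrong, here is a minimal sketch with a made-up message (the raw bytes are an assumption for illustration, not taken from a real e-mail). The body is 'çé' in UTF-8 with a base64 transfer encoding, which decode=False leaves untouched:

```python
from email import message_from_bytes

# Made-up message: the body is 'çé' encoded as UTF-8, then base64.
raw = (
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n"
    b"w6fDqQ==\r\n"
)
msg = message_from_bytes(raw)

# decode=False returns the payload as stored, still base64 text.
undecoded = msg.get_payload(decode=False)

# decode=True undoes the transfer encoding and returns bytes,
# which then still have to be decoded with the charset.
decoded = msg.get_payload(decode=True)
text = decoded.decode(msg.get_content_charset('utf-8'))

print(undecoded.strip())  # w6fDqQ==
print(decoded)            # b'\xc3\xa7\xc3\xa9'
print(text)               # çé
```

So decode=True is still needed to undo the transfer encoding; the charset handling is a separate step after that.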
Another edit:
On my computer, Thunderbird sends latin-1 characters while declaring charset=utf-8.