mailparse
base64 decoding should not be "strict"
I've encountered yet another mail where the base64 decoding using mailparse (with and without my recent change in #95) fails.
The reduced sample looks like this:
PC9odG1sPn==
If you decode that string with a non-strict decoder (one that does not require "normalized" base64, i.e. zeroed trailing bits), the result is:
</html>
If you reencode that string with Python/Ruby/base64 on the CLI you'll get
PC9odG1sPg==
Which then decodes properly with the mailparse crate.
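To illustrate why the two strings differ only in their unused trailing bits, here is a minimal stdlib-only sketch of a lenient decoder. The helper names `sextet` and `decode_lenient` are hypothetical and are not part of mailparse or data_encoding; the sketch stops at the first `=` and does not validate padding, so it is not a full MIME decoder.

```rust
/// Map a standard base64 alphabet character to its 6-bit value.
fn sextet(c: u8) -> Option<u8> {
    match c {
        b'A'..=b'Z' => Some(c - b'A'),
        b'a'..=b'z' => Some(c - b'a' + 26),
        b'0'..=b'9' => Some(c - b'0' + 52),
        b'+' => Some(62),
        b'/' => Some(63),
        _ => None,
    }
}

/// Decode base64 leniently. Returns the decoded bytes plus a flag that is
/// true when the leftover (unused) trailing bits were non-zero, which is
/// the condition a strict decoder rejects.
fn decode_lenient(input: &str) -> Option<(Vec<u8>, bool)> {
    let mut out = Vec::new();
    let mut acc: u32 = 0; // bit accumulator
    let mut bits: u32 = 0; // number of valid bits currently in `acc`
    for &c in input.as_bytes() {
        if c == b'=' {
            break; // padding: stop decoding (padding length is not checked)
        }
        if c.is_ascii_whitespace() {
            continue; // tolerate line breaks inside the encoded body
        }
        acc = (acc << 6) | u32::from(sextet(c)?);
        bits += 6;
        if bits >= 8 {
            bits -= 8;
            out.push((acc >> bits) as u8);
            acc &= (1 << bits) - 1; // keep only the not-yet-consumed bits
        }
    }
    Some((out, acc != 0))
}
```

In the final group, `Pn` and `Pg` share the same leading 8 bits (the byte `>`); `n` merely carries non-zero bits in the 4 positions that are discarded, so the sketch returns `</html>` for both inputs and sets the flag only for `PC9odG1sPn==`.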
The way I am currently working around this (with the data_encoding crate) is by defining my own BASE64 decoder:
use lazy_static::lazy_static;

lazy_static! {
    static ref BASE64_DECODER: data_encoding::Encoding = {
        // Start from the MIME base64 specification and relax it.
        let mut spec = data_encoding::BASE64_MIME.specification();
        spec.check_trailing_bits = false; // <- the important bit
        spec.encoding().expect("The encoding must be valid")
    };
}
I've come to believe that parsing mail with "strict" base64 parsers is just not a good idea. It might work in an ideal world but sadly I've received tons of mails with edge cases over the years :(
My ask for this issue is that we should probably switch to a non-strict decoder for mails. This is perhaps something that is better suited as part of the data_encoding library instead?