haraka-plugin-mongodb Incorrect handling of `body_text_encoded` results in lots of NULL chars

As part of https://github.com/haraka/Haraka/pull/2737, which has been included in Haraka since 2.8.26, the type of Body.body_text_encoded unexpectedly changed from string to Buffer. Moreover, the new Buffer has a fixed size, so Buffer.slice() is required to get the actual contents.

Why a type was changed in a patch-level release is a question for the Haraka developers, but it is possible that they treat Body.body_text_encoded as an internal property.

Either way, this change causes the getBodyOfTypeFromChildren function to return a 64 KB string mostly filled with \u0000 chars whenever body_text_encoded is selected as the source.
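
To illustrate the symptom, here is a minimal sketch that does not use Haraka at all. It simulates a fixed-size 64 KB Buffer holding a short body and compares a naive toString() with a trimmed conversion. The helper name bodyTextToString and its usedLength parameter are hypothetical and exist only for this example; how Haraka tracks the number of bytes actually written is not shown here, so when no length is passed the sketch falls back to stripping trailing zero bytes. It uses Buffer.subarray(), which returns the same kind of view as Buffer.slice().

```javascript
// Hypothetical helper, not part of Haraka or haraka-plugin-mongodb.
// Accepts either the old string form or the new Buffer form.
function bodyTextToString(value, usedLength) {
  // Older Haraka versions: the property is already a string.
  if (typeof value === 'string') return value;
  if (!Buffer.isBuffer(value)) return '';

  let end;
  if (typeof usedLength === 'number') {
    // The caller knows how many bytes were written: cut exactly there.
    end = Math.min(usedLength, value.length);
  } else {
    // Assumption: unused space is zero-filled, so drop trailing NUL bytes.
    end = value.length;
    while (end > 0 && value[end - 1] === 0) end--;
  }
  return value.subarray(0, end).toString('utf8');
}

// Simulate a fixed-size 64 KB buffer that holds only a short body.
const padded = Buffer.alloc(64 * 1024);
const written = padded.write('Hello, world!\n', 0, 'utf8');

const naive = padded.toString('utf8');      // keeps all the padding
const trimmed = bodyTextToString(padded);   // padding stripped
const exact = bodyTextToString(padded, written);

console.log(naive.length, trimmed.length, exact.length); // 65536 14 14
```

Stripping trailing zero bytes is only a fallback: a body that legitimately ends in NUL bytes would be truncated, so passing the real written length is preferable when it is available.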

Jun 02 '22 19:06 FlyingDR

Hi,

We already fixed this, but since we process over 500,000 emails/day, we always hold back before making releases so that we can test everything properly.

That said, please test with the latest develop branch, as it contains the fixed version we have been using since the latest release. Let me know if that still fails. Thank you.

Jun 02 '22 19:06 thenitai

Hello,

Thank you for the fast response. I've tested with the current HEAD of the master (b2b8668576548d42a0decaca9b7304bf3e2b8b1f) and develop (d2057cc7df636c485aca8c015957142b6b2cdba8) branches, and the issue is still there.

Here are the results of processing a very simple email: the contents of the _email variable at this point of code. I added the .txt extension because GitHub does not allow JSON attachments.

test-d2057cc7.json.txt - version from d2057cc7df636c485aca8c015957142b6b2cdba8
test-fixed.json.txt - version from provided pull request

The difference is pretty obvious.

Jun 02 '22 19:06 FlyingDR