html message line breaks are not rendered

Open maralorn opened this issue 5 years ago • 5 comments

When using email2matrix with SkipMarkdown = false I get Matrix messages whose formatted_body field contains HTML, but with \n line breaks. At least riot doesn’t render those. Is it perhaps necessary to replace them with <br> tags? Or does this mean the email (from prometheus in my example) is not standards-compliant?

maralorn, Aug 12 '19 01:08

A message of "# Subject\n\nBody" seems to yield a formatted_body of:

<h1>Subject</h1>\n\n<p>Body</p>

You'd like the HTML in formatted_body to not have \n\n in it at all, but be a single line of HTML tags?

I think the above is pretty natural and also how a human would write the HTML. Humans typically don't put all their HTML tags on the same line. Whitespace (new lines) like this doesn't cause issues.

spantaleev, Aug 12 '19 03:08

Thanks for the reply.

No, the two newlines at the beginning of the formatted_body template are no problem at all.

The problem is that I got an HTML-formatted email which email2matrix simplified into the formatted body. The <p>Body</p> part now contains \n characters which are not rendered (by riot-web at least) but should be. e.g. I got an email containing a table which got simplified to all rows being in one line. When I set SkipMarkdown = true I get working line breaks in the message (because riot renders \n in the body as line breaks).

maralorn, Aug 12 '19 14:08

> e.g. I got an email containing a table which got simplified to all rows being in one line.

Ahh.. this is because of the way we do things.

Email2Matrix only deals with the plain-text part of the email message. We use the enmime library to help us with that.

As per its documentation:

The Envelope contains both the plain text and HTML portions of the email. If there was no plain text Part available, the HTML Part will be down-converted using the html2text library. The root of the Part tree, as well as slices of the inline and attachment Parts are also available.

I imagine that your email has no plain-text part, which causes its HTML to be down-converted to plain-text. Which probably yields an ugly "table" in any case.
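To check whether a given email really lacks a plain-text part, something like the following might help. It is only a rough sketch using the Go standard library, not enmime and not what Email2Matrix does internally; the function names hasPlainTextPart and partHasPlainText are made up for this example, and it would also count a text/plain attachment as a plain-text part.

```go
package main

import (
	"fmt"
	"io"
	"mime"
	"mime/multipart"
	"net/mail"
	"strings"
)

// hasPlainTextPart reports whether a raw email contains a text/plain part
// anywhere in its MIME tree. Hypothetical helper for illustration only.
func hasPlainTextPart(raw string) (bool, error) {
	msg, err := mail.ReadMessage(strings.NewReader(raw))
	if err != nil {
		return false, err
	}
	return partHasPlainText(msg.Header.Get("Content-Type"), msg.Body)
}

// partHasPlainText inspects one MIME entity and recurses into multipart bodies.
func partHasPlainText(contentType string, body io.Reader) (bool, error) {
	if contentType == "" {
		// Without a Content-Type header, MIME defaults to text/plain.
		return true, nil
	}
	mediaType, params, err := mime.ParseMediaType(contentType)
	if err != nil {
		return false, err
	}
	if mediaType == "text/plain" {
		return true, nil
	}
	if !strings.HasPrefix(mediaType, "multipart/") {
		return false, nil
	}
	mr := multipart.NewReader(body, params["boundary"])
	for {
		part, err := mr.NextPart()
		if err == io.EOF {
			return false, nil // walked every part, none was text/plain
		}
		if err != nil {
			return false, err
		}
		found, err := partHasPlainText(part.Header.Get("Content-Type"), part)
		if err != nil || found {
			return found, err
		}
	}
}

func main() {
	// Made-up HTML-only message, similar in shape to an alert email.
	htmlOnly := "Subject: alert\r\nContent-Type: text/html; charset=utf-8\r\n\r\n" +
		"<table><tr><td>row 1</td></tr><tr><td>row 2</td></tr></table>\r\n"
	found, err := hasPlainTextPart(htmlOnly)
	fmt.Println(found, err)
}
```

If this reports false for the message in question, the plain text that reaches Email2Matrix was most likely produced by the HTML down-conversion described above.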

Depending on SkipMarkdown, different things may happen:

  • if SkipMarkdown = true, we just send that plain-text body pretty much as-is, leaving riot-web (or whatever client you use) to interpret it as it wishes, with "\n" being a line break
  • if SkipMarkdown = false, we pass the message through the blackfriday library, which likely does something weird for you
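The two branches can be sketched roughly as below. This is an illustration, not Email2Matrix's actual code: the messageContent type, the buildContent function and the render parameter (a stand-in for the Markdown renderer) are assumptions made for this example, as is the choice of m.text for the message type. The body, format and formatted_body fields and the org.matrix.custom.html format value come from the Matrix message event format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// messageContent mirrors the relevant fields of a Matrix m.room.message event.
type messageContent struct {
	MsgType       string `json:"msgtype"`
	Body          string `json:"body"`
	Format        string `json:"format,omitempty"`
	FormattedBody string `json:"formatted_body,omitempty"`
}

// buildContent is a hypothetical helper: render stands in for whatever
// Markdown-to-HTML function is used when SkipMarkdown is false.
func buildContent(text string, skipMarkdown bool, render func(string) string) messageContent {
	content := messageContent{MsgType: "m.text", Body: text}
	if !skipMarkdown {
		// Clients that understand HTML show formatted_body instead of body,
		// so any "\n" inside a rendered paragraph no longer acts as a line break.
		content.Format = "org.matrix.custom.html"
		content.FormattedBody = render(text)
	}
	return content
}

func main() {
	// Placeholder renderer that just wraps the text in one paragraph.
	fakeRender := func(s string) string { return "<p>" + s + "</p>" }

	for _, skip := range []bool{true, false} {
		out, err := json.MarshalIndent(buildContent("row 1\nrow 2", skip, fakeRender), "", "  ")
		if err != nil {
			panic(err)
		}
		fmt.Println(string(out))
	}
}
```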

In any case, this whole thing makes me wonder whether we should have an option for simply relaying the HTML part of an email as-is. Right now we're pretty insistent on turning it into plain-text. I'm not sure how useful that is though.. Perhaps you (or others) could share some feedback on what kind of stuff you'd like to send. In the end, relaying HTML as-is may not work great either, as riot-web (or other clients) may choke on more complicated parts.

spantaleev, Aug 13 '19 06:08

Yep, that’s the problem. I must admit I am not entirely sure if the problem is in enmime or blackfriday.

I have no idea which subset of HTML riot is willing to render. But obviously passing the table through as-is might look much nicer.

maralorn, Aug 13 '19 09:08

To elaborate on that: if enmime emits just one line break where a rendered line break is actually intended, that will look fine in the plain-text version. But a Markdown renderer such as blackfriday will assume that two lines without a blank line in between belong to the same paragraph and drop the newline. So first using an html2text library and then treating the resulting text as Markdown might be the wrong approach.
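One possible alternative, sketched here with only the Go standard library, is to skip the Markdown step for such text and turn single newlines into <br> tags directly. The function plainTextToHTML is hypothetical and is not part of email2matrix, enmime or blackfriday; it treats blank lines as paragraph boundaries and every other newline as a hard line break.

```go
package main

import (
	"fmt"
	"html"
	"regexp"
	"strings"
)

// blankLines matches one or more blank lines, i.e. a paragraph boundary.
var blankLines = regexp.MustCompile(`\n\s*\n`)

// plainTextToHTML converts plain text to HTML while keeping every line break:
// blank lines separate <p> paragraphs, single newlines become <br>.
func plainTextToHTML(text string) string {
	text = strings.TrimSpace(strings.ReplaceAll(text, "\r\n", "\n"))
	if text == "" {
		return ""
	}
	var paragraphs []string
	for _, p := range blankLines.Split(text, -1) {
		p = strings.TrimSpace(p)
		if p == "" {
			continue
		}
		// Escape first so that literal "<" or "&" in the text stay visible.
		escaped := html.EscapeString(p)
		paragraphs = append(paragraphs, "<p>"+strings.ReplaceAll(escaped, "\n", "<br>")+"</p>")
	}
	return strings.Join(paragraphs, "\n")
}

func main() {
	// Made-up down-converted "table": one row per line, then a footer paragraph.
	text := "alert | severity\ndisk full | critical\n\nSent by the monitoring system"
	fmt.Println(plainTextToHTML(text))
}
```

The trade-off is that any intentional Markdown in the email would no longer be interpreted, so this would only suit messages that are known to be plain text.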

maralorn avatar Aug 13 '19 09:08 maralorn