Line breaks inside format tags

Open q00u opened this issue 5 years ago • 5 comments

HTML source:

<p><strong><em>Strong+Em text starts fine, but line break &lt;br&gt; tags break everything<br><br>

Linebreaks before the closing tags move the closing markdown to a different line.<br><br>

The resulting markdown does not render correctly.</strong></em><br><br>

<strong><em>Addition closing/opening strong+em tags can make things...<br>
</em></strong>

<strong><em>even worse.<br><br>
</em></strong>

So, maybe close out markdown before a &lt;br&gt; tag?</p>

Expected render:

Strong+Em text starts fine, but line break <br> tags break everything

Linebreaks before the closing tags move the closing markdown to a different line.

The resulting markdown does not render correctly.


Addition closing/opening strong+em tags can make things...
even worse.

So, maybe close out markdown before a <br> tag?

Turndown output:

**_Strong+Em text starts fine, but line break <br> tags break everything  
  
Linebreaks before the closing tags move the closing markdown to a different line.  
  
The resulting markdown does not render correctly._**  
  
**_Addition closing/opening strong+em tags can make things...  
_****_even worse.  
  
_**So, maybe close out markdown before a <br> tag?

Resulting render:

**_Strong+Em text starts fine, but line break
tags break everything

Linebreaks before the closing tags move the closing markdown to a different line.

The resulting markdown does not render correctly._**

**_Addition closing/opening strong+em tags can make things...
_****_even worse.

_**So, maybe close out markdown before a
tag?
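The suggestion in the sample text (close out the markdown before a trailing <br>) could be sketched as a small string helper. This is only an illustration, not Turndown's actual behaviour: `wrapTrimmed` is a hypothetical name, and it only moves leading and trailing whitespace (including converted line breaks) outside the delimiters. Breaks in the middle of the content are left as they are.

```javascript
// Hypothetical helper, not part of Turndown.
// Wraps `content` in `delimiter` (e.g. "**" or "_"), keeping any leading or
// trailing whitespace outside the delimiters so that the closing marker
// stays attached to the last visible character.
function wrapTrimmed(content, delimiter) {
  const [, lead, body, trail] = content.match(/^(\s*)([\s\S]*?)(\s*)$/);
  if (!body) {
    // Nothing but whitespace: emit no delimiters at all.
    return lead + trail;
  }
  return lead + delimiter + body + delimiter + trail;
}

// The converted content of <em>even worse.<br><br>\n</em> ends in line breaks;
// the closing "_" is placed before them instead of after.
const wrapped = wrapTrimmed("even worse.  \n  \n", "_");
console.log(JSON.stringify(wrapped));
```

A helper like this could be called from a custom replacement function for strong/em elements, assuming the converted inner content is available as a string.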

q00u avatar Aug 26 '18 04:08 q00u

Thanks. This is a duplicate of #123

domchristie avatar Aug 26 '18 10:08 domchristie

Just noticed that's a PR, so will reopen this issue.

domchristie avatar Aug 26 '18 10:08 domchristie

Another possible solution would be to preserve <br> elements when contained in an inline element. It seems valid: https://spec.commonmark.org/dingus/?text=hello%3Cbr%3Eworld
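A rough sketch of what that could look like as a custom rule, assuming Turndown's `addRule(key, { filter, replacement })` API. The names `INLINE_TAGS`, `hasInlineAncestor`, and `inlineBrRule` are made up for illustration, and the list of inline tags is an assumption that may need extending.

```javascript
// Hypothetical rule: keep <br> as literal HTML when it sits inside an
// inline formatting element, instead of converting it to a markdown break.
const INLINE_TAGS = new Set(["STRONG", "B", "EM", "I"]);

// Walk up from `node` and report whether any ancestor is an inline
// formatting element from INLINE_TAGS.
function hasInlineAncestor(node) {
  for (let parent = node.parentNode; parent; parent = parent.parentNode) {
    if (INLINE_TAGS.has(parent.nodeName)) return true;
  }
  return false;
}

const inlineBrRule = {
  // Match only <br> elements nested in strong/em (or b/i).
  filter: (node) => node.nodeName === "BR" && hasInlineAncestor(node),
  // Emit the tag unchanged so the emphasis markers stay on one line.
  replacement: () => "<br>",
};

// Registration, assuming `turndownService` is an existing TurndownService instance:
// turndownService.addRule("inlineBr", inlineBrRule);
```

Any <br> outside those elements would not match the filter and would still be handled by the default line break rule.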

domchristie avatar Aug 26 '18 10:08 domchristie

I'm also struggling with this issue while using turndown with medium editor. Usually it is not really a problem, but in combination with GFM and tables it can mess up the entire layout.

underdoeg avatar Aug 28 '18 19:08 underdoeg

In case it's helpful to someone, I'm fixing this in a project of mine by using cheerio to strip the br tags from the element, and then wrapping the whole thing in a p tag before parsing it with turndown. It's older syntax but it's working for my purposes.

const cheerio = require("cheerio");

// `content` is the HTML string that will later be passed to turndown.
const $ = cheerio.load(content);

// Handle <strong> first, then <em>: wrap any element that contains a <br>
// in a <p>, then strip the <br> tags from inside it.
for (const tag of ["strong", "em"]) {
    $(tag).each((_, ele) => {
        const $ele = $(ele);
        if ($ele.find("br").length) {
            $ele.wrap($("<p></p>"));
            $ele.find("br").remove();
        }
    });
}

alana314 avatar Feb 10 '20 22:02 alana314