turndown
Line breaks inside format tags
HTML source:
<p><strong><em>Strong+Em text starts fine, but line break <br> tags break everything<br><br>
Linebreaks before the closing tags move the closing markdown to a different line.<br><br>
The resulting markdown does not render correctly.</strong></em><br><br>
<strong><em>Addition closing/opening strong+em tags can make things...<br>
</em></strong>
<strong><em>even worse.<br><br>
</em></strong>
So, maybe close out markdown before a <br> tag?</p>
Expected render:
Strong+Em text starts fine, but line break <br> tags break everything
Linebreaks before the closing tags move the closing markdown to a different line.
The resulting markdown does not render correctly.
Addition closing/opening strong+em tags can make things...
even worse.
So, maybe close out markdown before a <br> tag?
Turndown output:
**_Strong+Em text starts fine, but line break <br> tags break everything
Linebreaks before the closing tags move the closing markdown to a different line.
The resulting markdown does not render correctly._**
**_Addition closing/opening strong+em tags can make things...
_****_even worse.
_**So, maybe close out markdown before a <br> tag?
Resulting render:
**_Strong+Em text starts fine, but line break
tags break everythingLinebreaks before the closing tags move the closing markdown to a different line.
The resulting markdown does not render correctly._**
**_Addition closing/opening strong+em tags can make things...
_****_even worse._**So, maybe close out markdown before a
tag?
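One way to read the "close out markdown before a <br> tag" idea is as a preprocessing step on the HTML, before turndown sees it. The sketch below is only an illustration: `hoistTrailingBreaks` is a hypothetical helper, not part of turndown, and it assumes the markup is regular enough for a regular expression. It handles only `<br>` tags that sit directly before closing `strong`/`em`/`b`/`i` tags. Breaks in the middle of the formatted text are left where they are.

```javascript
// Hypothetical helper (not part of turndown): moves <br> tags that sit directly
// before closing inline-format tags to just after those closing tags.
function hoistTrailingBreaks(html) {
  // Group 1: one or more <br> tags (with optional whitespace between them).
  // Group 2: one or more closing </strong>, </em>, </b> or </i> tags.
  const pattern = /((?:<br\s*\/?>\s*)+)((?:<\/(?:strong|em|b|i)>\s*)+)/gi;
  return html.replace(pattern, (_match, breaks, closers) => closers + breaks);
}

const input = "<strong><em>even worse.<br><br>\n</em></strong>";
console.log(JSON.stringify(hoistTrailingBreaks(input)));
// prints "<strong><em>even worse.</em></strong><br><br>\n"
```

With the breaks outside the inline elements, the closing `_**` should end up on the same line as the text it closes.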
Thanks. This is a duplicate of #123
Just noticed that's a PR, so will reopen this issue.
Another possible solution would be to preserve <br> elements when contained in an inline element. It seems valid: https://spec.commonmark.org/dingus/?text=hello%3Cbr%3Eworld
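A rough sketch of that idea as a custom turndown rule, for anyone who wants to try it. The rule object `keepInlineBr`, the helper `hasInlineFormatAncestor`, and the list of tags treated as inline formatting are all assumptions made for illustration. Only `addRule`, `filter`, and `replacement` come from turndown's documented rule API.

```javascript
// Tags treated as inline formatting in this sketch (an assumption; extend as needed).
const INLINE_FORMAT_TAGS = new Set(["STRONG", "EM", "B", "I"]);

// Hypothetical helper: true if any ancestor of `node` is an inline-format element.
function hasInlineFormatAncestor(node) {
  for (let parent = node.parentNode; parent; parent = parent.parentNode) {
    if (INLINE_FORMAT_TAGS.has(parent.nodeName.toUpperCase())) return true;
  }
  return false;
}

// Rule object in the shape turndown's addRule expects: a filter and a replacement.
const keepInlineBr = {
  // Match only <br> elements nested inside inline formatting.
  filter: (node) =>
    node.nodeName.toUpperCase() === "BR" && hasInlineFormatAncestor(node),
  // Emit the tag as raw HTML instead of a Markdown line break.
  replacement: () => "<br>",
};

// Usage, assuming turndown is installed:
//   const TurndownService = require("turndown");
//   const turndownService = new TurndownService();
//   turndownService.addRule("keepInlineBr", keepInlineBr);
```

Any `<br>` outside `strong`/`em`/`b`/`i` does not match the filter, so it still goes through turndown's normal line-break handling.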
I'm also struggling with this issue while using turndown with medium editor. Usually it's not really a problem, but in combination with gfm and tables it can mess up the entire layout.
In case it's helpful to someone, I'm fixing this in a project of mine by using cheerio to strip the br tags from the element, and then wrapping the whole thing in a p tag before parsing it with turndown. It's older syntax but it's working for my purposes.
const cheerio = require("cheerio");

// `content` is the HTML string to clean up before handing it to turndown.
const $ = cheerio.load(content);

// For each element matching `selector`: if it contains a <br>, wrap the
// element in a <p> and remove every <br> inside it.
function stripBreaks(selector) {
  $(selector).each((_index, ele) => {
    const breaks = $(ele).find("br");
    if (breaks.length) {
      $(ele).wrap("<p></p>");
      breaks.remove();
    }
  });
}

// Two separate passes: <strong> elements first, then <em> elements.
stripBreaks("strong");
stripBreaks("em");
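The workaround above drops the line breaks altogether. If they need to survive, another option might be to override the strong and emphasis rules so the delimiters are closed and reopened on every line. This is an untested-against-turndown sketch under a few assumptions: `wrapEachLine`, `strongPerLine`, and `emPerLine` are hypothetical names, and it assumes turndown's default `<br>` output of two trailing spaces plus a newline. The `strongDelimiter` and `emDelimiter` options are turndown's own.

```javascript
// Hypothetical helper: applies `delimiter` to each non-blank line of `content`
// separately, keeping leading/trailing whitespace (and so the hard line
// breaks) outside the delimiters.
function wrapEachLine(content, delimiter) {
  return content
    .split("\n")
    .map((line) => {
      const text = line.trim();
      if (text === "") return line; // blank or whitespace-only line: leave as-is
      const leading = line.slice(0, line.length - line.trimStart().length);
      const trailing = line.slice(line.trimEnd().length);
      return leading + delimiter + text + delimiter + trailing;
    })
    .join("\n");
}

// Rule objects in the shape turndown's addRule expects.
const strongPerLine = {
  filter: ["strong", "b"],
  replacement: (content, _node, options) =>
    wrapEachLine(content, options.strongDelimiter),
};
const emPerLine = {
  filter: ["em", "i"],
  replacement: (content, _node, options) =>
    wrapEachLine(content, options.emDelimiter),
};

// Usage, assuming turndown is installed:
//   const turndownService = new (require("turndown"))();
//   turndownService.addRule("strongPerLine", strongPerLine);
//   turndownService.addRule("emPerLine", emPerLine);
```

For nested strong+em content such as `a  \n  \nb`, the inner rule would produce `_a_  \n  \n_b_` and the outer one `**_a_**  \n  \n**_b_**`, so no delimiter is left stranded on a line of its own.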