html-to-markdown

Line breaks inside tag

Open • multiwebinc opened this issue 3 years ago • 3 comments

Version(s) affected

5.0.2

Description

Line breaks inside tags produce incorrect markdown

How to reproduce

HTML:

<b>Hello<br><br>World</b>

Output:

**Hello  
  
World**

Expected output:

**Hello**
  
**World**

multiwebinc • Nov 18 '21 23:11

This is an interesting case that could have three possible desired outputs based on one's philosophy of how this library should work.

You've already illustrated one case: you expect the library to produce Markdown that, when converted back to HTML, looks similar to readers but has different underlying HTML:

<p><strong>Hello</strong></p>
<p><strong>World</strong></p>
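
To make this first option concrete, here is a minimal Python sketch of closing and reopening the emphasis at each line break. It is not how html-to-markdown is implemented; `split_bold_on_breaks` is a hypothetical helper, and it assumes the contents of the `<b>` tag are plain text plus `<br>` tags only.

```python
import re

# Hypothetical helper for illustration only; not part of html-to-markdown.
# Matches <br>, <br/> and <br /> in any letter case.
BR = re.compile(r"<br\s*/?>", re.IGNORECASE)


def split_bold_on_breaks(inner_html: str) -> str:
    """Wrap each <br>-separated segment of a bold run in its own ** pair."""
    segments = [part.strip() for part in BR.split(inner_html)]
    # Consecutive <br> tags yield empty segments; skip them so we never emit "****".
    paragraphs = [f"**{part}**" for part in segments if part]
    # A blank line between segments turns each one into its own paragraph.
    return "\n\n".join(paragraphs)


print(split_bold_on_breaks("Hello<br><br>World"))
```

Note that this sketch treats a single `<br>` the same as a double one, so it always produces separate paragraphs. A real fix would need to decide whether one `<br>` should stay a line break inside a paragraph.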

Another philosophy would be that this library should strive to produce Markdown like this:

**Hello<br><br>World**

Which converts back into virtually-identical HTML like this:

<p><strong>Hello<br><br>World</strong></p>

(That is the approach that I personally prefer)

Lastly, there's a third philosophy that's kind of a hybrid of the two which would give Markdown like this:

**Hello\
\
World**

This produces:

<p><strong>Hello<br />
<br />
World</strong></p>

But I don't think that's something anyone would really want or expect :)
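
For comparison, the hybrid option can be sketched in a few lines of Python. Again, this is an illustration under assumptions rather than the library's code: `bold_with_backslash_breaks` is a hypothetical name, and the input is assumed to be the plain-text-plus-`<br>` contents of a single `<b>` tag.

```python
import re

# Hypothetical helper for illustration only; not part of html-to-markdown.
# Matches <br>, <br/> and <br /> in any letter case.
BR = re.compile(r"<br\s*/?>", re.IGNORECASE)


def bold_with_backslash_breaks(inner_html: str) -> str:
    """Keep one ** pair and turn every <br> into a backslash hard break."""
    # A backslash at the end of a line is a hard line break in CommonMark.
    # A function is used as the replacement so the backslash is inserted literally.
    body = BR.sub(lambda _match: "\\\n", inner_html)
    return f"**{body}**"


print(bold_with_backslash_breaks("Hello<br><br>World"))
```

The emphasis stays a single run here, which is why the round-tripped HTML keeps one `<strong>` element instead of two.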

Regardless, I agree that this is a bug.

colinodell • Nov 19 '21 15:11

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] • Mar 02 '22 23:03

This should probably be reopened.

multiwebinc • Apr 18 '22 23:04