inline tags within strong/strike/u/i/em/... do not handle spaces correctly

Open mrh1997 opened this issue 2 years ago • 0 comments

Some inline tags require removing the space between the text and the Markdown marker (<strong>TEXT </strong> => **TEXT**).

When another inline tag is nested inside one of these tags, the space removal no longer works properly:

from html2text import html2text
print(html2text("<b><a>X</a></b>"))    # returns "**X**" => OK
print(html2text("<b>X </b>"))          # returns "**X**" => OK
print(html2text("<b><a>X</a> </b>"))   # returns "**X **" => INVALID (expected "**X**")
print(html2text("<b>X <a>Y</a></b>"))  # returns "**XY**" => INVALID (expected "**X Y**")
  • Version by html2text --version: 2020.1.16
  • Python version python --version: 3.6.3
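
To make the expected behaviour concrete, here is a minimal sketch of the whitespace handling I would expect for emphasis tags. It is not html2text's actual code: `wrap_emphasis` is a hypothetical helper that takes the already-converted inner text of a tag, and it assumes that whitespace at the edges should be moved outside the markers while whitespace in the middle is preserved.

```python
def wrap_emphasis(inner: str, marker: str = "**") -> str:
    """Wrap `inner` in `marker`, keeping edge whitespace outside the markers.

    Hypothetical helper for illustration only; not part of html2text.
    """
    stripped = inner.strip()
    if not stripped:
        # Nothing to emphasize: return the whitespace unchanged.
        return inner
    # Split off the leading and trailing whitespace runs.
    leading = inner[: len(inner) - len(inner.lstrip())]
    trailing = inner[len(inner.rstrip()):]
    # Markers hug the text; edge whitespace ends up outside them.
    return f"{leading}{marker}{stripped}{marker}{trailing}"


print(repr(wrap_emphasis("X ")))   # trailing space moved outside the markers
print(repr(wrap_emphasis("X Y")))  # inner space between X and Y is kept
```

With this approach the inner text of `<b><a>X</a> </b>` ("X ") becomes `**X** `, and the inner text of `<b>X <a>Y</a></b>` ("X Y") becomes `**X Y**`. A trailing space at the very end of the output would presumably be stripped later, giving the `**X**` expected above.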

mrh1997, Oct 12 '21 16:10