
Feature request: Output without markdown

Open sowinski opened this issue 2 years ago • 3 comments

Hi,

I found this library because I want to do html => text. Unfortunately, the library is doing html => markdown.

I haven't seen anything in the docs. Is it possible to disable the markdown output and just get plain text?

Regards Philipp

sowinski avatar Mar 25 '22 13:03 sowinski

Same here. Would be great :)

Cabu avatar Aug 09 '22 18:08 Cabu

Several solutions exist and can be found by searching for markdown2text etc. on https://pypi.org/. Perhaps this issue is therefore out of scope (but that is not up to me to decide).

PanderMusubi avatar Aug 24 '23 20:08 PanderMusubi

I was about to write what @sowinski wrote.

Running html2text and then a markdown2text step would not get the job done, because the markdown regexes remove some bytes they treat as useless and escape characters. Example:

 -      foo
 -      bar
 -      baz

Will be converted into

\- foo
\- bar
\- baz

This loses the original number of spaces. No further processing step, such as the one I think @PanderMusubi suggests, can regenerate the information that was lost (the spaces).

lesnake avatar Jan 10 '24 14:01 lesnake
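
For readers who land here looking for html => plain text with no markdown escaping, one possible workaround is to bypass the markdown step entirely. Below is a minimal sketch that uses only Python's standard-library `html.parser`; it is not part of html2text and not a suggestion from the maintainers. The names `PlainTextExtractor` and `html_to_plain_text` are hypothetical, and the choice of which tags produce line breaks is an assumption made for illustration.

```python
from html.parser import HTMLParser


class PlainTextExtractor(HTMLParser):
    """Hypothetical helper: collects text nodes verbatim, without markdown."""

    # Assumption: these tags should become line breaks in the output.
    BLOCK_TAGS = {"p", "div", "br", "li", "tr", "pre",
                  "h1", "h2", "h3", "h4", "h5", "h6"}
    # Content inside these tags is dropped entirely.
    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        elif tag in self.BLOCK_TAGS:
            self._chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag in self.BLOCK_TAGS:
            self._chunks.append("\n")

    def handle_data(self, data):
        # Text is appended as-is, so runs of spaces are not collapsed
        # and characters such as "-" are not escaped.
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        return "".join(self._chunks)


def html_to_plain_text(html):
    parser = PlainTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Because text nodes are copied unchanged, input such as `<pre> -      foo</pre>` keeps both its leading spaces and its unescaped dash. The sketch makes no attempt to render links, tables, or nested lists, so it trades html2text's layout handling for fidelity to the original characters.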