html2text
Feature request: Output without Markdown
Hi,
I found this library because I want to convert html => text. Unfortunately, the library converts html => markdown.
I haven't seen anything about this in the docs. Is it possible to disable the Markdown output and just get plain text?
Regards Philipp
Same here. Would be great :)
Several solutions exist and can be found by searching https://pypi.org/ for markdown2text and similar. Perhaps this issue is therefore out of scope (but that is not up to me to decide).
I was about to write what @sowinski wrote.
Running html2text and then a markdown2text step would not get the job done, because the Markdown regexes remove some "useless" bytes, escape characters, and so on. For example:
- foo
- bar
- baz
Will be converted into
\- foo
\- bar
\- baz
This loses the original number of leading spaces. No further processing step, such as the one I think @PanderMusubi is suggesting, can regenerate the information that was lost (the spaces).
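For illustration, here is a minimal sketch of going from HTML to plain text directly, without a Markdown step in between. It uses only `html.parser` from the Python standard library and is not part of html2text. The names `PlainTextExtractor`, `html_to_plain_text`, `BLOCK_TAGS` and `SKIP_TAGS` are made up for this example, and the set of block-level tags is an assumption, not a complete list.

```python
from html.parser import HTMLParser


class PlainTextExtractor(HTMLParser):
    """Collect text nodes only; no Markdown syntax or escaping is added."""

    # Closing one of these tags ends a line (illustrative, not exhaustive).
    BLOCK_TAGS = {"p", "div", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}
    # Content of these tags is not visible text and is skipped.
    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        elif tag == "br":
            self._chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag in self.BLOCK_TAGS:
            self._chunks.append("\n")

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Trim trailing whitespace only, so leading spaces survive,
        # and drop lines that are empty.
        lines = (line.rstrip() for line in "".join(self._chunks).splitlines())
        return "\n".join(line for line in lines if line)


def html_to_plain_text(html):
    parser = PlainTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    print(html_to_plain_text("<pre>  - foo\n  - bar\n  - baz</pre>"))
```

Because the text nodes are copied as they are, a literal `- foo` is neither escaped nor stripped of its leading spaces. The sketch does not collapse whitespace the way a browser would, and it ignores links, tables and images entirely, so it is a starting point rather than a replacement for a real converter.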