python-readability
Orphan links in doc.summary()
Hi,
a user ran into this bug: https://github.com/adbar/trafilatura/issues/21
Some links end up orphaned between paragraphs instead of staying inside them, which breaks text rendering and conversion. The problem originates in the output of readability-lxml:
<p>Среди жанров многопользовательских игр MMOFPS занимают одну из лидирующих позиций, наряду с </p><a href="https://gametarget.ru/mmorpg/">MMORPG</a><p> и </p><a href="https://gametarget.ru/feature/moba/">MOBA</a><p>. (https://gametarget.ru/mmofps/)
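In case it helps with triage, here is a minimal sketch of how such orphan links could be detected in a fragment. It relies only on the standard library's `html.parser` and does not call readability-lxml itself. The names `OrphanLinkFinder` and `find_orphan_links` are hypothetical helpers, and the sample string is a shortened stand-in for the output above, not the real page.

```python
from html.parser import HTMLParser


class OrphanLinkFinder(HTMLParser):
    """Collect href values of <a> tags opened while no <p> is open."""

    def __init__(self):
        super().__init__()
        self.open_paragraphs = 0  # number of currently open <p> tags
        self.orphans = []         # hrefs of links found outside any <p>

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.open_paragraphs += 1
        elif tag == "a" and self.open_paragraphs == 0:
            # attrs is a list of (name, value) pairs
            self.orphans.append(dict(attrs).get("href"))

    def handle_endtag(self, tag):
        if tag == "p" and self.open_paragraphs > 0:
            self.open_paragraphs -= 1


def find_orphan_links(fragment):
    """Hypothetical helper: return hrefs of links sitting outside paragraphs."""
    finder = OrphanLinkFinder()
    finder.feed(fragment)
    finder.close()
    return finder.orphans


# Shortened stand-in for the problematic output (assumption, not the real page).
sample = (
    '<p>Among the genres, </p>'
    '<a href="https://gametarget.ru/mmorpg/">MMORPG</a>'
    '<p> and </p>'
    '<a href="https://gametarget.ru/feature/moba/">MOBA</a>'
    '<p>.</p>'
)
print(find_orphan_links(sample))
```

This only flags the symptom. It treats any `<a>` outside a `<p>` as an orphan, so links that legitimately live in other block elements would also be reported.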
Could you please have a look at it?