Orphan links in doc.summary()

Open adbar opened this issue 5 years ago • 0 comments

Hi,

a user ran into this bug: https://github.com/adbar/trafilatura/issues/21. Some links end up as orphans between paragraphs, outside any `<p>` element, which breaks text rendering and conversion. The problem comes from the output of readability-lxml:

<p>Среди жанров многопользовательских игр MMOFPS занимают одну из лидирующих позиций, наряду с </p><a href="https://gametarget.ru/mmorpg/">MMORPG</a><p> и </p><a href="https://gametarget.ru/feature/moba/">MOBA</a><p>. (https://gametarget.ru/mmofps/)
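To make the symptom easy to check, here is a minimal sketch that flags `<a>` elements opened outside any `<p>`. It uses only the standard-library `html.parser` and does not call readability-lxml itself; the names `OrphanLinkFinder` and `find_orphan_links` are hypothetical helpers, and the sample string is a shortened version of the output above with a closing `</p>` added as an assumption.

```python
from html.parser import HTMLParser


class OrphanLinkFinder(HTMLParser):
    """Collect href values of <a> tags that open outside any <p>."""

    def __init__(self):
        super().__init__()
        self.p_depth = 0   # number of currently open <p> elements
        self.orphans = []  # hrefs of links found outside a paragraph

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.p_depth += 1
        elif tag == "a" and self.p_depth == 0:
            # Link starts while no paragraph is open: treat it as an orphan.
            self.orphans.append(dict(attrs).get("href"))

    def handle_endtag(self, tag):
        if tag == "p" and self.p_depth > 0:
            self.p_depth -= 1


def find_orphan_links(html):
    """Return the hrefs of all links that sit outside <p> elements."""
    finder = OrphanLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.orphans


# Shortened sample; the final </p> is assumed, not taken from the report.
sample = (
    '<p>наряду с </p><a href="https://gametarget.ru/mmorpg/">MMORPG</a>'
    '<p> и </p><a href="https://gametarget.ru/feature/moba/">MOBA</a><p>.</p>'
)
print(find_orphan_links(sample))
```

Both links in the sample are reported, since each one sits between a closing `</p>` and the next `<p>`. A link nested inside a paragraph would not be.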

Could you please have a look at it?

adbar, Oct 05 '20 13:10