newspaper4k
Make lxml requirement less restrictive
Since lxml version 5.2.0, lxml.html.clean (required by newspaper) has been extracted into a separate library. Depending on lxml with the [html_clean] extra pulls in that library, which allows lxml versions >= 5.2.0 to be used (older versions ignore the extra).
Proposed Changes:
Remove the upper bound on lxml by adding the [html_clean] extra, so that newer versions of lxml can be used together with newspaper4k.
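To check the effect of this change in a given environment, something like the following sketch may help. It only tests whether lxml.html.clean can be imported, which should hold when lxml is installed as `lxml[html_clean]` (or when lxml is older than 5.2.0). The helper name `cleaner_available` is hypothetical and not part of newspaper4k or lxml.

```python
import importlib


def cleaner_available() -> bool:
    """Return True if lxml.html.clean can be imported in this environment.

    Hypothetical helper for illustration only. Without the separate
    cleaner package, lxml >= 5.2.0 is expected to raise ImportError here;
    a missing lxml raises ModuleNotFoundError, which is a subclass of it.
    """
    try:
        importlib.import_module("lxml.html.clean")
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    if cleaner_available():
        print("lxml.html.clean is importable")
    else:
        print("lxml.html.clean is missing; try installing lxml[html_clean]")
```

If this reports the module as missing on lxml >= 5.2.0, installing lxml with the [html_clean] extra is the likely fix.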