newspaper4k
python -m newspaper --url="https://edition.cnn.com/2023/11/17/success/job-seekers-use-ai/index.html" --language=en --output-format=json --output-file=article.json
Traceback (most recent call last):
  File "<frozen runpy>", line 189, in _run_module_as_main
  File "<frozen runpy>", line 148, in _get_module_details
  File "<frozen runpy>", line 112, in _get_module_details
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/newspaper/__init__.py", line 17, in <module>
    from .api import (
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/newspaper/api.py", line 10, in <module>
    import newspaper.parsers as parsers
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/newspaper/parsers.py", line 18, in <module>
    import lxml.html.clean
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/lxml/html/clean.py", line 18, in <module>
    raise ImportError(
ImportError: lxml.html.clean module is now a separate project lxml_html_clean. Install lxml[html_clean] or lxml_html_clean directly.
I'm getting this error on Python 3.11.8.
I think this is related to #639. The pull request by @changchiyou should fix this.
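In the meantime, the ImportError itself suggests installing `lxml[html_clean]` or `lxml_html_clean` directly (presumably via `pip install lxml_html_clean`, though I have not confirmed this resolves it for every setup). A minimal sketch for checking whether the separate package is visible to your interpreter; the helper name `missing_clean_module` is made up for illustration and is not part of newspaper4k or lxml:

```python
import importlib.util


def missing_clean_module() -> bool:
    """Return True if the standalone lxml_html_clean package is not importable.

    Hypothetical helper for diagnosis only; the package name comes from
    the ImportError message in the traceback above.
    """
    return importlib.util.find_spec("lxml_html_clean") is None


if missing_clean_module():
    # The install hint mirrors the wording of the ImportError message.
    print("lxml_html_clean not found; try: pip install lxml_html_clean")
else:
    print("lxml_html_clean is installed")
```

Run it with the same Python that raised the traceback (here, the 3.11 framework install), since a different interpreter may have a different site-packages.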