html2csv
Let BeautifulSoup guess the input codec
Unlike pathlib's read_text(), which decodes with one fixed encoding, BeautifulSoup can detect and handle several text encodings, so we hand it the raw input and let it pick the codec.
Addresses issue #5
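The diff itself is not reproduced here. As a rough sketch of the idea, assuming the change amounts to reading the file as bytes and passing those bytes to BeautifulSoup (the helper name `load_soup` and the demo file are hypothetical, not taken from html2csv):

```python
import tempfile
from pathlib import Path

from bs4 import BeautifulSoup  # third-party package: beautifulsoup4


def load_soup(path: Path) -> BeautifulSoup:
    """Hypothetical helper: parse an HTML file without decoding it first."""
    # read_bytes() performs no decoding, so it cannot raise UnicodeDecodeError.
    raw = path.read_bytes()
    # Given bytes, BeautifulSoup detects the encoding itself and records
    # its choice in soup.original_encoding.
    return BeautifulSoup(raw, "html.parser")


# Demo input (an assumption for illustration): Latin-1 bytes with a declared
# charset. In Latin-1, "ó" is the single byte 0xf3.
html = (
    '<html><head><meta charset="iso-8859-1"></head>'
    "<body><table><tr><td>Canción</td></tr></table></body></html>"
)
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.html"
    demo.write_bytes(html.encode("latin-1"))
    soup = load_soup(demo)

cell_text = soup.td.get_text()
print(cell_text, soup.original_encoding)
```

Detection is a best-effort guess, so a file with no declared charset can still be decoded with the wrong codec; an explicit encoding option would be the fallback in that case.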
Any chance of getting this merged? This PR solved my problem reading a non-UTF-8 input:
Traceback (most recent call last):
  File "/opt/homebrew/bin/html2csv", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/html2csv/__main__.py", line 41, in main
    html_doc = path.read_text()
               ^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.3/Frameworks/Python.framework/Versions/3.11/lib/python3.11/pathlib.py", line 1059, in read_text
    return f.read()
           ^^^^^^^^
  File "<frozen codecs>", line 322, in decode
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf3 in position 532: invalid continuation byte
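For anyone wanting to reproduce this without the original file, the same class of error can be triggered with the standard library alone. This is only a sketch: the sample content and the Latin-1 encoding are assumptions, since the report does not say how the real input was encoded.

```python
import tempfile
from pathlib import Path

# Assumed sample input: in Latin-1, "ó" is the single byte 0xf3, the same
# byte value the traceback complains about.
data = "<td>Canción</td>".encode("latin-1")

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "sample.html"
    path.write_bytes(data)

    try:
        # Decoding as UTF-8 fails: 0xf3 starts a multi-byte sequence, but
        # the byte that follows is plain ASCII, not a continuation byte.
        path.read_text(encoding="utf-8")
        failed = False
        message = ""
    except UnicodeDecodeError as exc:
        failed = True
        message = str(exc)

    # Reading bytes never decodes, so it always succeeds.
    raw = path.read_bytes()

print(failed, message)
```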