
quotation marks (") or (') get converted to � symbol

Open honmashinsei opened this issue 5 years ago • 1 comments

Hi!

When I try to download the following website:

https://zerohplovecraft.wordpress.com/2020/01/12/nursery-rhyme-for-techno-industrial-society/

the quotation marks " and ' get converted to � .

When I try to download the following site:

https://zerohplovecraft.wordpress.com/2018/05/11/the-gig-economy-2/

I get a UnicodeEncodeError:

Traceback (most recent call last):
  File "C:\env\Scripts\webpage2html-script.py", line 11, in <module>
    load_entry_point('webpage2html==0.3.6', 'console_scripts', 'webpage2html')()
  File "c:\env\lib\site-packages\webpage2html.py", line 390, in main
    sys.stdout.write(rs)
  File "C:\Python38\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u1e25' in position 3300633: character maps to <undefined>

Can you please take a look at what might be causing this?

honmashinsei, Sep 23 '20 00:09

Both seem to be encoding-related issues. I just tried downloading the first page in my environment (a Linux machine with UTF-8 as the default encoding), and everything seems to work.

My suggestion is to try another environment, or to debug the encoding issue to find out what is causing it.

zTrix, Sep 27 '20 05:09
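
For anyone debugging this, here is a minimal sketch of how the two symptoms can be reproduced in isolation. It assumes what the traceback suggests: that `sys.stdout` on the reporter's Windows machine uses cp1252, which has no mapping for '\u1e25'. The helper names (`can_encode`, `force_utf8_stdout`, `decode_with_replacement`) are hypothetical and are not part of webpage2html. The last helper shows only one possible way a � can appear (cp1252 bytes decoded as UTF-8 with replacement); the thread does not confirm that this is what happens on the first page.

```python
import io
import sys

# The character named in the UnicodeEncodeError from the traceback.
CHAR_FROM_TRACEBACK = "\u1e25"


def can_encode(text, encoding):
    """Return True if `text` can be written to a text stream using `encoding`."""
    # An in-memory stream stands in for a console with the given encoding.
    stream = io.TextIOWrapper(io.BytesIO(), encoding=encoding)
    try:
        stream.write(text)
        stream.flush()
    except UnicodeEncodeError:
        return False
    return True


def force_utf8_stdout():
    """Switch sys.stdout to UTF-8 if the stream supports it (Python 3.7+)."""
    # reconfigure() exists on io.TextIOWrapper; a replaced stdout may lack it.
    if hasattr(sys.stdout, "reconfigure"):
        sys.stdout.reconfigure(encoding="utf-8")


def decode_with_replacement(raw):
    """Decode bytes as UTF-8, substituting U+FFFD for undecodable bytes."""
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print("cp1252 can encode it:", can_encode(CHAR_FROM_TRACEBACK, "cp1252"))
    print("utf-8 can encode it:", can_encode(CHAR_FROM_TRACEBACK, "utf-8"))
    # 0x93 and 0x94 are the cp1252 bytes for curly double quotes.
    print(ascii(decode_with_replacement(b"\x93quoted\x94")))
```

If the first check fails on your machine's console encoding, setting the environment variable `PYTHONIOENCODING=utf-8` before running the tool is one way to test whether the console encoding is the cause, without touching the tool's code.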