Output contains garbled characters

Open xxllp opened this issue 9 years ago • 5 comments

As the title says: after switching to a different web page, the printed result is garbled.

Sep 27 '16 07:09 xxllp

@xxllp What's the URL?

Sep 29 '16 07:09 rainyear

ext = Extractor(url="http://www.ahgd.gov.cn/web_content.php?id=14971", blockSize=5, image=False)
print(ext.getContext())

Oct 08 '16 08:10 xxllp

The output is indeed garbled. I switched to BeautifulSoup + html5lib to parse the page instead.

Oct 18 '16 10:10 klzsysy

Actually, I'm wondering why the output has no line breaks and not even any spaces.

Jan 18 '17 03:01 yingshaoxo

@klzsysy Set resp.encoding to the page's own encoding. The text is decoded as UTF-8 by default, so if your page is not UTF-8 the output will definitely be garbled.
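
The advice above can be illustrated with a minimal, standard-library-only sketch. It is not code from this project: `sniff_charset` and `decode_page` are hypothetical helpers, and the GBK sample page is made up for the demonstration. Assuming `resp` is a `requests.Response`, the equivalent step would be assigning the detected name to `resp.encoding` before reading `resp.text`.

```python
import codecs
import re

# Hypothetical helpers for illustration; not part of the library in this thread.
_META_CHARSET = re.compile(
    rb'<meta[^>]+charset=["\']?\s*([A-Za-z0-9_\-]+)', re.IGNORECASE
)


def sniff_charset(raw: bytes, default: str = "utf-8") -> str:
    """Return the charset declared in a <meta> tag near the top of the page."""
    match = _META_CHARSET.search(raw[:4096])
    if not match:
        return default
    name = match.group(1).decode("ascii").lower()
    try:
        codecs.lookup(name)  # make sure Python knows this codec
    except LookupError:
        return default
    return name


def decode_page(raw: bytes) -> str:
    """Decode page bytes with the declared charset instead of assuming UTF-8."""
    return raw.decode(sniff_charset(raw), errors="replace")


# A small GBK-encoded page standing in for a non-UTF-8 site (made-up sample).
page = (
    '<html><head><meta charset="gbk"></head>'
    "<body>中文内容</body></html>"
).encode("gbk")

garbled = page.decode("utf-8", errors="replace")  # what a UTF-8 assumption gives
fixed = decode_page(page)                         # decoded with the declared charset

print(garbled)
print(fixed)
```

The sketch only reads the `<meta>` declaration. Real pages may declare the charset in the HTTP `Content-Type` header instead, or declare it incorrectly, so a production version would need more fallbacks.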

Apr 14 '17 07:04 ljhzds