newspaper4k

newspaper cannot parse or encode error

Open AndyTheFactory opened this issue 2 years ago • 0 comments

Issue by Kyeongpil, Fri Feb 10 11:29:29 2017. Originally opened as https://github.com/codelucas/newspaper/issues/331


Hi, I am trying to crawl Korean newspapers.

Most news articles are crawled and parsed correctly, but for some articles, such as the one at the URL below, newspaper cannot parse the Korean text correctly. http://www.edaily.co.kr/news/newspath.asp?newsid=01610486605953456

After parsing the article, the result is as shown in the picture below. (The parsed title should be "금감원, 소비자단체와 금융현장서 민원 상담·구제".) [Screenshot: 2017-02-10 8 26 33]

As shown in a.title, I think newspaper has an encoding problem.

What do you think of this problem?
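
For illustration, here is a minimal sketch of how this kind of garbling can arise and one possible way around it. It uses only the Python standard library and assumes the page is served in a legacy Korean encoding such as EUC-KR and declares it in a `<meta>` tag; the issue does not confirm which encoding the site actually uses. The helper `decode_html` and the sample page are hypothetical and are not part of newspaper.

```python
import re


def decode_html(raw: bytes, fallback: str = "utf-8") -> str:
    """Decode raw HTML bytes using the charset declared in the markup, if any.

    Hypothetical helper for illustration; not part of newspaper.
    """
    # The charset declaration normally sits in the first few KB of the page.
    head = raw[:4096].decode("ascii", errors="ignore")
    match = re.search(r'charset\s*=\s*["\']?([\w-]+)', head, flags=re.IGNORECASE)
    encoding = match.group(1) if match else fallback
    try:
        return raw.decode(encoding)
    except (LookupError, UnicodeDecodeError):
        # Unknown or incorrect declaration: fall back and replace bad bytes.
        return raw.decode(fallback, errors="replace")


# Assumed sample page: the expected title from this issue, encoded as EUC-KR.
title = "금감원, 소비자단체와 금융현장서 민원 상담·구제"
page = f'<html><head><meta charset="euc-kr"><title>{title}</title></head></html>'
raw = page.encode("euc-kr")

# Decoding with the wrong charset yields unreadable text instead of Korean.
garbled = raw.decode("latin-1")

# Decoding with the declared charset recovers the title.
fixed = decode_html(raw)
```

If the cause really is a wrong charset guess, one possible workaround is to fetch the bytes yourself, decode them as above, and pass the resulting string to `Article.download(input_html=...)` before calling `parse()`. Whether that fixes this particular article is untested here.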

AndyTheFactory, Oct 24 '23 10:10