python-seo-analyzer

'utf-8' codec can't decode bytes in position 31608-31609

Open Uziel9999 opened this issue 4 years ago • 0 comments

I am using a Jupyter notebook to run the script. I used the example from this site, but with an actual company website. This is on Windows 10 with the latest version of Anaconda.

What am I doing incorrectly?

Input:

    from seoanalyzer import analyze

    site = 'http://www.site.com'
    sitemap = None

    output = analyze(site, sitemap)
    print(output)

Results:

    UnicodeDecodeError                        Traceback (most recent call last)
    in
          4 sitemap = None
          5
    ----> 6 output = analyze(site, sitemap)
          7 print(output)

    C:\ProgramData\Anaconda3\lib\site-packages\seoanalyzer\analyzer.py in analyze(url, sitemap_url)
         15     site = Website(url, sitemap_url)
         16
    ---> 17     site.crawl()
         18
         19     for p in site.crawled_pages:

    C:\ProgramData\Anaconda3\lib\site-packages\seoanalyzer\website.py in crawl(self)
         63                 continue
         64
    ---> 65             page.analyze()
         66
         67             self.content_hashes[page.content_hash].add(page.url)

    C:\ProgramData\Anaconda3\lib\site-packages\seoanalyzer\page.py in analyze(self, raw_html)
        170             return
        171         else:
    --> 172             raw_html = page.data.decode('utf-8')
        173
        174         self.content_hash = hashlib.sha1(raw_html.encode('utf-8')).hexdigest()

    UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 31608-31609: invalid continuation byte
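
For context, the last frame of the traceback shows the failure happening in `page.data.decode('utf-8')`, which suggests the crawled page contains bytes that are not valid UTF-8 (possibly a page served in a legacy Windows encoding, though that is only a guess). Below is a minimal sketch of a more tolerant decode. Note that `decode_html` is a hypothetical helper and not part of seoanalyzer, and the choice of `cp1252` as a fallback is an assumption about the site, not something confirmed here.

```python
def decode_html(data: bytes, encodings=("utf-8", "cp1252")) -> str:
    """Hypothetical helper: try each encoding in turn, then fall back to
    UTF-8 with replacement characters so decoding never raises."""
    for encoding in encodings:
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            # This encoding cannot represent the bytes; try the next one.
            continue
    # Last resort: keep what is decodable and mark bad bytes with U+FFFD.
    return data.decode("utf-8", errors="replace")


# Example: bytes that are valid cp1252 but not valid UTF-8.
text = decode_html(b"caf\xe9")
```

A fallback like this trades strictness for robustness: the text may contain a few wrong or replacement characters, but the crawl would not stop on a single badly encoded page.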

Uziel9999, Aug 04 '20 23:08