weiboSpider

line 1: b'ID C_5074630370399522 already defined' (line 1)

Open cinyearchan opened this issue 5 months ago • 1 comment

To help us resolve the problem, please answer the questions below carefully. Once the problem is solved, please close this issue promptly.

  • Q: Which version fails (GitHub version / PyPI version / all)?

A: PyPI version

  • Q: Are you using the latest version of the program (yes/no)?

A: Yes

  • Q: Does the error occur when crawling any user (yes/no)?

A: No

  • Q: If the error occurs only when crawling specific posts, can you provide the weibo_id or URL of the failing post (optional)?

A:

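  • Q: If you have already provided the weibo_id or URL of the failing post, skip this question; otherwise, can you provide the user_id of the failing account and your configured since_date, so that we can locate the failing post (optional)?

A: user_id 2492465520, since_date 2009-08-28, end_date now. Entry in user_id_list.txt: 2492465520 刘晓光_恶魔奶爸 2024-08-22 10:21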

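  • Q: If convenient, please describe the error in detail, preferably with the error message.

A: The following message appeared several times during a single crawl: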

line 1: b'ID C_5074630370399522 already defined' (line 1)
Traceback (most recent call last):
  File "/Users/xxx/.pyenv/versions/3.9.7/lib/python3.9/site-packages/weibo_spider/parser/util.py", line 42, in handle_html
    selector = etree.HTML(resp.content)
  File "src/lxml/etree.pyx", line 3170, in lxml.etree.HTML
  File "src/lxml/parser.pxi", line 1877, in lxml.etree._parseMemoryDocument
  File "src/lxml/parser.pxi", line 1765, in lxml.etree._parseDoc
  File "src/lxml/parser.pxi", line 1127, in lxml.etree._BaseParser._parseDoc
  File "src/lxml/parser.pxi", line 601, in lxml.etree._ParserContext._handleParseResultDoc
  File "src/lxml/parser.pxi", line 711, in lxml.etree._handleParseResult
  File "src/lxml/parser.pxi", line 649, in lxml.etree._raiseParseError
  File "<string>", line 1
lxml.etree.XMLSyntaxError: line 1: b'ID C_5074630370399522 already defined'
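
The message suggests that the fetched page carries the same id attribute (here C_5074630370399522) on more than one element. As a rough diagnostic sketch, not part of weibo_spider, the following uses only Python's standard library to list repeated id values in a saved page body. The helper `find_duplicate_ids` and the sample HTML are hypothetical and only illustrate the check:

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Count every id attribute value seen while parsing."""

    def __init__(self):
        super().__init__()
        self.ids = Counter()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        for name, value in attrs:
            if name == "id" and value:
                self.ids[value] += 1


def find_duplicate_ids(html_text):
    """Return a sorted list of id values that occur more than once."""
    collector = IdCollector()
    collector.feed(html_text)
    collector.close()
    return sorted(v for v, n in collector.ids.items() if n > 1)


# Hypothetical sample imitating a page with a repeated comment id.
sample = (
    '<div id="C_5074630370399522">a</div>'
    '<div id="C_5074630370399522">b</div>'
    '<div id="M_1">c</div>'
)
print(find_duplicate_ids(sample))  # ['C_5074630370399522']
```

Running this on the response body that triggers the exception would show whether a duplicated id is really present in the page, which helps separate a data problem from a parser-version problem.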

cinyearchan, Sep 04 '24 01:09