zhihu-python
question.get_top_i_answers() and question.get_all_answers() fail to fetch answers
Both question.get_top_i_answers() and question.get_all_answers() now return only 10 answers. I tested several questions, including the one in test: http://www.zhihu.com/question/24269892
`from zhihu import Question  # Question is defined in this repo's zhihu.py
url = "http://www.zhihu.com/question/24269892"
question = Question(url)
print question.get_answers_num()
q = question.get_top_i_answers(20)
for i in q:
    print i`
`<zhihu.Answer instance at 0x10441a1b8>
<zhihu.Answer instance at 0x115b378c0>
<zhihu.Answer instance at 0x1177319e0>
<zhihu.Answer instance at 0x1150dab00>
<zhihu.Answer instance at 0x117833b90>
<zhihu.Answer instance at 0x11587d368>
<zhihu.Answer instance at 0x1159a6e18>
<zhihu.Answer instance at 0x11698bf80>
<zhihu.Answer instance at 0x115614c68>
<zhihu.Answer instance at 0x1174d1170>
IndexError                                Traceback (most recent call last)
/Users/traveltao/Desktop/zhihu-python-master/zhihu.py in get_top_i_answers(self, n)
    460     j = 0
    461     answers = self.get_all_answers()
--> 462     for answer in answers:
    463         j = j + 1
    464         if j > n:

/Users/traveltao/Desktop/zhihu-python-master/zhihu.py in get_all_answers(self)
    342
    343     is_my_answer = False
--> 344     if soup.find_all("div", class_="zm-item-answer")[j].find("span", class_="count") == None:
    345         my_answer_count += 1
    346         is_my_answer = True

IndexError: list index out of range`
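The traceback suggests that the loop indexes into the list of answer nodes up to the reported answer count, while the page presumably contains only the first ten. A minimal sketch of guarding against that is shown here; `take_available` is a hypothetical helper, not part of zhihu.py, and plain strings stand in for the parsed answer nodes.

```python
def take_available(nodes, expected_count):
    """Return the nodes that are actually present, plus how many are missing.

    Slicing never raises IndexError, unlike nodes[j] with j past the end.
    """
    available = list(nodes)[:expected_count]
    missing = max(expected_count - len(available), 0)
    return available, missing


# Stand-in data: the page holds 10 answer nodes, the question reports 25.
page_nodes = ["answer-node-%d" % i for i in range(10)]
answers, missing = take_available(page_nodes, 25)
print(len(answers), missing)
```

This only avoids the crash. The missing answers still have to be fetched some other way.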
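I'm hitting the same problem. I tried both of the methods above; each one returns only the first ten answers and then raises an error. The error is: if soup.find_all("div", class_="zm-item-answer")[j].find("span", class_="count") == None: IndexError: list index out of range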
Has anyone found a solution?
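I think the key is to find the HTML data that is loaded after clicking the 'More' button.
But the data that appears after clicking 'More' does not seem to exist in the page source at all.
Is it loaded via AJAX? Hoping an expert can fix this bug~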
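If the remaining answers are indeed loaded page by page through AJAX, collecting them would amount to a pagination loop. The following is only a sketch under that assumption: `collect_all`, `fetch_page`, and `fake_fetch_page` are hypothetical names, and the real request (URL, parameters, response format) is not known from this thread, so a fake fetcher stands in for it.

```python
def collect_all(fetch_page, total, page_size=10):
    """Call fetch_page(offset, page_size) until `total` items are collected
    or a page comes back empty."""
    items = []
    offset = 0
    while offset < total:
        page = fetch_page(offset, page_size)
        if not page:
            # The server has nothing more to return; stop instead of looping forever.
            break
        items.extend(page)
        offset += len(page)
    return items[:total]


# Hypothetical stand-in for the network call behind the 'More' button.
_FAKE_ANSWERS = ["answer-%d" % i for i in range(25)]


def fake_fetch_page(offset, limit):
    return _FAKE_ANSWERS[offset:offset + limit]


all_answers = collect_all(fake_fetch_page, total=25)
print(len(all_answers))
```

In real use, `fake_fetch_page` would be replaced by a function that performs the actual request and parses each returned chunk into answers.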
@bugmakesprogress Take a look at https://github.com/Tassandar/zhihu-python ; he has already fixed this.
Same here, I can only get the first 10. Is there a fix?
Try this solution: https://github.com/egrcc/zhihu-python/pull/73/commits/25b1ba30f75be4eba656bd5b080cf4ea22cda561