zhihu-python
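Fetch Zhihu content, including questions, answers, users, and collections
Does it still work?
CAPTCHA problem
It seems this can't get past the CAPTCHA step?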
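Use @decorators to simplify the structure of the code. This uses a decorator to avoid an if statement before every function call that checks whether parse() needs to run. It is only a demo :-) with get_post_count() as the example. If @egrcc thinks this approach can be applied more widely, feel free to adopt it~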
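A minimal sketch of the decorator idea described above, not the code from the actual pull request. The decorator name `requires_parse`, the `Post` class, its `soup` attribute, and the fake page data are all hypothetical stand-ins; only `parse()` and `get_post_count()` are names taken from the issue text.

```python
import functools


def requires_parse(method):
    """Call self.parse() before the wrapped method if nothing is parsed yet."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        if self.soup is None:  # replaces the repeated `if` in every getter
            self.parse()
        return method(self, *args, **kwargs)
    return wrapper


class Post:
    """Hypothetical stand-in for a zhihu-python class (assumption)."""

    def __init__(self, url):
        self.url = url
        self.soup = None
        self.parse_calls = 0

    def parse(self):
        # The real library would download and parse the page here;
        # this sketch stores fake data so that it runs offline.
        self.parse_calls += 1
        self.soup = {"post_count": 3}

    @requires_parse
    def get_post_count(self):
        return self.soup["post_count"]


post = Post("http://www.zhihu.com/")
first = post.get_post_count()   # triggers parse()
second = post.get_post_count()  # reuses the parsed result
```

With this shape, each getter only declares that it needs parsed data, and the check lives in one place.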
`title = soup.find("h2", class_="zm-item-title").string.encode("utf-8").replace("\n", "")` AttributeError: 'NoneType' object has no attribute 'string'
Running on Win7 64-bit with Python 2.7.10, all Chinese characters come out garbled...
Network error
Error when logging in again after the user session expired: WARN: network failure INFO: loading configuration file ... Traceback (most recent call last): File "C:/Users/Administrator/PycharmProjects/��Ӱ/���ݽṹ���㷨/auth.py", line 241, in login() File "C:/Users/Administrator/PycharmProjects/��Ӱ/���ݽṹ���㷨/auth.py", line 213, in login form_data = build_form(account, password) File "C:/Users/Administrator/PycharmProjects/��Ӱ/���ݽṹ���㷨/auth.py", line 115,...
question.get_top_i_answers() and question.get_all_answers() can now only fetch 10 answers. I tested several questions, including the one in test, [http://www.zhihu.com/question/24269892](http://www.zhihu.com/question/24269892): `url = "http://www.zhihu.com/question/24269892" question = Question(url) print question.get_answers_num() q=question.get_top_i_answers(20) for i in q: print i` IndexError Traceback (most recent call last)...
The error report is below; Zhihu has probably changed its page format. Traceback (most recent call last): File "./crawl.py", line 19, in print "\nQuestion " + ": " + answer.get_question().get_title() + "\n" File "filepath/zhihu.py", line 1070, in get_question question_link = soup.find("h2",...
```
def search_xsrf():
    """
    :rtype: object
    """
    url = "http://www.zhihu.com/"
    r = requests.get(url, verify=False)
    if int(r.status_code) != 200:
        raise NetworkError(u"验证码请求失败")
```
This always returns 500, and the error raised is "验证码请求失败" (CAPTCHA request failed).
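One way to make a failure like this easier to diagnose is to include the actual HTTP status in the exception. This is only a sketch: the helper `ensure_ok` is hypothetical, and `NetworkError` is redefined here as a stand-in for the class referenced in the snippet.

```python
class NetworkError(Exception):
    """Stand-in for the NetworkError used in the snippet (assumption)."""


def ensure_ok(status_code, what="request"):
    """Raise NetworkError naming the real status code when it is not 200."""
    code = int(status_code)
    if code != 200:
        raise NetworkError("%s failed with HTTP status %d" % (what, code))


try:
    ensure_ok(500, what="CAPTCHA request")
except NetworkError as exc:
    message = str(exc)
```

Inside `search_xsrf()` this could be called as `ensure_ok(r.status_code, what="CAPTCHA request")`, so a 500 from the server is reported as such and not as a CAPTCHA problem.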