ZhihuSocialNetwork
A small project that crawls data from Zhihu and tries to extract a social network from its users.
Bootstrap
- Clone the repository with git clone.
- Run deploy.sh to install dependencies.
- Begin from the root topic:
  src/crawlTopicTopQuestions.py 19776749
- Extend the coverage by running the following (can be repeated multiple times):
  src/crawlQuestions.sh
  src/extendAuthorsFromTopQuestions.sh
  src/crawlUsers.sh
  src/crawlQuestions.sh
- Create indices for the MongoDB collections.
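
Since the "Extend the coverage" step can be repeated, it may be convenient to wrap it in a small driver. The following is only a sketch, not part of the project: the names `EXTEND_STEPS`, `run_script`, and `extend_coverage` are hypothetical, and it assumes that the scripts are meant to run in the order listed, that they are run with bash, and that the working directory is the repository root.

```python
import subprocess
from typing import Callable, Sequence

# Script names copied from the "Extend the coverage" step.
# Assumption: they are meant to run in exactly this order.
EXTEND_STEPS = (
    "src/crawlQuestions.sh",
    "src/extendAuthorsFromTopQuestions.sh",
    "src/crawlUsers.sh",
    "src/crawlQuestions.sh",
)


def run_script(path: str) -> None:
    """Run one script with bash (an assumption) and raise if it fails."""
    subprocess.run(["bash", path], check=True)


def extend_coverage(
    rounds: int,
    runner: Callable[[str], None] = run_script,
    steps: Sequence[str] = EXTEND_STEPS,
) -> int:
    """Run the whole sequence `rounds` times; return how many scripts ran."""
    executed = 0
    for _ in range(rounds):
        for step in steps:
            runner(step)  # stops the loop early if a script raises
            executed += 1
    return executed
```

Calling `extend_coverage(rounds=2)` from the repository root would run the four scripts twice in sequence. The `runner` parameter exists so the loop can be dry-run, for example with `runner=print`, before starting a real crawl.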