ZhihuSocialNetwork

A small project that crawls data from Zhihu and tries to extract a social network from its users.

Bootstrap

  • git clone this repository

  • Run deploy.sh to install dependencies

  • Begin from the root topic: src/crawlTopicTopQuestions.py 19776749

  • Extend the coverage by running the following scripts in order (the sequence can be repeated multiple times):

      src/crawlQuestions.sh
      src/extendAuthorsFromTopQuestions.sh
      src/crawlUsers.sh
      src/crawlQuestions.sh
    
  • Create indices for the MongoDB collections
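
Because the "extend the coverage" sequence above is meant to be repeated, a small driver can save some typing. The following is only a sketch, not part of the project: the helper names `plan_rounds` and `run_rounds` are hypothetical, and it assumes the scripts are run with `bash` from the repository root. Only the script names and their order are taken from the list above.

```python
import subprocess

# Script order copied from the "Extend the coverage" step above.
ROUND_SCRIPTS = [
    "src/crawlQuestions.sh",
    "src/extendAuthorsFromTopQuestions.sh",
    "src/crawlUsers.sh",
    "src/crawlQuestions.sh",
]


def plan_rounds(rounds):
    """Return the ordered commands for `rounds` extension rounds (hypothetical helper)."""
    if rounds < 0:
        raise ValueError("rounds must be non-negative")
    # Assumption: each script is invoked with bash; adjust if the scripts need another shell.
    return [["bash", script] for _ in range(rounds) for script in ROUND_SCRIPTS]


def run_rounds(rounds, repo_root=".", dry_run=False):
    """Print each command and, unless dry_run is set, execute it from repo_root."""
    commands = plan_rounds(rounds)
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError at the first failing script,
            # so later steps do not run against incomplete data.
            subprocess.run(command, check=True, cwd=repo_root)
    return commands
```

Calling `run_rounds(2, dry_run=True)` only prints the eight commands without executing them, which is a cheap way to check the order before starting a long crawl; `run_rounds(2)` then runs them for real.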