JiaYuan
user profile of jiayuan.com
JiaYuan Spider and Data Analysis
Introduction
- scrape user profile data from shijijiayuan (jiayuan.com) with BeautifulSoup and requests in Python 3.5
- apply machine learning algorithms in R
- visualize the data and generate reports with MS PowerPoint 2016, R ggplot2, and TAGUL
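The scraping step can be sketched roughly as follows. This is a minimal illustration, not the spider's actual code: the URL pattern in `PROFILE_URL`, the CSS selectors, and the helper names `fetch_profile_html` and `parse_profile` are all assumptions made for the example, and the real page layout of jiayuan.com may differ.

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical URL pattern; the real profile URL scheme is an assumption.
PROFILE_URL = "https://www.jiayuan.com/{uid}"


def fetch_profile_html(uid, timeout=10):
    """Download the raw HTML of one profile page (hypothetical URL)."""
    resp = requests.get(
        PROFILE_URL.format(uid=uid),
        headers={"User-Agent": "Mozilla/5.0"},
        timeout=timeout,
    )
    resp.raise_for_status()  # fail loudly on 4xx/5xx responses
    return resp.text


def parse_profile(html):
    """Extract a nickname and label/value pairs from profile HTML.

    The selectors below (h4.nickname, ul.member_info) are made up
    for illustration and must be adapted to the real markup.
    """
    soup = BeautifulSoup(html, "html.parser")
    name = soup.select_one("h4.nickname")
    fields = {}
    for li in soup.select("ul.member_info li"):
        label = li.select_one("span.label")
        value = li.select_one("span.value")
        if label is not None and value is not None:
            fields[label.get_text(strip=True)] = value.get_text(strip=True)
    return {
        "nickname": name.get_text(strip=True) if name is not None else None,
        "fields": fields,
    }


if __name__ == "__main__":
    # Offline demo on a hand-written snippet, so no network access is needed.
    sample = """
    <h4 class="nickname">demo_user</h4>
    <ul class="member_info">
      <li><span class="label">Age</span><span class="value">28</span></li>
    </ul>
    """
    print(parse_profile(sample))
```

Keeping the download and the parsing in separate functions makes it possible to test the parser on saved HTML without hitting the site.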
Prerequisites
- Python 3.x (Python 3.5 is recommended)
- Third-party libraries (requests, BeautifulSoup)
Note
- Later research requires a Linux OS (Ubuntu 16.04 or CentOS 7 will do). Running on Windows may cause problems.
Results
- Basic statistics info
- With NLP
Next Steps
Next, I want to train a computer vision model on the avatar image set so that this spider can rank faces. If you are interested in computer vision or deep learning, please open an issue.
For more details, please visit my article at Zhihu.
With pleasure!