Web-Scraping
Learn how to leverage Python's amazing tools to scrape data from other websites.
The end goal of this course is to scrape blogs to analyze trending keywords and phrases.
We'll be using Python 3.6, Requests, BeautifulSoup, asyncio, pandas, NumPy, and more!
Section 1: Your First Scraping Program
Watch here
The final code is in first-web-scraping-program.zip
Install Guides
Windows: https://kirr.co/6r8wr9
Mac: https://kirr.co/386c7f
Linux: https://kirr.co/c3uvuu
Goals of Your First Scraping Program:
- Enter any url (webpage)
- Open that webpage and scrape each word on it
- Save the words to a CSV file
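The last goal, saving scraped words to a CSV, can be sketched with the standard library alone. This is a minimal sketch, not the course's final code: it assumes the page text has already been retrieved as a plain string, and the names `count_words` and `save_counts_to_csv` as well as the `word`/`count` column layout are made up for illustration.

```python
import csv
import re
from collections import Counter


def count_words(text):
    """Lowercase the text, pull out runs of letters/apostrophes, and count them."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


def save_counts_to_csv(counts, path):
    """Write a header row, then one (word, count) row per word, most frequent first."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["word", "count"])
        writer.writerows(counts.most_common())
```

Calling `save_counts_to_csv(count_words(page_text), "words.csv")` would then produce a file that opens directly in a spreadsheet or in pandas.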
Third-Party Packages
- Python Requests: http://docs.python-requests.org/en/master/
  pip install requests
  Requests fetches (opens) the webpage for us.
- BeautifulSoup 4: https://www.crummy.com/software/BeautifulSoup/bs4/doc/
  pip install beautifulsoup4
  BeautifulSoup lets us search and extract content from an HTML webpage.
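To show how the two packages fit together, here is a short sketch: Requests downloads the page and BeautifulSoup strips the markup. The function names `fetch_html` and `extract_text`, the 10-second timeout, and the choice to discard `script` and `style` blocks are assumptions for illustration, not necessarily what the course code does.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url):
    """Download a page and return its HTML, raising an error on a bad HTTP status."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def extract_text(html):
    """Parse HTML and return its text content with tags removed."""
    soup = BeautifulSoup(html, "html.parser")
    # Remove script and style blocks so their contents are not treated as words.
    for tag in soup(["script", "style"]):
        tag.decompose()
    return soup.get_text(separator=" ", strip=True)
```

Keeping the download and the parsing in separate functions means the parsing step can be tried on any HTML string without touching the network.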
Section 2: Advancing Scraping
Goals of Advancing Scraping:
- Refine scraping code
- Scrape Links
- Add Scrape Depth
- Scrape & Parse words in a Post
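The "Scrape Links" goal might look roughly like the sketch below. It assumes that "local" means "on the same host as the page being scraped"; the function name `extract_local_links` and the decision to drop `#fragment` suffixes are illustrative choices, not taken from the course.

```python
from urllib.parse import urljoin, urlparse

from bs4 import BeautifulSoup


def extract_local_links(html, base_url):
    """Return the set of absolute links in html that stay on base_url's host."""
    soup = BeautifulSoup(html, "html.parser")
    host = urlparse(base_url).netloc
    links = set()
    # href=True skips <a> tags that have no href attribute at all.
    for anchor in soup.find_all("a", href=True):
        # Resolve relative paths such as "about/" or "/blog/" against the page URL.
        absolute = urljoin(base_url, anchor["href"])
        if urlparse(absolute).netloc == host:
            # Strip any #fragment so the same page is not listed twice.
            links.add(absolute.split("#")[0])
    return links
```

Returning a set rather than a list removes duplicate links, which matters once the scraper starts following them to a given depth.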
1 - Welcome
2 - Get URL Input
3 - Regular Expression Validation
4 - Force Quit Program
5 - Usability
6 - Fetch URL
7 - Soupify
8 - Extract Data
9 - Parse Links
10 - Get Local Paths
11 - Local Paths by Regular Expression
12 - Some Lookup Errors
13 - Scrape Local Paths
14 - Parse Words
15 - Python Set
16 - A Recursive Function
17 - Mock Fetching
18 - All together
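Several of the steps above (regular expression validation, local paths by regular expression, the Python set, a recursive function, and mock fetching) can be combined into one small sketch. Everything here is an assumption for illustration: the two patterns are deliberately simplified, and `MOCK_PAGES`, `mock_fetch`, and `scrape_paths` are hypothetical names, not the course's code.

```python
import re

# Simplified check: http(s) scheme, a dotted host name, and an optional path.
URL_PATTERN = re.compile(r"^https?://[\w.-]+\.[a-z]{2,}(/\S*)?$", re.IGNORECASE)
# href values that start with exactly one slash, i.e. local paths.
LOCAL_PATH_PATTERN = re.compile(r'href="(/(?!/)[^"]*)"')

# Hypothetical canned pages, keyed by local path, standing in for a real site.
MOCK_PAGES = {
    "/": '<a href="/a/">A</a> <a href="/b/">B</a> '
         '<a href="https://other.example.org/">External</a>',
    "/a/": '<a href="/">Home</a> <a href="/a/deep/">Deep</a>',
    "/b/": '<a href="/a/">A</a>',
    "/a/deep/": '<a href="/a/deep/er/">Deeper</a>',
}


def is_valid_url(url):
    """Return True if url looks like an http(s) URL under the simplified pattern."""
    return URL_PATTERN.match(url) is not None


def mock_fetch(path):
    """Return canned HTML for a path instead of making a network request."""
    return MOCK_PAGES.get(path, "")


def scrape_paths(path, depth, fetch=mock_fetch, seen=None):
    """Recursively follow local links up to depth hops, skipping paths already seen."""
    if seen is None:
        seen = set()
    seen.add(path)
    if depth == 0:
        return seen
    for local_path in LOCAL_PATH_PATTERN.findall(fetch(path)):
        if local_path not in seen:
            scrape_paths(local_path, depth - 1, fetch, seen)
    return seen
```

The set keeps each path from being fetched twice, and passing `fetch` as a parameter lets the same recursive function run against canned pages while testing and a real downloader later. Because a path is expanded only the first time it is reached, the order of links on a page can affect which deeper pages are visited.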
Section 3: Asyncio & Web Scraping
code coming soon