crawl-original-google-images
Python scripts for crawling original images from Google Images
Crawl Original Google Images & YouTube Videos
This repo contains code to crawl images and videos:
- ORIGINAL images from Google Search
- ORIGINAL videos from YouTube
Requirements
- ChromeDriver
  - Check your current Google Chrome version.
  - Download the ChromeDriver release that corresponds to your Chrome version from the ChromeDriver download page, then unzip it.
    For example, I'm using Chrome version 95.0.4638.69 on Linux, so I downloaded chromedriver_linux64.zip.
- Environment
  conda env create -f environment.yml
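A mismatch between Chrome and ChromeDriver is a common reason a crawl fails to start, so a quick check can be helpful. The sketch below is not part of this repo: the helper names `build_prefix` and `versions_match` are hypothetical, and it assumes that two versions are compatible when their first three components (major.minor.build) agree. The driver version strings in the example are illustrative values only.

```python
def build_prefix(version: str) -> tuple:
    """Return the first three numeric components of a dotted version string."""
    parts = version.strip().split(".")
    return tuple(int(part) for part in parts[:3])


def versions_match(chrome_version: str, driver_version: str) -> bool:
    """Assumption: versions are compatible when major.minor.build are equal."""
    return build_prefix(chrome_version) == build_prefix(driver_version)


if __name__ == "__main__":
    # Chrome version from the example above; driver versions are illustrative.
    print(versions_match("95.0.4638.69", "95.0.4638.54"))
    print(versions_match("95.0.4638.69", "96.0.4664.45"))
```

You can read your Chrome version from the browser's "About" page and the driver version from the name or release notes of the archive you downloaded.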
Crawl Images from Google Image Search
Download original images (not thumbnails) from Google Image Search with multithreading :D
- Get URLs by keywords
python crawl_url.py
- Download images from URLs
python crawl_data.py
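To illustrate the multithreaded download step, here is a minimal sketch of fetching a list of image URLs in parallel with a thread pool. It is not the code in `crawl_data.py`: the function names `fetch_bytes` and `download_all`, the `max_workers` default, and the numbered `.jpg` file naming are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from urllib.request import urlopen


def fetch_bytes(url: str, timeout: float = 10.0) -> bytes:
    """Download the raw bytes behind a URL (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def download_all(urls, save_dir, fetch=fetch_bytes, max_workers=8):
    """Download every URL into save_dir using a pool of worker threads.

    Returns the paths that were written; URLs that fail are skipped.
    """
    save_dir = Path(save_dir)
    save_dir.mkdir(parents=True, exist_ok=True)

    def download_one(item):
        index, url = item
        # Assumption: files are named by position and saved with a .jpg suffix.
        target = save_dir / f"{index:05d}.jpg"
        try:
            target.write_bytes(fetch(url))
            return target
        except (OSError, ValueError):
            # Network errors and malformed URLs are skipped, not fatal.
            return None

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with urls.
        results = list(pool.map(download_one, enumerate(urls)))
    return [path for path in results if path is not None]
```

Threads suit this task because each download spends most of its time waiting on the network. The `fetch` parameter is there so the function can be exercised without network access, for example by passing a stub that returns fixed bytes.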
Crawl Videos from YouTube
- Get URLs by keywords
python crawl_youtube_link.py
- Download videos from URLs
python crawl_videos.py
python crawl_videos.py --metadata --thumbnail # thumbnail and metadata only
To-do
- [x] Init
- [x] Multithreading
- [x] Requirements
- [x] Write Guideline
- [ ] Add an argument parser for save_dirs, chromedriver, etc.