icrawler
A multi-threaded crawler framework with many built-in image crawlers.
For me, the GoogleImageCrawler in icrawler doesn't work anymore. I updated the user agent in `crawler.py`, since that seemed to fix it in the past, but no luck here. I tried...
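For anyone wanting to try a custom user agent without patching the installed `crawler.py`: the crawler exposes a `set_session(headers=...)` method, so the headers can be swapped at runtime. A minimal sketch (the UA string is just an example):

```python
from icrawler.builtin import GoogleImageCrawler

crawler = GoogleImageCrawler(storage={'root_dir': 'images'})

# Replace the session headers instead of editing the installed crawler.py.
# The User-Agent string below is only an example value.
crawler.set_session(headers={
    'User-Agent': ('Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
                   'AppleWebKit/537.36 (KHTML, like Gecko) '
                   'Chrome/114.0.0.0 Safari/537.36')
})
crawler.crawl(keyword='cat', max_num=10)
```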
I'm currently trying to use icrawler to download images by working through a list of keywords. The issue I'm having is that since I only want one picture...
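For the one-image-per-keyword use case, a simple loop works: give each keyword its own `root_dir` and set `max_num=1`. A sketch using `GoogleImageCrawler` (the keyword list and paths are placeholders):

```python
from icrawler.builtin import GoogleImageCrawler

keywords = ['cat', 'dog', 'bird']  # example list

for kw in keywords:
    # One crawler per keyword so each image lands in its own folder;
    # max_num=1 stops after the first successful download.
    crawler = GoogleImageCrawler(storage={'root_dir': f'images/{kw}'})
    crawler.crawl(keyword=kw, max_num=1)
```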
```
Traceback (most recent call last):
  File "/home/minami/anaconda3/envs/python_function/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "anaconda3/envs/python_function/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "anaconda3/envs/python_function/lib/python3.7/site-packages/icrawler/parser.py", line 104, in worker_exec
    for task in...
```
I was trying to get some images for a class project and didn't want to manually download 100 images; your software came to the rescue. Excellent work, nice coding.
Hello! By default the crawler returns results from the Chinese version of Bing. How can I get results from the international version instead? Thanks a lot! Also, some of the images are PNGs; how can I save PNG images with their transparent backgrounds? Thanks!

```python
from icrawler.builtin import BingImageCrawler

bing_crawler = BingImageCrawler(downloader_threads=4,
                                storage={'root_dir': 'your_image_dir'})
bing_crawler.crawl(keyword='clip+art+technology', filters=None,
                   offset=0, max_num=1000)
```
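One workaround sometimes suggested for forcing international Bing results is sending the `ENSEARCH` cookie that cn.bing.com uses to switch versions; whether Bing still honors it is an assumption, not something the library guarantees. A sketch:

```python
from icrawler.builtin import BingImageCrawler

bing_crawler = BingImageCrawler(downloader_threads=4,
                                storage={'root_dir': 'your_image_dir'})

# Assumption: Bing serves the international index when the
# ENSEARCH=BENVER=1 cookie is present; this may change on Bing's side.
bing_crawler.session.cookies.set('ENSEARCH', 'BENVER=1', domain='.bing.com')

bing_crawler.crawl(keyword='clip+art+technology', filters=None,
                   offset=0, max_num=1000)
```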
Searching images by date doesn't work: the `cd_min` and `cd_max` parameters have no effect. Please fix this.
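For reference, the documented way to request a date range is the `filters` argument of `crawl()`, which icrawler translates into Google's `cd_min`/`cd_max` URL parameters; the report above is that Google now appears to ignore them. The documented usage looks like:

```python
from icrawler.builtin import GoogleImageCrawler

google_crawler = GoogleImageCrawler(storage={'root_dir': 'images'})

# Date filter given as (start, end) tuples of (year, month, day);
# icrawler turns this into the cd_min/cd_max URL parameters.
google_crawler.crawl(
    keyword='cat',
    filters={'date': ((2017, 1, 1), (2017, 11, 30))},
    max_num=50)
```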
Hi, when I use the search URLs generated by the `feed()` function of `GoogleFeeder`, I can only get around 100 images even though `max_num=1000`. I found that all the URLs get...
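A common workaround, assuming each distinct query URL is capped at roughly 100 results: split one crawl into several date-filtered crawls and let `file_idx_offset='auto'` (a documented `crawl()` parameter) continue the file numbering. A sketch:

```python
from icrawler.builtin import GoogleImageCrawler

google_crawler = GoogleImageCrawler(storage={'root_dir': 'images'})

# Sketch: one crawl per year; each query stays under the ~100-result
# cap, and file_idx_offset='auto' continues the file numbering so
# later crawls don't overwrite earlier downloads.
for i, year in enumerate(range(2015, 2020)):
    google_crawler.crawl(
        keyword='cat',
        filters={'date': ((year, 1, 1), (year, 12, 31))},
        max_num=100,
        file_idx_offset=0 if i == 0 else 'auto')
```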
Hi there! When I use `GoogleImageCrawler`, I sometimes get `png` and sometimes `jpg` files, depending on what Google finds. Is there a way to configure the crawler to only download...
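There is no built-in file-type option, but the documented extension point is passing a custom downloader class via `downloader_cls`. A sketch that rejects anything Pillow doesn't identify as JPEG, assuming an icrawler version whose `ImageDownloader.keep_file` receives the HTTP response:

```python
from io import BytesIO

from PIL import Image
from icrawler import ImageDownloader
from icrawler.builtin import GoogleImageCrawler


class JpegOnlyDownloader(ImageDownloader):

    def keep_file(self, task, response, min_size=None, max_size=None):
        # Run the stock checks first, then drop non-JPEG payloads.
        if not super().keep_file(task, response, min_size, max_size):
            return False
        try:
            img = Image.open(BytesIO(response.content))
        except OSError:
            return False
        return img.format == 'JPEG'


crawler = GoogleImageCrawler(downloader_cls=JpegOnlyDownloader,
                             storage={'root_dir': 'images'})
crawler.crawl(keyword='cat', max_num=20)
```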
I am trying to create an ML dataset. I set `max_num` to 200 and `key='face'`, but it only downloads 109 images. I go to the result page where images were being...
Hey! I am trying to download 3000 images per keyword using `BingImageCrawler`, but I get cut off below 1000 images per keyword. The documentation says `To crawl...
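The date-range trick from the documentation applies to `GoogleImageCrawler`, whose `date` filter takes explicit (start, end) tuples; Bing's `date` filter appears to only accept coarse periods (e.g. 'pastyear'). One hedged workaround is to chain crawls with different filter values so the result sets only partly overlap (the color values below are assumptions; check them against your installed version):

```python
from icrawler.builtin import BingImageCrawler

bing_crawler = BingImageCrawler(downloader_threads=4,
                                storage={'root_dir': 'images/cat'})

# Sketch: each Bing query seems to cap out below ~1000 results, so vary
# a filter to obtain partly disjoint result sets; file_idx_offset='auto'
# continues numbering so later crawls don't overwrite earlier files.
# Assumption: these color values are valid for your icrawler version.
for i, color in enumerate(['red', 'blue', 'green', 'yellow']):
    bing_crawler.crawl(keyword='cat',
                       filters={'color': color},
                       max_num=1000,
                       file_idx_offset=0 if i == 0 else 'auto')
```

Duplicates across the chained crawls are possible, so a de-duplication pass afterwards is advisable.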