imagedownloader
Doesn't work
The Python example in the README doesn't work. Detailed debug output:
==============================================================================
Image downloader called with the following arguments :
{'debug': True,
'headers': {'Accept': '*/*',
'Accept-Encoding': 'gzip, deflate',
'Connection': 'keep-alive',
'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:55.0) '
'Gecko/20100101 Firefox/55.0'},
'logfile': None,
'max_wait': 0.0,
'min_wait': 0.0,
'n_workers': 50,
'notebook': False,
'proxies': None,
'store_path': PosixPath('/Users/jeffhu/.datasets/images'),
'timeout': 5.0,
'user_agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:55.0) '
'Gecko/20100101 Firefox/55.0'}
==============================================================================
{"asctime": "2020-04-12 12:29:44,144", "name": "imgdl.downloader", "levelname": "ERROR", "message": "Failed", "success": false, "url": "https://upload.wikimedia.org/wikipedia/commons/8/8b/Moh_%284%29.jpg", "Exception": {"type": "<class 'TypeError'>", "msg": "object of type 'NoneType' has no len()"}}
{"asctime": "2020-04-12 12:29:44,144", "name": "imgdl.downloader", "levelname": "ERROR", "message": "Failed", "success": false, "url": "https://upload.wikimedia.org/wikipedia/commons/9/92/Moh_%283%29.jpg", "Exception": {"type": "<class 'TypeError'>", "msg": "object of type 'NoneType' has no len()"}}
{"asctime": "2020-04-12 12:29:44,145", "name": "imgdl.downloader", "levelname": "ERROR", "message": "Failed", "success": false, "url": "https://upload.wikimedia.org/wikipedia/commons/c/cd/Rostige_T%C3%BCr_P4RM1492.jpg", "Exception": {"type": "<class 'TypeError'>", "msg": "object of type 'NoneType' has no len()"}}
100%|████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 18183.40it/s]
{"asctime": "2020-04-12 12:29:44,147", "name": "imgdl.downloader", "levelname": "WARNING", "message": "3 images failed to download"}
I installed the package using "pip install imgdl" and it didn't work. I then installed a fork of this repo [1] with "python setup.py install"; it asked me to install bs4 and to place a copy of the chromedriver executable in the folder, and then it installed. Rerunning the README test worked. I don't know whether it was the forked version or the different installation method that made the difference, but it ended up working. I suspect it is the latter.
[1] https://github.com/shoarora/imagedownloader
I installed it and I can confirm it is not working!
It works fine in a Colab notebook here.
For a recap of what happened:
- I have discovered this tool thanks to the following fork: https://github.com/alcinos/imagedownloader
- Because of this issue, I thought the original tool would not work
- So, I have tried the fork (using "pip install git+https://github.com/alcinos/imagedownloader.git") and it worked fine.
- Out of curiosity, I have tried the original repository (using "pip install imgdl") and it works too.
- Finally, here I am telling people who stumble on this issue to try the original tool nonetheless.
For reference, I am talking about version 1.1.0 released in 2018: https://pypi.org/project/imgdl/#history
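Since the reports above differ on whether the PyPI release or a fork was actually installed, it may help to check which version is present in the current environment before rerunning the README example. A minimal sketch using only the standard library (Python 3.8+); the helper name installed_version is made up for illustration and is not part of imgdl:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is not installed.

    Hypothetical helper for this thread; it is not part of imgdl.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the installed imgdl version, or None when imgdl is absent.
    print("imgdl:", installed_version("imgdl"))
```

Note that a fork installed from source may report the same version string as the PyPI release, so this only narrows things down.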
Let's try with imgdl version 1.0.0
pip install imgdl==1.1.0
You may be interested in trying https://github.com/rom1504/img2dataset, which can download 100M pictures in 20h.
I have to admit that your tool is much more efficient than imgdl.