google-images-download

hangs on 'Evaluating' when using selenium/chromedriver to download >100 images

Open danmcquillan opened this issue 6 years ago • 8 comments

the script works fine when downloading fewer than 100 images. however, when i try to download more than 100, it hangs on 'Evaluating'. i have installed the latest version of google-images-download and the most recent chromedriver. i am running ubuntu 16.04. when i interrupt with ^C, here is the command and its output:

googleimagesdownload -k happy -o happy -l 200 -cd "/usr/bin/chromedriver"

    Item no.: 1 --> Item name = happy
    Evaluating...
    ^CTraceback (most recent call last):
      File "/home/dan/Dropbox/tech/fastai/dan_fastai/venv-fastai/lib/python3.5/site-packages/urllib3/connectionpool.py", line 377, in _make_request
        httplib_response = conn.getresponse(buffering=True)
    TypeError: getresponse() got an unexpected keyword argument 'buffering'

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/home/dan/Dropbox/tech/fastai/dan_fastai/venv-fastai/bin/googleimagesdownload", line 11, in <module>
        sys.exit(main())
      File "/home/dan/Dropbox/tech/fastai/dan_fastai/venv-fastai/lib/python3.5/site-packages/google_images_download/google_images_download.py", line 904, in main
        paths = response.download(arguments) #wrapping response in a variable just for consistency
      File "/home/dan/Dropbox/tech/fastai/dan_fastai/venv-fastai/lib/python3.5/site-packages/google_images_download/google_images_download.py", line 853, in download
        raw_html = self.download_extended_page(url,arguments['chromedriver'])
      File "/home/dan/Dropbox/tech/fastai/dan_fastai/venv-fastai/lib/python3.5/site-packages/google_images_download/google_images_download.py", line 169, in download_extended_page
        browser = webdriver.Chrome(chromedriver, chrome_options=options)
      File "/home/dan/Dropbox/tech/fastai/dan_fastai/venv-fastai/lib/python3.5/site-packages/selenium/webdriver/chrome/webdriver.py", line 81, in __init__
        desired_capabilities=desired_capabilities)
      File "/home/dan/Dropbox/tech/fastai/dan_fastai/venv-fastai/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 157, in __init__
        self.start_session(capabilities, browser_profile)
      File "/home/dan/Dropbox/tech/fastai/dan_fastai/venv-fastai/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 252, in start_session
        response = self.execute(Command.NEW_SESSION, parameters)
      File "/home/dan/Dropbox/tech/fastai/dan_fastai/venv-fastai/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 319, in execute
        response = self.command_executor.execute(driver_command, params)
      File "/home/dan/Dropbox/tech/fastai/dan_fastai/venv-fastai/lib/python3.5/site-packages/selenium/webdriver/remote/remote_connection.py", line 374, in execute
        return self._request(command_info[0], url, body=data)
      File "/home/dan/Dropbox/tech/fastai/dan_fastai/venv-fastai/lib/python3.5/site-packages/selenium/webdriver/remote/remote_connection.py", line 397, in _request
        resp = self._conn.request(method, url, body=body, headers=headers)
      File "/home/dan/Dropbox/tech/fastai/dan_fastai/venv-fastai/lib/python3.5/site-packages/urllib3/request.py", line 72, in request
        **urlopen_kw)
      File "/home/dan/Dropbox/tech/fastai/dan_fastai/venv-fastai/lib/python3.5/site-packages/urllib3/request.py", line 150, in request_encode_body
        return self.urlopen(method, url, **extra_kw)
      File "/home/dan/Dropbox/tech/fastai/dan_fastai/venv-fastai/lib/python3.5/site-packages/urllib3/poolmanager.py", line 323, in urlopen
        response = conn.urlopen(method, u.request_uri, **kw)
      File "/home/dan/Dropbox/tech/fastai/dan_fastai/venv-fastai/lib/python3.5/site-packages/urllib3/connectionpool.py", line 600, in urlopen
        chunked=chunked)
      File "/home/dan/Dropbox/tech/fastai/dan_fastai/venv-fastai/lib/python3.5/site-packages/urllib3/connectionpool.py", line 380, in _make_request
        httplib_response = conn.getresponse()
      File "/usr/lib/python3.5/http/client.py", line 1197, in getresponse
        response.begin()
      File "/usr/lib/python3.5/http/client.py", line 297, in begin
        version, status, reason = self._read_status()
      File "/usr/lib/python3.5/http/client.py", line 258, in _read_status
        line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
      File "/usr/lib/python3.5/socket.py", line 575, in readinto
        return self._sock.recv_into(b)
    KeyboardInterrupt

danmcquillan avatar Nov 14 '18 08:11 danmcquillan

apologies. the problem here was using the wrong version of chromedriver for my version of chrome. i have downgraded chromedriver and everything now works as expected.

danmcquillan avatar Nov 21 '18 17:11 danmcquillan

hi, could you help me with a similar problem? i have downloaded the correct version of chromedriver, but the script cannot download anything when the "-l" parameter is more than 100.

`googleimagesdownload --keywords "star" -l 101 --chromedriver "/home/yyy/Downloads/chromedriver"

Item no.: 1 --> Item name = star
Evaluating...
Getting you a lot of images. This may take a few moments...
Reached end of Page.
Starting Download...

Unfortunately all 101 could not be downloaded because some images were not downloadable. 0 is all we got for this search filter!

Errors: 0

Everything downloaded!
Total time taken: 146.7971887588501 Seconds`

AddASecond avatar Dec 12 '18 03:12 AddASecond

> apologies. the problem here was using the wrong version of chromedriver for my version of chrome. i have downgraded chromedriver and everything now works as expected.

@danmcquillan Hello, can you tell me which chromedriver i should use? my chrome version is 73.0.3683.75 (Official Build) (64-bit), and my chromedriver is 73.0.3683.68 (i could not find a .75 build), but i still hit the hang. thank you!

stoneeve415 avatar Apr 22 '19 16:04 stoneeve415
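On the version question above: for ChromeDriver releases that share Chrome's four-part numbering, the ChromeDriver documentation describes a driver as supporting Chrome builds whose major, minor, and build numbers match, while the last (patch) number may differ. Under that rule 73.0.3683.68 and 73.0.3683.75 would count as a match, so a remaining hang may have another cause. The sketch below is only an illustration of that comparison; the helper names are hypothetical, the binary names `google-chrome` and `chromedriver` are assumptions about what is on your PATH, and the older 2.x ChromeDriver series used a different numbering that this check does not handle.

```python
import re
import subprocess


def parse_version(text):
    """Extract the first four-part version number from a --version string."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no four-part version number found in %r" % text)
    return tuple(int(part) for part in match.groups())


def builds_match(chrome_text, driver_text):
    """True when major, minor and build agree; the patch number is ignored."""
    return parse_version(chrome_text)[:3] == parse_version(driver_text)[:3]


def read_version(command):
    """Run a command such as ["chromedriver", "--version"] and return its output.

    The command names are assumptions; adjust them to your installation.
    """
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout


def check_installed(chrome_cmd=("google-chrome", "--version"),
                    driver_cmd=("chromedriver", "--version")):
    """Compare the locally installed Chrome and chromedriver builds."""
    return builds_match(read_version(list(chrome_cmd)),
                        read_version(list(driver_cmd)))
```

Calling `check_installed()` on a machine with both binaries on the PATH returns `True` or `False`; `builds_match` can also be used directly on version strings copied from Chrome's About page.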

@RobertAuditore i've met same problem when i try to download more than 100 pics, have you solved it now?

zhra46 avatar May 28 '19 04:05 zhra46

> @RobertAuditore i've met same problem when i try to download more than 100 pics, have you solved it now?

No, I haven't solve it yet. I choose another repo

AddASecond avatar May 28 '19 10:05 AddASecond

> > @RobertAuditore i've met same problem when i try to download more than 100 pics, have you solved it now?
>
> No, I haven't solve it yet. I choose another repo

Which repo? could you tell me? Please.

ActonMartin avatar Aug 15 '19 07:08 ActonMartin

> > > @RobertAuditore i've met same problem when i try to download more than 100 pics, have you solved it now?
> >
> > No, I haven't solve it yet. I choose another repo
>
> Which repo? could you tell me? Please.

I forgot the specific repo, but it's in one of my collections here, in the scrapy/crawl part :+1: https://github.com/AddASecond/AnnoRepo

AddASecond avatar Sep 22 '19 03:09 AddASecond

Thanks, got google_images_download working on my MacBook Pro running Mojave (see instructions below):

I was getting the following message when trying to download more than 100 images: Looks like we cannot locate the path the 'chromedriver' (use the '--chromedriver' argument to specify the path to the executable.) or google chrome browser is not installed on your machine (exception: expected str, bytes or os.PathLike object, not NoneType)

  1. I downloaded chromedriver from https://chromedriver.chromium.org/downloads and chose Version 79 to match my installed version of Chrome. (See menu: Chrome->About Google Chrome)

  2. I unzipped chromedriver_mac64.zip in my Downloads folder

  3. I moved chromedriver to /usr/local/bin/chromedriver: `sudo mv chromedriver /usr/local/bin` (You may or may not need to use sudo, depending on whether you own /usr/local.)

  4. In my Python script, I added another entry to the arguments dict passed to the download() method of google_images_download.googleimagesdownload(): `{ ... "chromedriver": "/usr/local/bin/chromedriver" }`
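Step 4 can be sketched roughly as follows. This is a minimal illustration, not the library's own code: `build_arguments` and `run_download` are hypothetical helpers, the 100-image threshold is taken from the reports in this thread, and the `googleimagesdownload()` / `download(arguments)` calls mirror the ones visible in the traceback earlier in the issue.

```python
import os

# Assumption from this thread: requests above 100 images go through
# selenium/chromedriver, so a driver path is needed only beyond this limit.
SELENIUM_THRESHOLD = 100


def build_arguments(keywords, limit, chromedriver_path=None):
    """Build the arguments dict, adding 'chromedriver' only when it is needed."""
    arguments = {"keywords": keywords, "limit": limit}
    if limit > SELENIUM_THRESHOLD:
        if not chromedriver_path:
            raise ValueError(
                "limit %d exceeds %d: a chromedriver path is required"
                % (limit, SELENIUM_THRESHOLD)
            )
        if not os.path.isfile(chromedriver_path):
            raise FileNotFoundError(
                "chromedriver not found at %s" % chromedriver_path
            )
        arguments["chromedriver"] = chromedriver_path
    return arguments


def run_download(arguments):
    """Pass the arguments to google_images_download and return its result."""
    # Imported here so build_arguments can be used without the package installed.
    from google_images_download import google_images_download

    response = google_images_download.googleimagesdownload()
    return response.download(arguments)
```

With chromedriver installed as in step 3, a call such as `run_download(build_arguments("happy", 200, "/usr/local/bin/chromedriver"))` would start the download; failing early on a missing path avoids the less obvious "cannot locate the path" message quoted above.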

Perhaps the google-images-download "Installation" page should be updated with chromedriver information. Or, perhaps a note, "If you plan to download more than 100 images, you will need to install chromedriver (see Troubleshooting page)."

ridgemcghee avatar Jan 05 '20 06:01 ridgemcghee