google-images-download
Can someone merge #298?
Hi everyone. More than a year ago I opened a PR to update google-images-download to work with the new Google Images data format, but it still hasn't been merged.
I see people running into the same issue over and over, and it would be great if they could get the patch by default instead of me having to point them to it.
Thanks.
@hardikvasa You may try to connect with the project owner by mail too.
I've tried twice now but no reply sadly.
Maybe now is the time for someone to fork this project into another repository.
At this point, there are too many patches that haven’t been merged yet. I just had to manually add a string sanitization fix for the filenames that should have been merged months ago as well. It would be reasonable to think of someone else becoming responsible for maintaining this repository from now on.
Yeah, filenames are a problem: "UnicodeEncodeError on an image...trying next one... Error: 'latin-1' codec can't encode characters in position 0-7: ordinal not in range(256)"
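For anyone hitting this, here is a minimal sketch of the kind of filename sanitization mentioned above, assuming the error is triggered by non-ASCII characters in the image name. `sanitize_filename` is a hypothetical helper written for illustration; it is not part of google-images-download or of any of the pending patches.

```python
import re
import unicodedata


def sanitize_filename(name, replacement="_", max_length=200):
    """Return an ASCII-only filename that is safe on common filesystems.

    Hypothetical helper for illustration, not the library's own code.
    """
    # Decompose accented characters so "é" becomes "e" plus a combining accent.
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop everything that cannot be represented in ASCII.
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Replace characters that are reserved on Windows or are control codes.
    cleaned = re.sub(r'[<>:"/\\|?*\x00-\x1f]', replacement, ascii_only)
    # Collapse whitespace runs and trim leading/trailing dots and spaces.
    cleaned = re.sub(r"\s+", " ", cleaned).strip(" .")
    # Fall back to a placeholder if nothing survived the cleanup.
    return cleaned[:max_length] or "image"


print(sanitize_filename("café.jpg"))
print(sanitize_filename("a/b:c?.png"))
```

Dropping non-ASCII characters is lossy, so names written entirely in a non-Latin script collapse to the placeholder; percent-encoding the name would be an alternative if the original characters matter.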
@Joeclinton1 This isn't about this specific issue, but I'm noticing that offset doesn't seem to work as intended, or maybe I'm being stupid. If I'm reading the docs correctly, limit = 2 with offset = 1 should mean that only the second image gets downloaded, but with limit = 2 both images are downloaded no matter what offset is. Any help would be great, thanks! Example (this downloads 2 images):
from google_images_download import google_images_download

def testing():
    response = google_images_download.googleimagesdownload()  # class instantiation
    # a space inside the quotes is kept as a space between the words of the search string
    arguments = {"keywords": "lawn chair", "limit": 2, "print_urls": True, "no_directory": True, "offset": 1}  # creating dict of arguments
    paths = response.download(arguments)  # passing the arguments to the function
    print(paths)

testing()
@estuhr1206 I have the exact same issue. In the source of the version by @Joeclinton1 offset does not appear to be used.
It looks like it can no longer click 'SHOW MORE' once it reaches the end of the page, so the output is only:
"340 is all we got for this search filter!"
@estuhr1206 @jeroenvuurens I've never used offset so I'm not sure what it does. But if it's not in the source code of my patch then it wasn't in the original version I forked. They haven't pushed any changes for a long time so it likely wasn't a feature from the start. You're free to fork my version and add offset if you want. I'll merge your changes if you do that.
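In case someone wants to add it, here is a rough sketch of the offset behaviour as @estuhr1206 reads the docs above: skip the first `offset` results and stop at `limit`. `apply_offset_and_limit` is a hypothetical helper, and this interpretation is an assumption taken from that comment, not confirmed behaviour of the original library or of the fork.

```python
def apply_offset_and_limit(items, limit, offset=0):
    """Keep results from position `offset` up to (not including) `limit`.

    Hypothetical helper: follows the reading that limit=2, offset=1
    yields only the second result.
    """
    if limit < 0 or offset < 0:
        raise ValueError("limit and offset must be non-negative")
    return items[offset:limit]


urls = ["first.jpg", "second.jpg", "third.jpg"]
print(apply_offset_and_limit(urls, limit=2, offset=1))  # ['second.jpg']
```

If offset is instead meant to skip results without reducing the count, the slice would be `items[offset:offset + limit]`; which one is intended would need to be checked against the docs.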
With Python 3.8 it now fails with:
Getting you a lot of images. This may take a few moments...
Reached end of Page.
Traceback (most recent call last):
  File "c:\users\gen32uc\appdata\local\programs\python\python38\lib\runpy.py", line 193, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\users\gen32uc\appdata\local\programs\python\python38\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\GEN32UC\AppData\Local\Programs\Python\Python38\Scripts\googleimagesdownload.exe\__main__.py", line 7, in <module>
  File "c:\users\gen32uc\appdata\local\programs\python\python38\lib\site-packages\google_images_download\google_images_download.py", line 1140, in main
    paths, errors = response.download(arguments) # wrapping response in a variable just for consistency
  File "c:\users\gen32uc\appdata\local\programs\python\python38\lib\site-packages\google_images_download\google_images_download.py", line 958, in download
    paths, errors = self.download_executor(arguments)
  File "c:\users\gen32uc\appdata\local\programs\python\python38\lib\site-packages\google_images_download\google_images_download.py", line 1085, in download_executor
    images, tabs = self.download_extended_page(url, arguments['chromedriver'])
  File "c:\users\gen32uc\appdata\local\programs\python\python38\lib\site-packages\google_images_download\google_images_download.py", line 317, in download_extended_page
    images += self._image_objects_from_pack(self._extract_data_pack_ajax(chunk))
  File "c:\users\gen32uc\appdata\local\programs\python\python38\lib\site-packages\google_images_download\google_images_download.py", line 196, in _extract_data_pack_ajax
    return json.loads(lines[3] + lines[4])[0][2]
  File "c:\users\gen32uc\appdata\local\programs\python\python38\lib\json\__init__.py", line 357, in loads
    return _default_decoder.decode(s)
  File "c:\users\gen32uc\appdata\local\programs\python\python38\lib\json\decoder.py", line 340, in decode
    raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 1 column 104713 (char 104712)
Same here with Python 3.6.8 when trying to download more than 100 images.
It was fixed with https://github.com/Joeclinton1/google-images-download/pull/8
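For context on the error itself: `json.loads` raises "Extra data" when a valid JSON value is followed by more text. Below is a minimal sketch of one way to tolerate that, using `json.JSONDecoder.raw_decode` to parse only the leading value. `parse_leading_json` and the sample payload are made up for illustration; this is not necessarily what the linked pull request does.

```python
import json


def parse_leading_json(text):
    """Parse the first JSON value in `text` and return (value, remainder).

    Hypothetical helper for illustration only.
    """
    decoder = json.JSONDecoder()
    # raw_decode does not skip leading whitespace, so strip it first.
    stripped = text.lstrip()
    value, end = decoder.raw_decode(stripped)
    return value, stripped[end:]


# Made-up payload: one JSON array immediately followed by another.
payload = '[["a", "b", [1, 2, 3]]][["trailing"]]'
value, rest = parse_leading_json(payload)
print(value[0][2])  # [1, 2, 3]
print(rest)         # [["trailing"]]
```

Calling `json.loads(payload)` on the same string raises `JSONDecodeError: Extra data`, which matches the failure mode in the traceback above.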
@Joeclinton1 since yours is the most active fork would you consider allowing issues to be opened against it?
I feel like it's safe to say that @hardikvasa has abandoned the project; there hasn't been any GitHub activity from them in over a year.
@hellocatfood I added issues to it, hope this helps!
I have been using code derived from google-images-download for a few years now. Google changes their backend frequently, so whatever you do to make it work is unlikely to still work six months later. It would be nice if there were a community that kept this library updated, but if there isn't and you cannot rely on it, you may be better off writing your own code.