google-images-download

related_images offset duplicates

Open AmineHosni opened this issue 4 years ago • 2 comments

I'm using the command line. I tried to download 10000 images along with their related_images. On the first run I got only 375 images in the main folder and 759 related ones. To pick up where it stopped, I reran the command with 375 as the offset:

main folder's new downloads: 365 -> 797
related_images folder's: 365 -> 779 (not a typo)

I ended up with duplicates in the related_images folder, and it messed up the numbering.

It would be nice to have an offset argument tied to the optional related_images argument, to avoid duplicates.

AmineHosni, Jul 31 '19 23:07

Hi @AmineHosni, there is already an "offset" option like the one you describe. Just add "-of 650" to your command; it starts the download from the 650th picture (or whatever number you set).

ActonMartin, Aug 16 '19 09:08

Hi @ActonMartin, I'm sorry if I wasn't clear. I know about the offset argument; that's actually what I used to resume my downloads. The problem is that when you download related images, the offset argument only keeps track of the main images, not the related ones, so you end up with duplicates among the related ones. My suggestion was: how about an additional offset that also keeps track of the related images when they're used?

AmineHosni, Aug 20 '19 23:08
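
Until something like a separate related_images offset exists, one possible workaround is to clean up the duplicates after the fact. The sketch below is not part of google-images-download; the function names `find_duplicate_files` and `remove_duplicates` are made up for illustration. It assumes the duplicated related images are byte-identical files saved under different names, so it will miss copies that were re-encoded or resized, and it does not repair the numbering.

```python
import hashlib
from pathlib import Path


def find_duplicate_files(folder):
    """Return paths under `folder` whose contents match an earlier file.

    Files are visited in sorted path order, so the first copy is kept
    and every later byte-identical copy is reported as a duplicate.
    """
    seen = {}        # SHA-256 digest -> first path seen with that content
    duplicates = []  # later paths with the same content
    for path in sorted(Path(folder).rglob("*")):
        if not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in seen:
            duplicates.append(path)
        else:
            seen[digest] = path
    return duplicates


def remove_duplicates(folder, dry_run=True):
    """Delete duplicate files; with dry_run=True only report them."""
    duplicates = find_duplicate_files(folder)
    for path in duplicates:
        if not dry_run:
            path.unlink()
    return duplicates
```

Calling `remove_duplicates("path/to/related_images_folder")` with the default `dry_run=True` only lists what would be deleted; pass `dry_run=False` once the list looks right. The folder path here is a placeholder.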