RedditImageGrab

bugs

Open kanink007 opened this issue 8 years ago • 10 comments

hello there. first of all: great script! i love using it. there are a few bugs, though. first, it only downloads 1000 images (someone already posted that issue). but that's okay, because you can add --num 20000 (just an example) and it won't stop at 1000. another bug: i can't download subreddits with sort type new, hot, rising or gilded. top and controversial are working.

i get this error if i try to download new, hot, rising or gilded (using the windows 7 command line):

A:\Python27>python redditdl.py example A:\Python27\here --sort-type new --num 10
Downloading images from "example" subreddit
Traceback (most recent call last):
  File "redditdl.py", line 14, in <module>
    main()
  File "A:\Python27\redditdownload\redditdownload.py", line 393, in main
    reddit_sort=ARGS.sort_type)
  File "A:\Python27\redditdownload\reddit.py", line 80, in getitems
    if is_advanced_sort:
UnboundLocalError: local variable 'is_advanced_sort' referenced before assignment

it would be nice if you could help me

greetings

kanink007, Feb 20 '16 05:02

hi, i have made a simple fix for this. can you try it and check if the bug still exists?

rachmadaniHaryono, Mar 21 '16 13:03
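
A note on the traceback: an UnboundLocalError like this usually means a local variable is assigned on only some code paths and then read on all of them. The sketch below is not the project's code; `get_items_buggy`, `get_items_fixed`, and the choice of which sort types count as "advanced" are assumptions made only to reproduce the pattern.

```python
ADVANCED_SORTS = ("top", "controversial")  # assumption, for illustration only


def get_items_buggy(sort_type):
    # The flag is assigned on one branch only, so for any other sort type
    # the name is never bound inside this function.
    if sort_type in ADVANCED_SORTS:
        is_advanced_sort = True
    if is_advanced_sort:  # raises UnboundLocalError for "new", "hot", ...
        return "advanced"
    return "simple"


def get_items_fixed(sort_type):
    # Bind the flag unconditionally before it is read.
    is_advanced_sort = sort_type in ADVANCED_SORTS
    return "advanced" if is_advanced_sort else "simple"
```

Binding the flag on every path (or giving it a default of False at the top of the function) is the usual remedy.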

hello. thanks for helping. it worked, now it is possible to download with sort type new etc. but there is a new problem now. i tried to download images. it downloaded 937, and for all the following images it said "image already downloaded". that can't be right, because there are more than 937 different images. so i deleted the downloaded files and started anew. still the same images. this time it downloaded 220 images and all the others were "image already downloaded".

kanink007, Mar 23 '16 02:03

can you provide a link and a command example?

how do you know there are more than 937 different images? does each reddit thread give only a single image (no image albums, single thread = single image)? is your download folder empty? are the 220 images from the second download different from the 937 images of the first download? how many files were skipped as "already downloaded" in the first and second download?

from your description i can only guess that it comes from either a numbering mistake on imgur albums or another new problem in the link getter for the subreddit. it can also happen when a reddit thread is reposted and the title or url file format is chosen.

rachmadaniHaryono, Mar 23 '16 09:03

it's still the same thread. let's say there is reddit/r/example with 5000 images. i want to download it with sort-type new. it downloads the first 937 and then suddenly says for the next images: this image was already downloaded. then i try again, it's still reddit/r/example. and now it downloads only the first 220 images and then suddenly says: image already downloaded. and yes, my folder was empty. the number of good downloads is random. i can't tell when it starts to say "image already downloaded".

just to be sure i didn't do something wrong, i downloaded reddit/r/example with sort-type topall. result: no problems. there is no "image already downloaded" error.

here is a command example:

python redditdl.py redditexample x:\Python27\RedditImageGrab-master\downloads --sort-type new --num 5000

kanink007, Mar 23 '16 21:03

which subreddit is it?

and can you try to reproduce it once again with this branch: https://github.com/rachmadaniHaryono/RedditImageGrab/tree/add-logger-module

i created a logging module so the report can be exported to a text file.

use the '--logging-level debug --logging-file some_textfile.txt' arguments

rachmadaniHaryono, Mar 24 '16 09:03
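
For readers curious how such flags are commonly wired up, here is a minimal sketch using the standard `argparse` and `logging` modules. The flag names come from the comment above; `build_parser`, `configure_logging`, and the defaults are assumptions and may differ from what the branch actually does.

```python
import argparse
import logging


def build_parser():
    # Hypothetical parser exposing only the two logging flags.
    parser = argparse.ArgumentParser()
    parser.add_argument("--logging-level", default="info")
    parser.add_argument("--logging-file", default=None)
    return parser


def configure_logging(args):
    # Map the level name ("debug", "info", ...) to the logging constant,
    # falling back to INFO for unknown names.
    level = getattr(logging, args.logging_level.upper(), logging.INFO)
    # With filename=None the records go to stderr instead of a file.
    # force=True (Python 3.8+) replaces handlers configured earlier.
    logging.basicConfig(filename=args.logging_file, level=level, force=True)
    return level
```

Calling `configure_logging(build_parser().parse_args())` once at startup would then send debug output to the chosen file.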

i tried it with other subreddits too. it's the same error for all of them. as i said: sort type top works fine. sort type new doesn't work. i looked at the download log (thanks for the logger module. i think i wouldn't have seen it otherwise). it's not looking for new images. the downloader gets stuck in a loop, trying to download the previously downloaded images again, and then it says "image already downloaded". well of course these images are already downloaded. but it shouldn't look for the already downloaded ones, it should look for the next new images. i tried the dogpictures subreddit. after the first 25 images, it gets stuck in a loop and repeats the already downloaded images.

edit: i tried the download without any sort type argument. it works fine. i think there is something wrong with sort type new.

some_textfile.txt

kanink007, Mar 24 '16 18:03

by the way: same error for hot, rising and gilded.

now i know the real problem, but i don't know why it happens or how to solve it. here you go: when using the hot, new, rising or gilded sort-type, the script only downloads the pictures/albums from the first page. reddit only shows 25 posts per page. the first page of the dogpictures subreddit contains only 25 images within 25 posts (1 image per post, and yes, the last downloaded dog pic is exactly the picture in the 25th post). the subreddits i used before contained many albums in the first 25 posts. that's why it downloaded 900+ at once. i think that subreddit got updated, so one of the albums with many images wasn't in the first 25 posts anymore, and the download dropped to 200+ images.

in short: it's a "first 25 posts download limit" problem. it occurs only for the new, rising, hot and gilded sort-types. using top/controversial with time limits like week, month, all etc., or no sort type, works.

kanink007, Mar 24 '16 18:03
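
The symptom described above can be reproduced with a small simulation. This is a sketch, not the script's actual code: `POSTS`, `fetch_page`, and `collect` are hypothetical, and the in-memory list stands in for reddit's paginated listing, where each request passes the id of the last post seen as an `after` cursor. If that cursor is never advanced, every request returns the first 25 posts again, and everything after them is reported as already downloaded.

```python
PAGE_SIZE = 25  # posts per page, as observed in the thread

# Fake subreddit with 60 posts; the ids are made up.
POSTS = ["t3_%03d" % n for n in range(60)]


def fetch_page(after=None):
    """Return the page of post ids that follows the `after` cursor."""
    start = POSTS.index(after) + 1 if after else 0
    return POSTS[start:start + PAGE_SIZE]


def collect(limit, advance_cursor=True):
    """Process up to `limit` posts; repeats count as 'already downloaded'."""
    seen, skipped, after = [], 0, None
    while len(seen) + skipped < limit:
        page = fetch_page(after)
        if not page:
            break  # end of the listing
        for post_id in page:
            if len(seen) + skipped >= limit:
                break
            if post_id in seen:
                skipped += 1  # the "image already downloaded" case
            else:
                seen.append(post_id)
        if advance_cursor:
            after = page[-1]  # continue from the last post of this page
    return seen, skipped
```

With `advance_cursor=False`, asking for 50 posts yields 25 new ones and 25 skips, which matches the loop seen in the log; with the cursor advanced, all 50 are new.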

another fix. this time i changed how the last id value is set. before, the last id got its value from 'item', but after the for loop the variable 'item' has its value back to None. therefore i changed it to get the value from the last item in the 'items' list. this may cause other problems because afaik a list doesn't keep its item order.

i also checked whether it fails at the end of the requests, i.e. at the end of the subreddit, and it succeeds without any failure.

./redditdl.py --num 50 konosuba 

this is a cherry-pick from the fix-1k-dl branch

rachmadaniHaryono, Mar 29 '16 02:03
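
On the ordering concern: Python lists are ordered sequences, so `items[-1]` is always the last element that was appended. A minimal sketch of the difference between the two approaches follows; the function names are hypothetical, and resetting `item` to None inside the loop is only an assumed way of reproducing the behaviour described above.

```python
def last_id_from_loop_variable(items):
    item = None
    for item in items:
        # ... download the item here ...
        item = None  # assumption: the variable is cleared after each item
    return item  # None once the loop has finished


def last_id_from_list(items):
    # The list keeps insertion order, so the final element is the last
    # post received; an empty page (end of subreddit) yields None.
    return items[-1] if items else None
```

Taking the cursor from the list also handles the empty final page without a special case in the loop.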

still getting the same error: "image already downloaded". still only the first 25 posts are downloaded. and what do you mean with

./redditdl.py --num 50 konosuba

if you tested it with that command, it's the wrong one. you didn't add --sort-type new

kanink007, Mar 31 '16 05:03

./redditdl.py --num 50 konosuba

if you tested it with that command, it's the wrong one. you didn't add --sort-type new

no, it is only to check if the fix can handle the end of the subreddit.

the fix is only on branch #42, and the new commit fixes the '?' char at the end of the json url that is used to request data from reddit.

you can check using that branch or, if you want to make it faster, you can also use the fix-1k-dl branch with the following arguments: './redditdl.py --debug empty-download-file --logging-level debug --logging-file 'log.txt' --num 50 --sort-type new dogpictures'. that will only download empty files, rather than downloading from imgur.

rachmadaniHaryono, Mar 31 '16 07:03
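
A stray '?' at the end of a request URL is easy to avoid by appending the query string only when there are parameters to send. The sketch below uses the standard library's `urlencode`; `listing_url` and the URL layout are assumptions for illustration, not the code on that branch.

```python
from urllib.parse import urlencode


def listing_url(subreddit, sort_type=None, after=None):
    """Build a JSON listing URL (hypothetical helper, assumed layout)."""
    url = "https://www.reddit.com/r/%s" % subreddit
    if sort_type:
        url += "/" + sort_type
    url += ".json"
    params = {}
    if after:
        params["after"] = after  # cursor: id of the last post already seen
    # Append '?' only when at least one parameter is present.
    if params:
        url += "?" + urlencode(params)
    return url
```

For example, `listing_url("dogpictures", "new")` has no query string at all, while passing `after="t3_abc"` appends `?after=t3_abc`.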