ripme
URL history is written before actual file is written.
This issue also relates to the ripme reddit lockup issue. While it does not cause that issue, it does obscure it. I'll explain.
- You download reddit user x.
- ripme attempts to download image y.
- ripme writes the URL history entry for image y.
- ripme hangs on the image due to an IO error from a bad path name. The file is never written to disk.
- You stop the rip and re-enter the same URL to see if it will work the second time.
- The second time it has no issue. YAY (wrong).
On the second attempt, ripme sees via the URL history that the images were already downloaded, so it does not attempt to download the images that caused the hang, even though they were never actually downloaded the first time.
Other important details for testing: if you are ripping one reddit user, it hangs, and you hit stop and close the program, the URL of one of the problem images may not be saved, so you would get a freeze again when you rip that user again.
However, if you rip another user/URL (from the queue or by copy & paste) after hitting stop without closing ripme, the URL history will be written for those problem images, and the next time you rip that user, ripme will no longer attempt to download those images, which were never actually written to the hard drive.
I have many empty folders from many rips which I couldn't figure out until now. This may go back to when the 255-character path limit on Windows was fixed, or maybe the illegal-character fix. This URL history issue has probably been covering up some or many of those failures the entire time.
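The steps above boil down to a write-ordering bug: the history entry is recorded before the file reaches the disk. A minimal sketch of the fix (class and method names here are hypothetical, not ripme's actual code) records the URL only after the file write succeeds, so a failed download is retried on the next rip instead of being silently skipped:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: record a URL in history only AFTER the file
// write succeeds, so a hang/failure leaves the URL eligible for retry.
public class HistoryAfterWrite {
    private final Set<String> urlHistory = new HashSet<>();

    // returns true only if the file was actually written to disk
    public boolean download(String url, Path dest, byte[] data) {
        if (urlHistory.contains(url)) {
            return false; // already ripped, skip
        }
        try {
            Files.write(dest, data); // can fail on a bad path name
        } catch (IOException e) {
            return false; // file not on disk -> do NOT record the URL
        }
        urlHistory.add(url); // record only after a successful write
        return true;
    }

    public boolean isInHistory(String url) {
        return urlHistory.contains(url);
    }

    public static void main(String[] args) throws IOException {
        HistoryAfterWrite rip = new HistoryAfterWrite();
        Path good = Files.createTempFile("image-y", ".jpg");
        System.out.println(rip.download("http://example.com/y.jpg", good, new byte[]{1})); // true
        // a destination inside a directory that does not exist fails to write
        Path bad = Paths.get(System.getProperty("java.io.tmpdir"), "no-such-dir-48151623", "z.jpg");
        System.out.println(rip.download("http://example.com/z.jpg", bad, new byte[]{1})); // false
        System.out.println(rip.isInHistory("http://example.com/z.jpg")); // false
    }
}
```

With this ordering, the second rip attempt in the scenario above would correctly retry the problem image instead of treating it as done.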
#1736 #1536 #1017
RipMe v1.7.93
java version "1.8.0_261"
Java(TM) SE Runtime Environment (build 1.8.0_261-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.261-b12, mixed mode)
It didn't occur to me that this could be blocking the ripping, so you've got a good point.
I think one good change would be to make the URL saving an asynchronous process, and if something goes wrong with it, report it in the log view and the log file, if it can be caught.
And use a flag or some other surefire way to ensure that, if that thread is somehow stuck, RipMe won't try running multiple simultaneous attempts at writing the URL list, which would eventually cause resource exhaustion, which would be very bad. I don't know if it's possible to kill those stuck threads when RipMe is shut down, or whether that's the OS's job.
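One way to get both properties (async saving, and never more than one write attempt in flight) is a single-threaded executor: writes queue up behind one worker, so a stuck write cannot spawn parallel attempts, and marking the worker a daemon thread means it won't keep the JVM alive after shutdown. A minimal sketch, with hypothetical names (not ripme's actual API), using an in-memory list as a stand-in for the history file:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: one daemon worker thread serializes URL-history
// writes, so a stuck write can never pile up into many threads, and
// failures are reported instead of silently hanging the rip.
public class AsyncHistoryWriter {
    private final ExecutorService writer = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "url-history-writer");
        t.setDaemon(true); // don't keep the JVM alive after shutdown
        return t;
    });

    // stand-in for the on-disk history file
    private final List<String> saved = new CopyOnWriteArrayList<>();

    public Future<?> saveAsync(String url) {
        return writer.submit(() -> {
            try {
                saved.add(url); // in RipMe this would append to the history file
            } catch (Exception e) {
                // surface the failure instead of dying silently
                System.err.println("Failed to save URL history: " + e);
            }
        });
    }

    public List<String> saved() {
        return saved;
    }

    public void shutdown() throws InterruptedException {
        writer.shutdown();
        writer.awaitTermination(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        AsyncHistoryWriter w = new AsyncHistoryWriter();
        w.saveAsync("http://example.com/y.jpg");
        w.shutdown();
        System.out.println(w.saved()); // [http://example.com/y.jpg]
    }
}
```

The returned `Future` also gives callers a hook to log or time out a write that never completes.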