Better history management
As far as I understand how the downloader works, you provide it sub names, it searches each sub based on the criteria provided, and downloads the data as requested. Every time I restart the download, it starts over from the beginning, de-duplicating files and the whole nine yards. If I have not provided any start/end dates or a timeframe, the download tries to get everything from the sub before the API taps out at 1,000 requests.
I was wondering if you could create a management mechanism where a JSON file is created for each sub added, individually, in each folder. One main JSON file would point to the individual mini (subreddit-specific) JSON files. It would check the first 50 posts without me manually changing the timeframe, and if a file is already downloaded it would skip it.
A good working example of the above mechanism is AssetKid's Raider. It works like a charm. Since you are already rewriting the program, I thought I would share this. No worries if you cannot do it at the moment. Happy coding!
Raider: https://github.com/AssetKid/raider-release
I would like to make RMD's rewrite smarter about resuming previous downloads, but there are some difficulties: RMD supports many ways of filtering posts, and Reddit's API does not support combining those filters with date/time ranges. Currently, RMD handles this by simply skipping duplicate IDs, without any further downloading or processing.
However, I would like to implement automatic time-based skipping wherever possible, so I will see what can be done to support this sort of feature moving forward.
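One possible shape for that time-based skipping, sketched here as an assumption rather than a description of RMD's internals: remember the newest post timestamp seen per subreddit, and on the next run stop paging as soon as posts fall at or before that checkpoint. As the comment above notes, this only works for listings sorted newest-first; filtered sorts like "top" would still need ID-based de-duplication.

```python
# Hypothetical helper: `posts` is an iterable of dicts carrying Reddit's
# created_utc field, sorted newest-first; `last_seen_utc` is the checkpoint
# saved at the end of the previous run.

def newest_first_resume(posts, last_seen_utc):
    """Yield only posts newer than the checkpoint, stopping early."""
    for post in posts:
        if post["created_utc"] <= last_seen_utc:
            break  # everything from here on is older; stop paging
        yield post
```

The early `break` is the whole point: once the checkpoint is reached, no further pages need to be requested at all, instead of fetching everything and discarding duplicates afterwards.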