bulk-downloader-for-reddit
Downloading a user's posts in a certain subreddit
- [x] I am requesting a feature.
- [x] I am running the latest version of BDfR
- [x] I have read the Opening an issue guide
Description
When I want to filter a user's posts by subreddit, I usually download everything and then delete the folders of the subreddits I don't want, but that's more work and I'm still restricted by the 1000-post limit.
I want to be able to do something like
python3 -m bdfr download ./RedditDownloads --user [username] --submitted --subreddit [subreddit]
or
python3 -m bdfr download ./RedditDownloads --subreddit [subreddit] --user [username] --submitted
(which currently just downloads all of the subreddit's posts)
and get only that user's posts in the subreddit.
Maybe this is already possible and I'm just missing it. If so, please let me know.
The BDFR cannot get more than 1000 posts for a request. The --subreddit option just adds a subreddit source; it doesn't in any way combine with the --user option.
Now, I can write a feature to filter based on subreddit, but it won't get you any more posts, and I'll need to gauge the interest in the feature.
You may already be aware, but for the 1000 posts pulled from a subreddit, the file names are all prefixed with the username by default. So while you won't get up to 1000 posts from only that user in that subreddit, out of the last 1000 you could easily filter for files (posts) whose names contain a certain username.
One way you could approach this would be to scrape the user, and then purge the files not named for the subreddit in question.
Not a fix, but possibly a simpler workaround for clean-up?
e.g. bdfr download [folder] --user [username] --submitted --no-dupes --folder-scheme {REDDITOR} --file-scheme {SUBREDDIT}_{TITLE}_{POSTID}
It's not perfect, but might help?
This is possible with a whole bunch of things, but people don't seem to enjoy the prospect of downloading data they don't need. I'm thinking a more advanced filtering system for the BDFR would be good to add in the future.