RedditExtractor
Adding page limit and thread limit
context
When retrieving user or subreddit content, it would be helpful to restrict the number of pages/threads requested. It is often unnecessary to request every page, comment and so on. A limit would speed up the workflow by avoiding the wait for results that are not needed, especially since the API calls are rather slow due to the internal restriction of 1 API call per second.
proposed feature change
A desirable API change could be:
# only retrieve the first two pages of r/cats
find_thread_urls(subreddit = "cats", sort_by = "top", page_threshold = 2)
# only retrieve the first page of u/nationalgeographic
get_user_content("nationalgeographic", page_threshold = 1)
# limit the thread to at most 10 comments (url is a placeholder for a thread URL)
get_thread_content(url, limit = 10)
# limit to the 5 best matching subreddits
find_subreddits("cats", limit = 5)
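To illustrate why a page limit would save time, here is a minimal sketch of how a `page_threshold` argument might cap a pagination loop. It is not the package's actual implementation: `fetch_page()` and `collect_urls()` are hypothetical helpers, the URLs are made up, and the five-page data source is simulated so the example runs without network access.

```r
# Hypothetical stand-in for one API request (an assumption, not package code):
# returns a page of thread URLs and a cursor for the next page,
# or NULL as the cursor when there are no more pages.
fetch_page <- function(cursor) {
  page <- if (is.null(cursor)) 1L else cursor
  list(
    urls = sprintf("https://example.invalid/thread/%d-%d", page, 1:3),
    next_cursor = if (page < 5L) page + 1L else NULL
  )
}

# Collect URLs page by page, stopping after `page_threshold` pages.
# `delay` mimics the one-call-per-second restriction between requests.
collect_urls <- function(page_threshold = Inf, delay = 1) {
  urls <- character(0)
  cursor <- NULL
  pages_fetched <- 0L
  while (pages_fetched < page_threshold) {
    if (pages_fetched > 0L) Sys.sleep(delay)  # wait before every request after the first
    page <- fetch_page(cursor)
    urls <- c(urls, page$urls)
    pages_fetched <- pages_fetched + 1L
    cursor <- page$next_cursor
    if (is.null(cursor)) break  # no further pages available
  }
  urls
}

# Only the first two of the five simulated pages are requested.
first_two_pages <- collect_urls(page_threshold = 2)
```

In this sketch, a threshold of 2 means a single one-second wait instead of four, and the default of `Inf` keeps the current "fetch everything" behaviour.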