reddit-html-archiver
Won't archive old posts' comments
When running fetch_links.py formula1 2014-4-1 2014-4-2
it doesn't archive any of the comments, just the submissions.
For example: if you go to this Reddit thread, you will see comments, but they won't be archived. https://reddit.com/r/formula1/comments/21tvzs/ Archived version
I'm having the opposite problem, where I get all the comments but not the posts themselves. I also get a lot of nearly-blank .csv files in which the entirety of the contents is just this:
author,body,created_utc,id,link_id,parent_id,score,stickied,subreddit_id
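To get a sense of how many files are affected, something like the following might help. It is only a rough sketch: the function name header_only_csvs is made up for illustration, and it assumes the archiver's .csv files sit somewhere under a single directory that you pass in. It lists every .csv file that has a header row but no data rows.

```python
import csv
from pathlib import Path


def header_only_csvs(root):
    """Return paths of .csv files under root that have a header row but no data rows.

    Hypothetical helper, not part of reddit-html-archiver.
    """
    empty = []
    for path in sorted(Path(root).rglob("*.csv")):
        with path.open(newline="", encoding="utf-8") as f:
            rows = csv.reader(f)
            header = next(rows, None)     # first line, e.g. author,body,created_utc,...
            first_row = next(rows, None)  # None when nothing follows the header
        if header is not None and first_row is None:
            empty.append(path)
    return empty
```

Calling header_only_csvs on the directory that holds your downloaded data would return the list of header-only files, which could be useful to attach to a bug report.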
I think you guys have stumbled across some missing data in pushshift. Maybe they'd appreciate a bug report over there. The formula1 example:
- says 280 comments: https://api.pushshift.io/reddit/search/submission/?ids=21tvzs
- returns 0 comments: https://api.pushshift.io/reddit/submission/comment_ids/21tvzs
- but works fine for other posts: https://api.pushshift.io/reddit/submission/comment_ids/a1ieor
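If anyone wants to check other submissions for the same gap, the comparison above could be scripted roughly like this. It is a sketch under assumptions: it uses the two endpoints linked above, and it assumes both return JSON with a top-level "data" list and that the submission record carries a "num_comments" field. The names fetch_json and check_submission are made up for illustration.

```python
import json
from urllib.request import urlopen

BASE = "https://api.pushshift.io/reddit"


def fetch_json(url):
    """Fetch a URL and decode the response body as JSON."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def check_submission(submission_id, fetch=fetch_json):
    """Return (comments the submission record claims, comment ids actually listed).

    Assumes both responses look like {"data": [...]} (an assumption about
    the API's response shape, not something confirmed in this thread).
    """
    sub = fetch(f"{BASE}/search/submission/?ids={submission_id}")
    ids = fetch(f"{BASE}/submission/comment_ids/{submission_id}")
    records = sub.get("data", [])
    expected = records[0].get("num_comments", 0) if records else 0
    found = len(ids.get("data", []))
    return expected, found
```

A submission where the first number is large and the second is 0, as with 21tvzs, would point to missing comment data on the Pushshift side rather than a bug in the archiver.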