RedditDownloader

New source: text of comment or post

Open logistic-bot opened this issue 5 years ago • 5 comments

Is your feature request related to a problem? Please describe.

As far as I am aware, there is currently no way to save the text of comments or posts.

Describe the solution you'd like:

Add a new source to download the text of posts, or the text of comments that a user has upvoted or saved.

Additional context:

I use the "save" feature of Reddit extensively for text comments, and much less for media.

I would completely understand if you do not want to add this feature because it is outside the project's scope.

logistic-bot, Apr 18 '20 19:04

I'd love this feature as well.

ebrummer, Apr 20 '20 21:04

I'll look at this feature since it's been requested a lot, but I'd like some more details.

Ideally, what all would be stored? For instance, in a Submission it would probably be a bad idea to try and store every comment, so it'd likely be limited to storing only the submission text if there is any (this is already done by RMD, we'd just need a UI component). For comments would you just want the comment text itself saved (also already done), or would you want the parent data also stored?

shadowmoose, Apr 21 '20 02:04

I think the best way would be for it to be configurable by the user.

I have a few ideas about what options would be useful:

For submissions:

  • Download at most [xxx] of the [best|top|new|old|controversial|Q&A] comments

For comments:

  • Download at most [xxx] child comments.
  • Download at most [xxx] parent threads, limiting each thread to [xxx] comments. (sibling threads could be included in this setting, or have a separate setting)

I know that PowerBee only downloads a maximum of 1750 comments per submission, so that could be a reasonable default.
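The options listed above could be captured in a small settings object. The following is only a sketch: `TextSourceOptions` and all of its field names are hypothetical and do not exist in RMD, the `"qa"` token is just a stand-in for the Q&A sort, and the limits other than the 1750 default are placeholders.

```python
from dataclasses import dataclass

# Sort orders from the option list above ("qa" stands in for Q&A).
COMMENT_SORTS = ("best", "top", "new", "old", "controversial", "qa")


@dataclass
class TextSourceOptions:
    """Hypothetical settings for a text source; not part of RMD."""

    # Submissions: how many comments to keep, and in which order.
    max_submission_comments: int = 1750  # default suggested in this thread
    comment_sort: str = "best"
    # Comments: how much surrounding context to keep (0 means none).
    max_child_comments: int = 0
    max_parent_threads: int = 0
    max_comments_per_thread: int = 0

    def __post_init__(self):
        if self.comment_sort not in COMMENT_SORTS:
            raise ValueError(f"unknown comment sort: {self.comment_sort!r}")
        for name in (
            "max_submission_comments",
            "max_child_comments",
            "max_parent_threads",
            "max_comments_per_thread",
        ):
            if getattr(self, name) < 0:
                raise ValueError(f"{name} must not be negative")
```

With the defaults, `TextSourceOptions()` would keep up to 1750 submission comments sorted by "best" and no parent or child context for saved comments.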

logistic-bot, Apr 21 '20 18:04

I've considered this some more, and I can't find a good way to implement this without adding an extra layer of complexity to RMD.

Modifying the existing data structure of RMD to support these new options would be very difficult, since it is not optimized to store that kind (or that much) of data. This means that we'd need to graft on new code rather than modify existing logic.

Adding an extensive list of comment sources, filtering options, and child/parent config would require almost an entirely new list of "Sources", not to mention new database wrappers to store these special posts, and then we'd need a new UI page to display this data sensibly. By the time we've finished those, we're halfway to rewriting all of RMD's existing code. On top of the complexity issue, I can't think of a good way to add all these new options to the UI without heavily bloating the config.

In summary, I'd like to add this feature, but I cannot think of a sane way to handle it currently. If anyone has any ideas, I'm all ears. RMD currently does store the content of the Submissions and Comments that it locates when scanning, so I think the best we can do for now is expose that text somewhere in the UI. I'll look into that option soon, since it's not especially difficult to include - I just have to find space for it in the interface.

shadowmoose, May 07 '20 09:05

I'm fine with just downloading the single comment that was saved. Of course, they'd all go under their respective subreddits. Date + Username + Comment text would be fine for me. Basically, what you see when you click the 'permalink' button, without children.

> then we'd need a new UI page to display this data sensibly

You could save them to plain text files, skipping the need to display them in the WebUI. ...Unless you meant that a new "in progress/completed" page would be needed. Comments from the same subreddit could be saved in the same text file to avoid needless clutter.
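As a rough illustration of the plain-text idea, the sketch below groups saved comments by subreddit and writes one text file per subreddit. Everything here is an assumption for illustration: the `SavedComment` record, its fields (a date, username, and comment text), and the output layout are hypothetical and are not taken from RMD's actual data model.

```python
from collections import defaultdict
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SavedComment:
    """Hypothetical record; RMD's real storage may look different."""

    subreddit: str
    author: str
    created: str  # already formatted date string
    body: str


def write_comments_by_subreddit(comments, out_dir):
    """Write one <subreddit>.txt file per subreddit; return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Group comments so each subreddit ends up in a single file.
    grouped = defaultdict(list)
    for comment in comments:
        grouped[comment.subreddit].append(comment)

    written = []
    for subreddit, items in grouped.items():
        # Subreddit names only use letters, digits, and underscores,
        # so they are used directly as file names here.
        path = out_dir / f"{subreddit}.txt"
        with path.open("w", encoding="utf-8") as handle:
            for item in items:
                handle.write(f"{item.created} | {item.author}\n")
                handle.write(f"{item.body}\n\n")
        written.append(path)
    return written
```

Each call overwrites the per-subreddit files, which keeps the sketch simple; an incremental version would need to track which comments were already written.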

> Download at most [xxx] of the [best|top|new|old|controversial|Q&A] comments

I understand wanting to download the children of a comment, but downloading all the comments of a submission is too much. You'd be better off using a website downloader like HTTrack, wget or something else - just feed it a list of links. I believe RMD could provide that list easily.
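A link list of that kind could be produced with something like the sketch below. The function name `write_link_list` is hypothetical, and it assumes the permalinks are available as strings that may be site-relative (such as `/r/.../comments/...`), in which case it prefixes them with `https://www.reddit.com`.

```python
from pathlib import Path

REDDIT_BASE = "https://www.reddit.com"


def write_link_list(permalinks, path):
    """Write unique absolute URLs to `path`, one per line; return the count."""
    seen = set()
    urls = []
    for link in permalinks:
        # Assumption: anything not starting with "http" is site-relative.
        url = link if link.startswith("http") else REDDIT_BASE + link
        if url not in seen:
            seen.add(url)
            urls.append(url)
    Path(path).write_text("".join(u + "\n" for u in urls), encoding="utf-8")
    return len(urls)
```

The resulting file can then be passed to a downloader, for example with `wget -i links.txt`, which reads the URLs to fetch from the given file.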

Unknow0059, May 15 '20 05:05