implements basic support for #407 'gonewilder' functionality.

Open ghost opened this issue 8 years ago • 6 comments

When the -A flag is passed with a URL matching the 'reddit.com/r' pattern, it will rip the submission authors' content rather than the provided subreddit URL. Content will be saved in the same format as when ripping a reddit.com/u URL directly.
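As a rough illustration of the behaviour described above, the sketch below checks whether a URL looks like a subreddit URL and builds the user URL that would be ripped for a given author. The class name, method names, and regular expression are all hypothetical and are not taken from this PR or from ripme's code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of ripme: illustrates the -A idea only.
final class SubredditUrlSketch {
    // Assumed shape of a subreddit URL: optional subdomain, then /r/<name>.
    private static final Pattern SUBREDDIT = Pattern.compile(
            "^https?://(?:[a-z]+\\.)?reddit\\.com/r/([A-Za-z0-9_]+)(?:[/?#].*)?$");

    // Returns the subreddit name, or null if the URL is not a subreddit URL.
    static String subredditName(String url) {
        Matcher m = SUBREDDIT.matcher(url);
        return m.matches() ? m.group(1) : null;
    }

    // Builds the user URL that would be ripped for a submission author.
    static String authorUrl(String author) {
        return "https://www.reddit.com/user/" + author;
    }
}
```

With -A set, each submission author found under a matching subreddit URL would be handed to the existing user ripper via something like `authorUrl(author)`; without it, the subreddit URL is ripped as before.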

ghost, Jan 01 '17 02:01

Seems to work fairly well, but I got a bunch of error logs at the beginning about not being able to rip some URL (the URL had & in it, so maybe that was part of the problem). I was about to cancel it, but when I came back to the window it had started downloading the users, so I guess things look good.

Before merging this in, I'd like to get that error situation figured out and add the UI discoverability stuff I mentioned.

metaprime, Jan 01 '17 23:01

Is there any heuristic in place for number of pages to go back before stopping, score threshold, etc? I think if you keep going you'll just get all posters from the past month, with the lowest-scoring posters last.

metaprime, Jan 02 '17 00:01

> Is there any heuristic in place for number of pages to go back before stopping, score threshold, etc? I think if you keep going you'll just get all posters from the past month, with the lowest-scoring posters last.

It uses the same logic as if you were to rip https://reddit.com/user/foo directly. IIRC the default sorting is by New posts. This means it will process user submissions in ascending order by date. It would be trivial to add a property to override this such as download.rip_author_sort with accepted values of (new|hot|top|controversial).
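A minimal sketch of how such an override might be handled, assuming the proposed (not yet existing) `download.rip_author_sort` property and a user listing URL that takes a `sort` query parameter. The class, the method names, and the fallback to `new` are illustrative assumptions, not ripme's actual code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of ripme. Validates a value for the
// proposed download.rip_author_sort property and builds a listing URL.
final class AuthorSortSketch {
    static final List<String> ACCEPTED =
            Arrays.asList("new", "hot", "top", "controversial");
    static final String DEFAULT_SORT = "new";

    // Falls back to the default when the value is missing or not accepted.
    static String normalizeSort(String configured) {
        if (configured == null) {
            return DEFAULT_SORT;
        }
        String value = configured.trim().toLowerCase(Locale.ROOT);
        return ACCEPTED.contains(value) ? value : DEFAULT_SORT;
    }

    // Assumed URL shape for a user's submissions with an explicit sort.
    static String userListingUrl(String user, String configuredSort) {
        return "https://www.reddit.com/user/" + user
                + "/submitted.json?sort=" + normalizeSort(configuredSort);
    }
}
```

Falling back to `new` for unknown values keeps the current behaviour for anyone who never sets the property or mistypes it.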

ghost, Jan 02 '17 02:01

> Seems to work fairly well, but I got a bunch of error logs at the beginning about not being able to rip some URL (the URL had & in it, so maybe that was part of the problem). I was about to cancel it, but when I came back to the window it had started downloading the users, so I guess things look good.

Can you give me an example? I suspect this is in the ripping engine itself and not in this PR. Either way, I can fix it.

ghost, Jan 02 '17 02:01

> Can you give me an example?

I'll get back to you on this. Don't have time to try it now.

metaprime, Jan 02 '17 02:01

> It uses the same logic as if you were to rip https://reddit.com/user/foo directly.

I should have specified: I meant, how far back in the subreddit does it go looking for new usernames?

First of all, it seems like it didn't actually rip the usernames in order of first appearance in the subreddit's top-of-the-month listing, and it also didn't seem to know when to stop.

I was trying to rip reddit.com/r/AsiansGoneWild and the list of folders I got (in increasing chronological order of last modified), before I killed it:

reddit_sub_asiansgonewild
reddit_user_Charmerer
reddit_user_virtualgeisha
reddit_user_Dollywinks
reddit_user_agirlnamedfred
reddit_user_juiciebootie
reddit_user_Ammieow
reddit_user_Zann89
reddit_user_iimaginati0n
reddit_user_trandinhh
reddit_user_mikayla_xxx
reddit_user_xxxpensivetastes
reddit_user_1rrationality
reddit_user_itsmydistraction
reddit_user_Hadaka-sachiko
reddit_user_20and4hours
reddit_user_milehighcowboy
reddit_user_teacuptoy
reddit_user_thaigamergirl
reddit_user_japanese_miya225
reddit_user_secretdownunder
reddit_user_dffg13
reddit_user_seijoubrat

The first 50 posts under top monthly included only some of those names, and should have also included:

koreankarma
berrynoms
ttean
milky_teaa
cutepillow
teamavocado
asiankittilover
lilcreamycat
fdaugirl
AsianExpress87
zann89
bbypocahontas
MaidTiffany
fun-sized-asian
anonimoose_
wastelandwench
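One way to address both the ordering and the stopping concern is sketched below: collect authors in order of first appearance, stop after a fixed number of listing pages, and skip posts under a score threshold. Everything here is an assumption for illustration: the `Post` type, the `maxPages` and `minScore` parameters, and the case-insensitive de-duplication (note that `Zann89` and `zann89` both appear above) are not taken from this PR.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch, not ripme code.
final class AuthorCollectorSketch {
    static final class Post {
        final String author;
        final int score;

        Post(String author, int score) {
            this.author = author;
            this.score = score;
        }
    }

    // Returns authors in order of first appearance, reading at most
    // maxPages pages and ignoring posts scoring below minScore.
    static List<String> collectAuthors(List<List<Post>> pages, int maxPages, int minScore) {
        // LinkedHashMap preserves insertion order; the key is the lowercased
        // name so the same user is not queued twice under different casing.
        Map<String, String> seen = new LinkedHashMap<>();
        int pagesRead = 0;
        for (List<Post> page : pages) {
            if (pagesRead++ >= maxPages) {
                break;
            }
            for (Post post : page) {
                if (post.score < minScore) {
                    continue;
                }
                seen.putIfAbsent(post.author.toLowerCase(Locale.ROOT), post.author);
            }
        }
        return new ArrayList<>(seen.values());
    }
}
```

The returned list could then be ripped in order, so the folders would appear in the same sequence as the authors' first posts in the listing, and the run would end once the page limit is reached.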

metaprime, Jan 02 '17 03:01