RealityLovers scraper POV vs Voyeur

Open theRealKLH opened this issue 4 years ago • 6 comments

There is currently no distinction between POV vids and Voyeur vids of the same scene.

Example - Blowjob Anniversary w/ Claudia Macc (https://realitylovers.com/vd/160944479/Blowjob-Anniversary)

There are actually both a POV and a Voyeur video of this scene, but only a single match in xbvr.

Grabbing the links on the page shows:

https://realitylovers.com/video/download?contentId=160944479&sceneId=160944512&type=&perspective=POV&device=BROWSER&platform=DESKTOP

https://realitylovers.com/video/download?contentId=160944479&sceneId=160944552&type=&perspective=VOYEUR&device=BROWSER&platform=DESKTOP

contentId is the same for both, and it is what xbvr shows as the "scene id" (realitylovers-160944479), but there's a unique sceneId for each video.

Is it possible to update the scraper to account for both versions?
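To make the difference between the two IDs concrete, here is a minimal Go sketch that pulls contentId, sceneId and perspective out of the two download links above using the standard net/url package. The videoVersion struct and the parseDownloadLink helper are made up for illustration and are not part of xbvr; the "realitylovers-<sceneId>" ID printed at the end is only one possible naming scheme.

```go
package main

import (
	"fmt"
	"net/url"
)

// videoVersion holds the identifiers carried by one download link.
// Hypothetical type for illustration, not an xbvr model.
type videoVersion struct {
	ContentID   string
	SceneID     string
	Perspective string
}

// parseDownloadLink reads contentId, sceneId and perspective from the
// query string of a download URL.
func parseDownloadLink(raw string) (videoVersion, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return videoVersion{}, err
	}
	q := u.Query()
	v := videoVersion{
		ContentID:   q.Get("contentId"),
		SceneID:     q.Get("sceneId"),
		Perspective: q.Get("perspective"),
	}
	if v.SceneID == "" {
		return videoVersion{}, fmt.Errorf("no sceneId in %q", raw)
	}
	return v, nil
}

func main() {
	links := []string{
		"https://realitylovers.com/video/download?contentId=160944479&sceneId=160944512&type=&perspective=POV&device=BROWSER&platform=DESKTOP",
		"https://realitylovers.com/video/download?contentId=160944479&sceneId=160944552&type=&perspective=VOYEUR&device=BROWSER&platform=DESKTOP",
	}
	for _, l := range links {
		v, err := parseDownloadLink(l)
		if err != nil {
			fmt.Println("skip:", err)
			continue
		}
		// Assumed ID scheme: keying on sceneId keeps the two versions apart.
		fmt.Printf("realitylovers-%s (%s, contentId %s)\n", v.SceneID, v.Perspective, v.ContentID)
	}
}
```

Keying on sceneId instead of contentId is what would let both versions exist side by side.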

theRealKLH avatar Jul 17 '20 22:07 theRealKLH

I'm just going to add a bit of info on this issue because this is a difficult one and I didn't write this scraper. I might give it a try later if no one else does it first, but I think the Go-pros would be better at fixing this one.

Currently the scraper uses their REST API to grab most of the scene info, because that's how the videos page works. They don't have normal pagination but a "load more" button that dynamically adds scenes to the page. Even so, each scene is listed there only once, regardless of whether it has one or two versions.

To scrape both versions, it would need to check for POV/Voyeur versions on the actual scene page. This might be possible, though, as there is a bit of inline javascript with a const sceneData = { ... } that has the information OP mentions: the POV and/or VOYEUR scene IDs, which are indeed (both) different from the ID given by the REST API (apparently that's a contentId and not a sceneId).

Here's a pastebin with two examples of the sceneData variable, one with both versions and one with only a POV version: https://pastebin.com/MPRbc2MW - pretty sure it's just a bit of JSON, which I also used in the SLR scraper. Grabbing the json content of that inline js variable and then adding either one or two versions to the database is the tricky part.
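For the "grabbing the json content of that inline js variable" part, a rough Go sketch could look like the one below. It assumes the object literal after const sceneData = is valid JSON (quoted keys, no trailing commas), which I have not verified for every scene page. The extractSceneData helper is hypothetical, and the keys in the sample page are placeholders, not the real sceneData structure from the pastebin. It decodes into a generic map so that no field names have to be guessed.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// extractSceneData finds "const sceneData =" in the page source and decodes
// the object literal that follows it. Assumption: the literal is valid JSON.
func extractSceneData(page string) (map[string]interface{}, error) {
	const marker = "const sceneData ="
	i := strings.Index(page, marker)
	if i < 0 {
		return nil, errors.New("sceneData not found")
	}
	// json.Decoder reads one JSON value and stops there, so the JavaScript
	// after the closing brace is left unparsed.
	dec := json.NewDecoder(strings.NewReader(page[i+len(marker):]))
	var data map[string]interface{}
	if err := dec.Decode(&data); err != nil {
		return nil, fmt.Errorf("sceneData is not valid JSON: %w", err)
	}
	return data, nil
}

func main() {
	// Placeholder page: the keys below are invented for the example.
	page := `<script>
	const sceneData = {"placeholderA": {"id": 1}, "placeholderB": {"id": 2}};
	init(sceneData);
	</script>`

	data, err := extractSceneData(page)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for key := range data {
		fmt.Println("top-level key:", key)
	}
}
```

Once the real key names are known, the map could be swapped for a struct, and a scene page with only a POV version would simply yield one entry instead of two.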

Note it also has duration in seconds for each version which is currently not saved for this site. Also I think it might be good to manually add a POV or Voyeur tag to these scenes to be filtered on in XBVR. Plus a migration is needed to clear the RealityLovers scenes since they would all get a different sceneID.

Aerowen avatar Jul 24 '20 21:07 Aerowen

Also @theRealKLH, could you add the filenames of both versions of that scene as they are downloaded? Possible filenames are currently not stored for this site, but if it's a basic format we could perhaps derive filenames from the scene page. That might make matching the downloaded files in XBVR better.

Aerowen avatar Jul 24 '20 22:07 Aerowen

The few that I have are all renamed, but looking around I see names like "018_-_money_fucks_oculus_pov_dmain_180x180_3dh.mp4" for POV, and the voyeur videos would substitute "_voy_" for "_pov_". I also notice that "realitylovers" is more than just that one site: for instance it is also "maturereality.com" (and a tsvirtualovers.com link is at the bottom of the webpage). The filename format appears to be the same across "sites" (# - title - platform - perspective - the rest).
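If that pattern holds, telling the two versions apart by filename is a small string operation. Here is a hedged Go sketch: both helper names are made up, the sample filename is a placeholder that only follows the pattern described above, and it assumes the perspective marker is the last "_pov_" or "_voy_" in the name, so a title that happens to contain one of them is not mistaken for it.

```go
package main

import (
	"fmt"
	"strings"
)

// perspectiveFromFilename returns "POV" or "VOYEUR" depending on which
// marker appears last in the name, or "" if neither is present.
func perspectiveFromFilename(name string) string {
	lower := strings.ToLower(name)
	pov := strings.LastIndex(lower, "_pov_")
	voy := strings.LastIndex(lower, "_voy_")
	switch {
	case pov < 0 && voy < 0:
		return ""
	case pov > voy:
		return "POV"
	default:
		return "VOYEUR"
	}
}

// voyeurName derives the voyeur filename from a POV filename by swapping
// the last "_pov_" for "_voy_". Names without the marker are returned as-is.
func voyeurName(pov string) string {
	i := strings.LastIndex(pov, "_pov_")
	if i < 0 {
		return pov
	}
	return pov[:i] + "_voy_" + pov[i+len("_pov_"):]
}

func main() {
	// Placeholder filename following "# - title - platform - perspective - the rest".
	name := "001_-_example_title_oculus_pov_dmain_180x180_3dh.mp4"
	fmt.Println(perspectiveFromFilename(name))
	fmt.Println(voyeurName(name))
}
```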

Also.. I noticed after reading your response on the VRCONK/EvilEye issues, realitylovers is also on SLR, sporting about 200 additional videos that the realitylovers scraper isn't picking up. Example is "Sex after sex w/Kathy Anderson" (Maturereality.com). It's not scrapable and I can't find this scene on the realitylovers website, but it is on SLR (voyeur & POV).

Separate question.. could you scrape both the original site AND SLR without issue?

theRealKLH avatar Jul 25 '20 00:07 theRealKLH

Hmm I don't think those filenames would be really helpful even if we could generate them. I rename everything myself too because most studios don't give very good titles at all.

You're right about maturereality.com also being a part of realitylovers, but it just redirects to their main site; I think those scenes are just listed under the Mature category. And SLR has the scenes from both listed under one studio.

There are currently 504 scenes on SLR, missing the most recent scene; I think they get it delayed by a week or two. There are 507 on RealityLovers (266 pov + 241 voyeur), so it's very close and perhaps worth it to switch to SLR for scraping this site. But clearly some scenes are on SLR that can't be found on the original site... it's a bit of a mess as usual.

> Separate question.. could you scrape both the original site AND SLR without issue?

It would probably be possible to scrape from both original and SLR, but you would end up with hundreds of scenes that are dupes and shown twice (or more with pov/voyeur versions) because each scraper would be treated as a unique website. Then again if both scrapers are included you could just choose to only enable one of the scrapers, original or SLR.

On the plus side, SLR does have the scenes properly tagged and titled as POV or Voyeur already. And it would be very easy to add it to the SLR scraper, while adjusting the original site scraper to include pov/voyeur versions might take some time to get working. But ideally, if the original site is scrape-able, that's preferred I think.

Aerowen avatar Jul 26 '20 23:07 Aerowen

You say total of 507 on RealityLovers site... I just realized the 200 vid discrepancy. The scraper isn't accounting for pov/voy as separate files... DUH!

theRealKLH avatar Jul 27 '20 07:07 theRealKLH

Yes, that's the issue really. They don't even list the scenes twice themselves, they just have a pov/voyeur toggle button on the scene page. Even the gallery has both pov/voyeur sets of images together as one. So even if we do separate pov and voyeur versions, adding the correct gallery images to each version will be tricky, if it is possible at all (or reliable for all scenes).

Aerowen avatar Jul 28 '20 03:07 Aerowen