Blocked Scrapers

Open mmenanno opened this issue 4 years ago • 13 comments

MGStage

Draft PR: None yet

Sites:

https://www.mgstage.com/

Why Is it blocked?:

An "Are you above 18 years old?" prompt blocks the page. This cookie needs to exist for the scraper to work:

What might unblock this?:

Either the ability to set a specific cookie in a scraper, or the ability to click a site button within the scraper.


Title

Draft PR:

Sites:

Why Is it blocked?:

What might unblock this?:

mmenanno avatar Jun 19 '20 02:06 mmenanno

bang.com and puretaboo can be unblocked with https://github.com/stashapp/stash/pull/625

bnkai avatar Jun 20 '20 23:06 bnkai

RealityKings.yml doesn't work anymore.

It looks like DigitalPlayground, Mofos, and other sites from the same network use JavaScript to load everything.

Belleyy avatar Jul 16 '20 19:07 Belleyy

~~### ReidMyLips/Nympho~~

~~PR: Local~~

~~Sites:~~

~~- reidmylips.elxcomplete.com/updates/~~

~~Why Is it blocked?:~~ ~~The request is blocked by StackPath, part of the response you get:~~ https://i.imgur.com/I1RGSDz.png

~~What might unblock this?~~: ~~I don't know :thinking:~~

~~Edit: Added nympho.com~~

Edit: Fixed by adding the user agent. Thanks to bnkai!

Belleyy avatar Jul 31 '20 14:07 Belleyy

https://github.com/stashapp/CommunityScrapers/pull/117 adds Bang.com and PureTaboo scrapers since the CDP code was merged.

bnkai avatar Aug 04 '20 21:08 bnkai

Vixen Media Group seems to work with CDP; updated #1 with a working version.

bnkai avatar Aug 05 '20 23:08 bnkai

@halorrr The Vixen Media Group, Bang, and PureTaboo scrapers using CDP were merged, so they can be removed/crossed off.

bnkai avatar Aug 07 '20 22:08 bnkai

MGStage

Sites:

  • https://www.mgstage.com/

Why Is it blocked?: An "Are you above 18 years old?" prompt blocks the page.

Belleyy avatar Aug 25 '20 16:08 Belleyy

@bnkai Did you ever end up revisiting the ability for the CDP scraper to inject a specific cookie? Looking at @Belleyy's blocked scraper, it would be unblocked by creating this cookie: [image]

The cookie isn't generated until the agree button is actually pressed so this isn't one that just clicking the scrape button twice will fix.

mmenanno avatar Oct 09 '20 01:10 mmenanno

@halorrr Yeah, I came to the same conclusion. It's on my todo list, but I want to finish the click PR first. I think we might not need the cookie setting, since clicking on //a[@id="AC"] when given a URL like https://www.mgstage.com/product/product_detail/326KJN-004/ seems to work (with my PR build and a CDP test-only scraper).

bnkai avatar Oct 09 '20 12:10 bnkai
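
(Illustrative sketch, not part of the original comment.) The click approach described above might look roughly like this in a scraper file. The `driver` keys (`useCDP`, `clicks`, `xpath`, `sleep`) follow the Stash scraper configuration as documented for later releases and may differ from the PR build mentioned here. Only the URL pattern and the `//a[@id="AC"]` XPath come from the thread. The scraper name, the `Title` selector, and the sleep value are hypothetical placeholders.

```yaml
# Sketch only: assumes a Stash build whose scraper config supports driver.clicks.
name: MGStage (click sketch)   # hypothetical name
sceneByURL:
  - action: scrapeXPath
    url:
      - mgstage.com/product/product_detail/
    scraper: sceneScraper
xPathScrapers:
  sceneScraper:
    scene:
      Title: //h1/text()       # hypothetical selector, not taken from the thread
driver:
  useCDP: true                 # load the page in a headless browser via CDP
  clicks:
    # XPath of the age-confirmation link quoted in the comment above
    - xpath: '//a[@id="AC"]'
      sleep: 5                 # seconds to wait after the click (assumed value)
```

With a config along these lines, the browser would open the product page, click the confirmation link, wait, and only then run the XPath selectors against the unblocked page.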

jacquieetmicheltv seems to need a cookie for the disclaimer page, and maybe something more, since Cloudflare gets involved. A script scraper PR is now available for jacquieetmicheltv.

bnkai avatar Nov 04 '20 09:11 bnkai

@halorrr You could update the main post: the MGStage scraper has been created with the cookie update in stash #301.

Belleyy avatar Dec 28 '20 15:12 Belleyy
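
(Illustrative sketch, not part of the original comment.) A cookie-based unblock of the kind mentioned above could be expressed as a `driver.cookies` entry, assuming the key names used in the Stash scraper documentation (`CookieURL`, `Cookies`, `Name`, `Domain`, `Value`, `Path`). The cookie name and value below are hypothetical placeholders, because the thread only showed the real cookie in a screenshot. The domain and path are assumptions as well.

```yaml
# Sketch only: cookie name and value are placeholders, not the real MGStage cookie.
driver:
  cookies:
    - CookieURL: "https://www.mgstage.com"
      Cookies:
        - Name: "age_confirmed"    # hypothetical cookie name
          Domain: ".mgstage.com"   # assumed domain scope
          Value: "1"               # hypothetical value
          Path: "/"                # assumed path
```

Setting the cookie up front avoids depending on the confirmation button being present and clickable, at the cost of having to track the cookie's name and value if the site changes them.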

The https://analized.com/ scraper is broken (AnalizedNetwork.yml, url: analized.com/1/scene/).

Sample link: https://analized.com/trailers/

fidofido300 avatar Nov 01 '21 16:11 fidofido300

> The https://analized.com/ scraper is broken (AnalizedNetwork.yml, url: analized.com/1/scene/).
>
> Sample link: https://analized.com/trailers/

Did you try the latest version? (This scraper was just updated in #755.) https://github.com/stashapp/CommunityScrapers/blob/master/scrapers/ThirdRockEnt.yml

Belleyy avatar Nov 01 '21 16:11 Belleyy

Closing this as it's been unused for years

Maista6969 avatar May 02 '24 06:05 Maista6969