Bypass Paywall and CAPTCHA

Open waybackarchiver opened this issue 3 years ago • 8 comments

Launch a headful browser with Xvfb and import extensions.

Relates to #92

Xvfb -ac :99 -screen 0 1280x1024x16 > /dev/null 2>&1 &
export DISPLAY=:99.0
chromium --headless=false --load-extension=path/to/extension
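
For reference, the same launch could be driven from Go. This is only a minimal sketch under assumptions: the helper names `chromiumArgs` and `chromiumCmd` are hypothetical, Xvfb is assumed to be already running on `:99`, and pairing `--load-extension` with `--disable-extensions-except` is an addition that is not part of the commands above. Both switches accept a comma-separated list of unpacked extension directories.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// chromiumArgs builds the extension flags for a headful Chromium.
// Headful is the default, so no --headless flag is passed at all.
func chromiumArgs(extensionDirs []string) []string {
	var args []string
	if len(extensionDirs) > 0 {
		joined := strings.Join(extensionDirs, ",")
		args = append(args,
			// Assumption: keep only the listed extensions enabled.
			"--disable-extensions-except="+joined,
			"--load-extension="+joined,
		)
	}
	return args
}

// chromiumCmd prepares (but does not start) the browser command and points
// it at the given X display, mirroring `export DISPLAY=:99.0`.
func chromiumCmd(binary, display string, extensionDirs []string) *exec.Cmd {
	cmd := exec.Command(binary, chromiumArgs(extensionDirs)...)
	cmd.Env = append(os.Environ(), "DISPLAY="+display)
	return cmd
}

func main() {
	cmd := chromiumCmd("chromium", ":99.0", []string{"path/to/extension"})
	// Print the command line instead of starting the browser.
	fmt.Println(strings.Join(cmd.Args, " "))
}
```

Calling `cmd.Start()` on the returned command would launch the browser. Accepting a slice of directories is what would later allow more than one extension to be loaded.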

Extensions:

Misc:

waybackarchiver avatar Mar 04 '22 12:03 waybackarchiver

Okay, moving to this issue :)

I have two ideas.

  1. Run puppeteer-extra-* with puppeteer, dump the CDP messages (I did this before; it is easy but not graceful), and generate Go code from them. However, it is difficult to deal with randomized scripts, such as those in https://www.npmjs.com/package/puppeteer-extra-plugin-stealth
  2. Provide a Node runtime on the Go side or the browser side, for example with https://github.com/browserify/browserify
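
To make the first idea concrete, here is a rough sketch of turning a CDP dump into Go source. It rests on assumptions: the dump is taken to contain one JSON message per line, and `parseDump`, `generateGo`, and `recordedCalls` are hypothetical names. Only client requests (messages carrying both `id` and `method`) are kept; responses and events are skipped.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// cdpCall is one client-to-browser CDP request as it appears on the wire.
type cdpCall struct {
	ID     *int64          `json:"id"`
	Method string          `json:"method"`
	Params json.RawMessage `json:"params"`
}

// parseDump reads a dump with one JSON message per line (an assumed format)
// and returns only the requests, in their original order.
func parseDump(dump string) ([]cdpCall, error) {
	var calls []cdpCall
	sc := bufio.NewScanner(strings.NewReader(dump))
	// Injected scripts can be long, so raise the per-line limit to 16 MiB.
	sc.Buffer(make([]byte, 0, 64*1024), 16*1024*1024)
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		var c cdpCall
		if err := json.Unmarshal([]byte(line), &c); err != nil {
			return nil, fmt.Errorf("parse dump line: %w", err)
		}
		// Responses have no method; events have no id.
		if c.ID != nil && c.Method != "" {
			calls = append(calls, c)
		}
	}
	return calls, sc.Err()
}

// generateGo emits a Go slice literal holding the recorded requests.
func generateGo(calls []cdpCall) string {
	var b strings.Builder
	b.WriteString("var recordedCalls = []struct{ Method, Params string }{\n")
	for _, c := range calls {
		fmt.Fprintf(&b, "\t{%q, %q},\n", c.Method, string(c.Params))
	}
	b.WriteString("}\n")
	return b.String()
}

func main() {
	const sample = `{"id":1,"method":"Page.enable","params":{}}
{"id":1,"result":{}}
{"method":"Page.frameNavigated","params":{"frame":{"id":"F1"}}}
{"id":2,"method":"Page.addScriptToEvaluateOnNewDocument","params":{"source":"/* injected script */"}}`

	calls, err := parseDump(sample)
	if err != nil {
		panic(err)
	}
	fmt.Print(generateGo(calls))
}
```

The sketch copies every recorded script verbatim, so anything the plugin randomizes per run is frozen at dump time. It illustrates the difficulty mentioned in the first idea rather than solving it.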

What do you think?

hellodword avatar Mar 04 '22 12:03 hellodword

> dump the CDP messages and generate go code

This approach may complicate and add uncertainty to the situation.

> Provide a node runtime in go side or the browser side

As expected, this approach provides a new extension to call the methods exposed by puppeteer-extra-*.

waybackarchiver avatar Mar 04 '22 13:03 waybackarchiver

> > dump the CDP messages and generate go code
>
> This approach may complicate and add uncertainty to the situation.

Right, such as https://github.com/kkoooqq/fakebrowser/blob/586e85c0ed872513d2e0703d8c516250a8a4365b/src/core/DeviceDescriptor.ts#L463-L479

hellodword avatar Mar 04 '22 13:03 hellodword

> > Provide a node runtime in go side or the browser side
>
> As expected, this approach provides a new extension to call the methods exposed by puppeteer-extra-*.

But this is complicated too: ESM and CJS, JS and TS, dependencies, and so on. It sounds like running webpack in the browser.

hellodword avatar Mar 04 '22 14:03 hellodword

If neither calling puppeteer-extra at the call site nor extracting it works, we can search for alternative extensions or create one.

I'm working on launching Chrome and loading extensions. The next step is to make it customizable so that it can load more extensions.

waybackarchiver avatar Mar 04 '22 15:03 waybackarchiver

Related project: wabarc/starter. For more details, see the runs.

waybackarchiver avatar Mar 06 '22 15:03 waybackarchiver

Relates to wabarc/screenshot#11

waybackarchiver avatar Mar 11 '22 15:03 waybackarchiver

The bypass-paywall extension is now supported in conjunction with the on-heroku project. The next step will be to make the starter more approachable and to add more extensions to it.

Unfortunately, it turns out that PDFs ~~and screenshots~~ cannot be saved in the X11 environment, which means that the core idea may not be fully workable.

waybackarchiver avatar Mar 13 '22 12:03 waybackarchiver