secutils
Allow tracking the resources and content of web pages protected by WAF and CAPTCHA
Summary
As part of this issue, we should introduce a way for the 'Web Scraping' utility to scrape pages protected by WAF and CAPTCHA. Here are a few pointers:
- https://oxylabs.io/blog/playwright-bypass-captcha
- https://oxylabs.io/products/captcha-proxies
- https://www.npmjs.com/package/nocaptchaai-playwright
- https://www.npmjs.com/package/@extra/recaptcha
- https://nocaptchaai.com/plans#all
- https://www.zenrows.com/blog/playwright-captcha#can-playwright-solve-captcha
- https://2captcha.com/blog/how-to-use-2captcha-solver-extension-in-puppeteer
- https://github.com/berstend/puppeteer-extra/blob/master/packages/puppeteer-extra-plugin-stealth/index.js
Also, consider switching to the new Chrome headless mode: https://github.com/microsoft/playwright/issues/21194#issuecomment-1444276676
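
For illustration, here is a minimal sketch of how the new headless mode might be opted into when launching Chromium through Playwright. It assumes the approach of disabling Playwright's own `headless` option and passing Chrome's `--headless=new` flag via `args`; whether that matches what the utility ends up doing is an open question. `buildLaunchOptions` and `LaunchOptionsSketch` are hypothetical names introduced only for this example.

```typescript
// Hypothetical local type: mirrors only the two Playwright launch options
// used in this sketch (`headless` and `args`).
interface LaunchOptionsSketch {
  headless: boolean;
  args: string[];
}

// Builds launch options for either the classic or the new Chrome headless mode.
// Assumption: the new mode is requested by turning off Playwright's own
// `headless` option and passing Chrome's `--headless=new` flag instead.
function buildLaunchOptions(
  useNewHeadless: boolean,
  extraArgs: string[] = [],
): LaunchOptionsSketch {
  if (useNewHeadless) {
    return { headless: false, args: ["--headless=new", ...extraArgs] };
  }
  return { headless: true, args: [...extraArgs] };
}

// Usage (assumes the `playwright` package is installed):
//   import { chromium } from "playwright";
//   const browser = await chromium.launch(buildLaunchOptions(true));
```

Keeping the option building in a pure function makes it easy to toggle the mode per scraping target and to unit test without starting a browser.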