
Allow tracking the resources and content of web pages protected by WAF and CAPTCHA

Open • azasypkin opened this issue 6 months ago • 0 comments

Summary

In the context of this issue, we should introduce a way for the 'Web Scraping' utility to scrape pages protected by WAF and CAPTCHA. Here are a few pointers:

  • https://oxylabs.io/blog/playwright-bypass-captcha
  • https://oxylabs.io/products/captcha-proxies
  • https://www.npmjs.com/package/nocaptchaai-playwright
  • https://www.npmjs.com/package/@extra/recaptcha
  • https://nocaptchaai.com/plans#all
  • https://www.zenrows.com/blog/playwright-captcha#can-playwright-solve-captcha
  • https://2captcha.com/blog/how-to-use-2captcha-solver-extension-in-puppeteer
  • https://github.com/berstend/puppeteer-extra/blob/master/packages/puppeteer-extra-plugin-stealth/index.js

Also, consider switching to the new Chrome headless mode: https://github.com/microsoft/playwright/issues/21194#issuecomment-1444276676
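
For illustration only, here is a rough sketch of how the new headless mode might be toggled when building Playwright launch options. It assumes the commonly cited workaround of launching with `headless: false` and passing Chrome's `--headless=new` flag through `args`; whether that matches the linked comment or the current Playwright version is not verified here. The `buildLaunchOptions` helper and `LaunchOptionsSketch` type are hypothetical names, not part of secutils or Playwright.

```typescript
// Hypothetical shape mirroring a subset of Playwright's launch options
// (`headless` and `args` are real option names; this type itself is not).
interface LaunchOptionsSketch {
  headless: boolean;
  args: string[];
}

// Hypothetical helper: returns launch options for either the classic
// headless mode or Chrome's new headless mode.
function buildLaunchOptions(useNewHeadless: boolean): LaunchOptionsSketch {
  if (!useNewHeadless) {
    // Classic mode: let Playwright add its own headless flag.
    return { headless: true, args: [] };
  }
  // Assumed workaround: disable Playwright's built-in headless handling
  // and ask Chrome directly for the new headless implementation.
  return { headless: false, args: ['--headless=new'] };
}
```

The result could then be passed to something like `chromium.launch(buildLaunchOptions(true))`, keeping the mode switch in one place so it is easy to revert if the new mode behaves differently against WAF-protected pages.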

azasypkin, Dec 26 '23 14:12