puppeteer-extra
Cloudflare detecting Puppeteer
I have not queried or clicked anything using Puppeteer; simply connecting to the browser seems to be enough for Cloudflare to block access to a site.
I have used the simplest possible example: Puppeteer with a real browser (not headless) and no automation scripts.
import puppeteer from 'puppeteer-extra'
import StealthPlugin from 'puppeteer-extra-plugin-stealth'

puppeteer.use(StealthPlugin())

;(async () => {
  console.log('launching...')
  const browser = await puppeteer.launch({
    executablePath: 'C:/Program Files/Google/Chrome/Application/chrome.exe',
    headless: false,
    defaultViewport: null
  })
  console.log('connected')
  const page = await browser.newPage()
  await page.goto('https://nowsecure.nl')
  console.log('waiting for 1 min...')
  await new Promise((r) => setTimeout(r, 60_000))
  console.log('closing...')
  await browser.close()
})()
When I repeat the same steps without Puppeteer and click the Cloudflare verification button, I pass through to the website, so I suspect that Cloudflare is somehow able to detect Puppeteer.
The video below shows manual clicking with Puppeteer connected, but Cloudflare still refuses access:
https://github.com/berstend/puppeteer-extra/assets/25906558/973501b3-25e5-40a4-98ad-888315930b4b
I have also replicated this on Android: I forwarded the Chrome DevTools port via ADB, connected to the debugging port, and got the same result.
For mobile, I:
- use ADB to forward the Chrome DevTools port:
  adb forward tcp:9000 localabstract:chrome_devtools_remote
- run the following script to connect with Puppeteer:
import { Browser, connect } from 'puppeteer-core'

let browser: Browser | null = null

const timer = (ms: number) => new Promise<null>((res) => setTimeout(() => res(null), ms))

export async function puppeteerConnect({
  port,
  queryTimeoutMs
}: {
  port: string
  queryTimeoutMs: number
}): Promise<Browser> {
  const debuggerUrl = 'http://127.0.0.1:' + port + '/json/version'
  const fetcher = async () => {
    const result = await fetch(debuggerUrl)
    return await result.text()
  }
  const result = await Promise.race([timer(queryTimeoutMs), fetcher()])
  if (result === null) {
    throw new Error('get debugger URL timed out')
  }
  const data = JSON.parse(result) as { webSocketDebuggerUrl?: unknown }
  const wsUrl = data?.webSocketDebuggerUrl
  if (typeof wsUrl !== 'string') {
    throw new Error('get debugger url from response failed, `wsUrl` is not string')
  }

  // use the socket URL to connect with puppeteer
  const browser = await Promise.race([
    connect({
      browserWSEndpoint: wsUrl,
      defaultViewport: null
    }),
    timer(queryTimeoutMs)
  ])
  if (browser === null) {
    throw new Error('puppeteer connect timed out')
  }
  return browser
}

async function retryConnect() {
  let lastErr: unknown = null
  let i = 0
  while (i < 20) {
    console.log('connection attempt #', i)
    try {
      return await puppeteerConnect({ port: '9000', queryTimeoutMs: 500 })
    } catch (err) {
      lastErr = err
    }
    await new Promise((r) => setTimeout(r, 1000))
    i += 1
  }
  throw lastErr
}

;(async () => {
  console.log('connecting...')
  const _browser = await retryConnect()
  console.log('connected!')
  browser = _browser
  const pages = await browser.pages()
  const firstPage = pages[0]
  if (!firstPage) {
    throw new Error('NO PAGE')
  }
  await firstPage.goto('https://nowsecure.nl')
  await new Promise((r) => setTimeout(r, 60_000))
})().finally(() => {
  console.log('browser disconnecting')
  browser?.disconnect()
  console.log('should be done?')
})
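A side note on the timeout handling in the script above: each step is raced against a timer that resolves to `null`, and the timer is never cancelled. One possible variant, sketched here under the assumption that a rejected promise is acceptable in place of the `null` sentinel, clears the timer once the race settles. `withTimeout` is a hypothetical helper name, not part of puppeteer or puppeteer-core.

```typescript
// Hypothetical helper (not a puppeteer API): rejects if `task` has not settled within `ms`.
async function withTimeout<T>(task: Promise<T>, ms: number, label: string): Promise<T> {
  let handle: ReturnType<typeof setTimeout> | undefined
  const timeout = new Promise<never>((_, reject) => {
    handle = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms)
  })
  try {
    // Whichever promise settles first decides the outcome.
    return await Promise.race([task, timeout])
  } finally {
    // Cancel the pending timer so it does not fire after the race is over.
    if (handle !== undefined) clearTimeout(handle)
  }
}
```

With this in place, the connect step could be written as `await withTimeout(connect({ browserWSEndpoint: wsUrl, defaultViewport: null }), queryTimeoutMs, 'puppeteer connect')`, and the existing `try`/`catch` in `retryConnect` would pick up the rejection unchanged.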
Try using the start-up tab and see if it works. We have more info on this problem here: https://github.com/berstend/puppeteer-extra/issues/832
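For anyone unsure what "using the start-up tab" looks like in code: it presumably means reusing the tab Chrome opens at launch rather than calling `browser.newPage()`, as the desktop script in the original post does. A minimal sketch under that assumption follows; `BrowserLike` and `getStartupPage` are hypothetical names introduced for illustration, and only `pages()` and `newPage()` are real methods of Puppeteer's `Browser`.

```typescript
// Minimal structural type covering the two Browser methods used here,
// so the helper accepts puppeteer's Browser without importing it.
interface BrowserLike<P> {
  pages(): Promise<P[]>
  newPage(): Promise<P>
}

// Hypothetical helper: prefer the tab the browser opened at start-up,
// and only create a new tab when no tab exists yet.
async function getStartupPage<P>(browser: BrowserLike<P>): Promise<P> {
  const pages = await browser.pages()
  const first = pages[0]
  if (first !== undefined) {
    return first
  }
  return browser.newPage()
}
```

In the desktop script this would replace `const page = await browser.newPage()` with `const page = await getStartupPage(browser)`. The Android script already takes `pages[0]`, so it is unaffected.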
I have recently (within the last two weeks) started seeing the exact same thing. Using the start-up tab doesn't seem to make a difference.
@krkeegan @joeledwardson @NodePuppeteer @peterblazejewicz @bclougherty #832
I had luck up until now. Now anything protected by Cloudflare simply doesn't let me do anything: even if I solve the captcha myself, it keeps spinning or reports that I've failed the human verification test.
Has anyone had any luck resolving this issue?
https://medium.com/@zfcsoftware/how-to-bypass-cloudflare-with-node-js-869fa6e21dd5
Friend, your article is absolutely wrong... You completely do not understand the cause of this issue. Please stop spamming these threads.
The article is about passing Cloudflare. It gives two pieces of code, and both can easily pass, including on the corporate plan. Which part is wrong? I am trying to share a source because people constantly say that we cannot pass Cloudflare. Explain the wrong part and let's learn together. Also, I'm not spamming. My first message was a link to a GitHub discussion. It has nothing to do with me, and there are dozens of people in that discussion. I am waiting for you to explain what is wrong.
I had this issue; some websites have more advanced scraper detection. The solution for me was to use a residential proxy service like Bright Data and pass the proxy args to Puppeteer.
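// Assumes the launcher and the options type come from the 'puppeteer' package.
import puppeteer, { PuppeteerLaunchOptions } from 'puppeteer';

const BROWSER_CONFIG: PuppeteerLaunchOptions = {
  headless: 'new',
  defaultViewport: null,
  ignoreHTTPSErrors: true,
  // Route all browser traffic through the proxy (host:port redacted).
  args: ['--proxy-server=xxxx:xxxx'],
};

const browser = await puppeteer.launch(BROWSER_CONFIG);
// Reuse the tab the browser opens at start-up.
const page = (await browser.pages())[0];
// Supply the proxy credentials before navigating anywhere.
await page.authenticate({
  username: 'xxxxx',
  password: 'xxxxxx',
});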
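If the proxy provider hands out a single URL of the form `scheme://user:password@host:port`, the two pieces used above (the `--proxy-server` argument and the credentials for `page.authenticate`) can be derived from it. This is only a sketch: `proxySettings`, `ProxySettings`, and the example URL are hypothetical, and it assumes the URL carries an explicit non-default port.

```typescript
// Hypothetical shape: launch args plus optional credentials for page.authenticate().
interface ProxySettings {
  args: string[]
  credentials?: { username: string; password: string }
}

// Hypothetical helper: split a proxy URL into a Chrome flag and credentials.
function proxySettings(proxyUrl: string): ProxySettings {
  const url = new URL(proxyUrl)
  // url.host is "hostname:port"; the user info is deliberately left out of the flag.
  const settings: ProxySettings = {
    args: [`--proxy-server=${url.protocol}//${url.host}`]
  }
  if (url.username) {
    // URL keeps user info percent-encoded, so decode it before use.
    settings.credentials = {
      username: decodeURIComponent(url.username),
      password: decodeURIComponent(url.password)
    }
  }
  return settings
}
```

The result would be used as `puppeteer.launch({ args: settings.args })`, followed by `page.authenticate(settings.credentials)` when credentials are present.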
zfcsoftware
bruh, the method this blog post introduced does not work for me
You can test puppeteer-real-browser with the latest version. You should not have any problems, it has just been updated. If you are using Linux, I recommend running it with Docker.
Windows Server Test: https://github.com/user-attachments/assets/b1c4dca1-db48-4692-ac67-fc399d11e009
Ubuntu 24 test: https://github.com/user-attachments/assets/b1040e6a-9d8d-4fed-910a-52cabbd82130