cf-clearance-scraper

Does not work with multiple connections in Docker

Open cod888 opened this issue 1 year ago • 8 comments

Description

The new version has the same issue, so I am opening it again.

I observed an issue when running 3+ connections to the solver simultaneously. Instance: 8 CPUs, 32 GB RAM, Ubuntu 22.04, Docker image 'cf-clearance-scraper:latest'.

Error: nothing is returned. No timeouts, etc.

There is no problem when running fewer than 3 at the same time, or when the solver runs on the host rather than in Docker.

Would you have any thoughts to share about it?

Full steps to reproduce the issue

Run the latest Docker image 'cf-clearance-scraper:latest'. Run 3+ threads to solve a URL, for example https://nopecha.com/demo/cloudflare. Use 'waf-session' mode.
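Because the symptom is a request that never returns, a client-side timeout can make the hang visible while reproducing. The following is a minimal sketch, assuming Node.js 18+ (global fetch and AbortSignal.timeout) and the endpoint used in the reproduction script later in this thread; the helper name solveWithTimeout is hypothetical and not part of cf-clearance-scraper.

```javascript
// Assumed endpoint, taken from the reproduction script in this thread.
const ENDPOINT = 'http://localhost:3000/cf-clearance-scraper';

// Hypothetical helper: sends one solve request and gives up after `timeoutMs`.
// Requires Node.js 18+ (global fetch and AbortSignal.timeout).
async function solveWithTimeout(body, timeoutMs, endpoint = ENDPOINT) {
    try {
        const res = await fetch(endpoint, {
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify(body),
            // Aborts the request if no response arrives within timeoutMs.
            signal: AbortSignal.timeout(timeoutMs),
        });
        return { ok: true, data: await res.json() };
    } catch (err) {
        // A hung request ends up here instead of waiting forever.
        return { ok: false, error: err.name };
    }
}
```

Calling solveWithTimeout({ url: 'https://nopecha.com/demo/cloudflare', mode: 'waf-session' }, 70000) for each thread would then report which requests never got an answer, without changing anything on the solver side.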

Issue Type

No response

Operating System

Linux

Do you use Docker?

Docker

cod888 avatar Oct 02 '24 07:10 cod888

Cloudflare may treat 3+ solve attempts from the same IP address as an attack and block them. Can you try using a separate proxy for each request? The problem is probably caused by the IP.

mdervisaygan avatar Oct 02 '24 07:10 mdervisaygan

I use a different IP for each thread. FYI, I used the same workflow but ran the solver from source (index.js, macOS), and there is no issue with 7+ threads.

cod888 avatar Oct 02 '24 07:10 cod888

It solved a little slowly because of the proxy speed, but as you can see in the video, it solves without any problems. Are you using static IPv4 proxies? If you are using rotating proxies, it will not solve. The puppeteer-proxy library is used. If each request returns a different IP, it will be blocked when solving.

If you are using static IPv4, your problem may be caused by macOS. You can try on an Ubuntu server. I could not reproduce the problem.

https://github.com/user-attachments/assets/7aa21043-9642-4718-8626-cc1509f1ceab

mdervisaygan avatar Oct 02 '24 07:10 mdervisaygan

Did you run the requests simultaneously? I use static IPv4 proxies, and I know that rotation will not work. I run the Docker container on Ubuntu. It works well with 1 or 2 simultaneous requests, but when I run 3 or more at the same time I don't see any response from the solver. Moreover, if I run 10 requests simultaneously I get 'Too Many Requests', so it looks like the previous sessions are not dropped. They should be dropped by the timeout, as they were in the previous version. (Run with -e browserLimit=20 and -e timeOut=60000.)
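To illustrate what "dropped by timeout" would mean for a slot limit such as browserLimit, here is a minimal sketch of a limiter that always frees its slot, whether the task finishes, fails, or times out. It is not the library's actual code: the SlotLimiter class and the 408 code are hypothetical, and only the 'Too Many Requests' behavior and the two setting names come from this thread.

```javascript
// Hypothetical sketch, NOT cf-clearance-scraper's implementation.
class SlotLimiter {
    constructor(limit) {
        this.limit = limit; // plays the role of browserLimit
        this.active = 0;    // slots currently in use
    }

    // Runs `task` if a slot is free; otherwise rejects the way an HTTP 429 would.
    async run(task, timeoutMs) {
        if (this.active >= this.limit) {
            const err = new Error('Too Many Requests');
            err.code = 429;
            throw err;
        }
        this.active++;
        let timer;
        const timeout = new Promise((_, reject) => {
            timer = setTimeout(() => {
                const err = new Error('Request Timeout');
                err.code = 408; // hypothetical choice for this sketch
                reject(err);
            }, timeoutMs); // plays the role of timeOut
        });
        try {
            // Promise.race does not cancel the task itself; a real solver
            // would also have to close the browser context here.
            return await Promise.race([task(), timeout]);
        } finally {
            clearTimeout(timer);
            this.active--; // always release the slot
        }
    }
}

async function demo() {
    const limiter = new SlotLimiter(2);
    const hang = () => new Promise(() => {}); // a task that never settles
    const outcomes = await Promise.allSettled([
        limiter.run(hang, 50),
        limiter.run(hang, 50),
        limiter.run(hang, 50), // exceeds the limit of 2
    ]);
    console.log(outcomes.map(o => o.reason.message));
    console.log('active after timeouts:', limiter.active);
}

demo();
```

If the decrement in the finally block were missing, hung sessions would hold their slots forever and later requests would keep getting 'Too Many Requests', which matches the symptom described above. Whether that is the actual cause here is only a guess.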

cod888 avatar Oct 02 '24 08:10 cod888

Yes, I tested it by sending 7 requests instantly. I can't help because I can't reproduce your problem. If you can provide a code example that I can reproduce in the future, I can solve it. I will update about timeout in a little while.

mdervisaygan avatar Oct 02 '24 08:10 mdervisaygan

By the way, the code in the video you provided does not run simultaneously; it runs one by one. To make it run simultaneously, you can change it to the following:

const proxy = require('./proxy.json')

console.log('Proxy length:',proxy.length);

async function test() {
    const promises = proxy.map(item =>
        fetch('http://localhost:3000/cf-clearance-scraper', {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
            },
            body: JSON.stringify({
                url: "https://nopecha.com/demo/cloudflare",
                mode: 'waf-session',
                proxy: {
                    host: item.host,
                    port: item.port,
                    username: item.username,
                    password: item.password,
                }
            }),
        }).then(res => res.json()).catch(err => { console.error(err); return null; })
    );

    const results = await Promise.all(promises);
    results.forEach(session => {
        if (session) console.log(session.code);
    });
}
test();

In this case the issue is reproducible.
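The difference between "one by one" and "simultaneously" can be checked without the solver at all. The sketch below replaces the real request with a hypothetical fakeSolve helper that resolves after a fixed delay, then times a loop that awaits each call against a Promise.all over the same calls.

```javascript
// Hypothetical stand-in for one solver request: resolves after `ms` milliseconds.
function fakeSolve(id, ms) {
    return new Promise(resolve => setTimeout(() => resolve({ id, code: 200 }), ms));
}

// One by one: each request starts only after the previous one has finished.
async function runSequential(ids, ms) {
    const results = [];
    for (const id of ids) {
        results.push(await fakeSolve(id, ms));
    }
    return results;
}

// Simultaneously: all requests are started first, then awaited together.
function runConcurrent(ids, ms) {
    return Promise.all(ids.map(id => fakeSolve(id, ms)));
}

// Runs `fn`, logs how long it took, and returns the elapsed milliseconds.
async function timeIt(label, fn) {
    const start = Date.now();
    const results = await fn();
    const elapsed = Date.now() - start;
    console.log(`${label}: ${results.length} results in ~${elapsed} ms`);
    return elapsed;
}

async function main() {
    const ids = [1, 2, 3];
    await timeIt('sequential', () => runSequential(ids, 100)); // roughly 3 x 100 ms
    await timeIt('concurrent', () => runConcurrent(ids, 100)); // roughly 100 ms
}

main();
```

Only the concurrent variant puts several requests on the solver at the same moment, which is the condition under which the issue is reported to appear.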

cod888 avatar Oct 02 '24 10:10 cod888

I didn't see await in the code I added to the readme file. I'm sorry. I will check this problem.

mdervisaygan avatar Oct 02 '24 13:10 mdervisaygan

I have the same issue. I want to run multiple threads, but they run one by one. Maybe multiple threads are needed on the server side. I hope this can be fixed as soon as possible. Thanks a lot!

WendyDReid1901 avatar Oct 08 '24 17:10 WendyDReid1901

[Image] A page can produce multiple tabs, and a context can open multiple tabs.

Java88888888 avatar Oct 15 '24 09:10 Java88888888

After executing the code, only this appeared in the console - { code: 500 }

const initCycleTLS = require('cycletls');

async function test() {
    const session = await fetch('http://localhost:3000/cf-clearance-scraper', {
        method: 'POST',
        headers: {
            'Content-Type': 'application/json'
        },
        body: JSON.stringify({
            url: 'https://nopecha.com/demo/cloudflare',
            mode: "waf-session",
            // proxy:{
            //     host: '127.0.0.1',
            //     port: 3000,
            //     username: 'username',
            //     password: 'password'
            // }
        })
    }).then(res => res.json()).catch(err => { console.error(err); return null });

    if (!session || session.code != 200) return console.error(session);

    const cycleTLS = await initCycleTLS();
    const response = await cycleTLS('https://nopecha.com/demo/cloudflare', {
        body: '',
        ja3: '772,4865-4866-4867-49195-49199-49196-49200-52393-52392-49171-49172-156-157-47-53,23-27-65037-43-51-45-16-11-13-17513-5-18-65281-0-10-35,25497-29-23-24,0', // https://scrapfly.io/web-scraping-tools/ja3-fingerprint
        userAgent: session.headers["user-agent"],
        // proxy: 'http://username:[email protected]:443',
        headers: {
            ...session.headers,
            cookie: session.cookies.map(cookie => `${cookie.name}=${cookie.value}`).join('; ')
        }
    }, 'get');

    console.log(response.status);
    cycleTLS.exit().catch(err => { });
}

test()

schamen avatar Oct 18 '24 13:10 schamen

[Image] A page can produce multiple tabs, and a context can open multiple tabs. That should save more resources.

a361234599 avatar Oct 20 '24 03:10 a361234599

An update has been released for this issue. Please try with the latest version of the library. It should be resolved.

mdervisaygan avatar Feb 01 '25 23:02 mdervisaygan