[Feature] Resolve funcaptcha with puppeteer-extra-plugin-recaptcha

Open sanchobouillant opened this issue 3 years ago • 56 comments

Hi,

puppeteer-extra-plugin-recaptcha is an amazing plugin. It would be great if it could also solve other captchas such as FunCaptcha.

sanchobouillant avatar Dec 07 '20 07:12 sanchobouillant

@sanchobouillant Thanks! I have Funcaptcha/Arkose labs on my todo list. :-)

I already did some testing on the Microsoft signup version (which seems a bit customized, e.g. window.fc_fp is missing).

Feel free to post links to Funcaptcha implementations here, the more references I have the more robust I can make the detection.

berstend avatar Dec 07 '20 07:12 berstend

For future reference: https://client-demo.arkoselabs.com/solo-animals

berstend avatar Dec 07 '20 07:12 berstend

LinkedIn sometimes uses FunCaptcha with no customization.

2captcha's documentation says I should find a public key and a service URL (surl). I can't find them in the Arkose Labs implementation. Do you have more information about how I could solve these captchas with the API?

sanchobouillant avatar Dec 09 '20 05:12 sanchobouillant

@sanchobouillant I haven't done a deep dive into this yet, I need to finish other things first. :-) The goal of the recaptcha plugin is to work on any site without configuration, which means finding the most robust way to extract the captcha data you mentioned regardless of how the site implemented it.

After a cursory look it seems we cannot rely on a "god object" in the window space, as we can with reCAPTCHA, and need to find different ways.

berstend avatar Dec 09 '20 13:12 berstend

@berstend love this plugin! I'm also looking for a FunCaptcha solution. Here is a link I tried to solve manually, but I can't figure out how the callback works or where to set the FunCaptcha solution: https://us.battle.net/account/creation/flow/create-full The captcha should appear after this step (in Puppeteer): [image]

i appreciate your work!

MaDetho avatar Dec 28 '20 17:12 MaDetho

2captcha can solve this but I can't figure out the form to submit back. It calls a POST to a verify page but the instantiation seems to happen by Javascript and in chrome dev tools it's unclear how to submit the form...

a10kiloham avatar Apr 06 '21 12:04 a10kiloham

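// Assumes page, axios, logger, wait(ms), captcha_key and arkose_pk are defined elsewhere.
var arkose_surl = "";
var arkose_return_url = "";

async function responseHandlerArkose(response) {
	let url = response.url();
	// Capture the surl query parameter from the Arkose request URL.
	let url_surl_match_arr = url.match(/[?&]surl=([^&]*)/);
	if (url_surl_match_arr && url_surl_match_arr[1]) {
		arkose_surl = url_surl_match_arr[1];
		logger.warn(`Surl found ${arkose_surl}`);
	} else {
		logger.warn(`no Surl found in ${url}`);
	}
}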

page.on("response", async response => {
	try {
		var url = await response.url();
	} catch (err) {
		console.log("URL err ", err);
		return;
	}
	if (
		url.match(/.*linkedin-api.arkoselabs.*/) 
	) {
		//console.log("Sent to handler");
		await responseHandlerArkose(response);
		//console.log("Back from handler");
	}
	if (url.match(/www.linkedin.com\/checkpoint\/challenge\//)){
		arkose_return_url = url;
	}
	return response;
});

// Token returned by 2captcha; shared with the request handler below.
let captcha_token = "";

// Run this once the challenge has loaded and arkose_surl has been captured.
if (arkose_surl) {
	let captcha_url = `http://2captcha.com/in.php?key=${captcha_key}&method=funcaptcha&json=1&publickey=${arkose_pk}&surl=${arkose_surl}&${arkose_return_url}&pageurl=https://www.linkedin.com/checkpoint/challenge/verify`;
	let captcha_solver = await axios.get(captcha_url);
	if (captcha_solver.data.request) {
		let captcha_solved_url = `http://2captcha.com/res.php?key=${captcha_key}&action=get&json=0&id=${captcha_solver.data.request}`;
		logger.info(`url Log ${captcha_solved_url}`);
		let captcha_solved = await axios.get(captcha_solved_url);
		logger.info(`Result from 2Captcha: ${JSON.stringify(captcha_solved.data)}`);
		if (String(captcha_solved.data).match(/CAPCHA_NOT_READY/)) {
			logger.info("CAPCHA_NOT_READY - retrying in 15 seconds");
			await wait(15000);
			captcha_solved = await axios.get(captcha_solved_url);
			logger.info(`Result from 2Captcha: ${JSON.stringify(captcha_solved.data)}`);
		}
		// With json=0 the answer looks like "OK|<token>"; keep only the token.
		captcha_token = String(captcha_solved.data).replace("OK|", "");
	}
}

await page.setRequestInterception(true);

page.on("request", interceptedRequest => {
	if (interceptedRequest.url().match(/checkpoint\/challenge\/verify/)) {
		logger.info("Found checkpoint challenge");
		let postData = interceptedRequest.postData();
		console.log(`post data: ${postData}`);
		// Swap the solved token into the form body before it is sent.
		postData = postData.replace(/captchaUserResponseToken=.*$/, `captchaUserResponseToken=${captcha_token}`);
		console.log(`postData ${postData}`);
		interceptedRequest.continue({ postData: postData });
	} else {
		interceptedRequest.continue();
	}
});

logger.warn("Looking for form submit");
await page.$eval("form-selector", form => form.submit());

Formatting sucks but this should work if you get the idea.
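
As a side note, the surl lookup in the snippet above can also be sketched with the standard URL API instead of a regex. This is only a rough sketch: it assumes the Arkose request carries surl as an ordinary query parameter (which is what the regex implies), and extractSurl is a hypothetical helper name, not part of the plugin.

```javascript
// Hypothetical helper: pull the `surl` query parameter out of a request URL.
// Assumes `surl` is sent as a normal query parameter.
function extractSurl(requestUrl) {
  let parsed;
  try {
    parsed = new URL(requestUrl);
  } catch (err) {
    return null; // not an absolute URL
  }
  // searchParams.get() returns the percent-decoded value, or null if absent.
  return parsed.searchParams.get("surl");
}

// The URL below is made up for illustration only.
console.log(
  extractSurl("https://example.test/fc/gt2/?pk=ABC&surl=https%3A%2F%2Fclient-api.arkoselabs.com")
);
```

Inside the response handler this would replace the url.match(...) call; a null result means no surl was present.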

a10kiloham avatar Apr 06 '21 15:04 a10kiloham

Thanks, but it doesn't say anything about submitting the form with JavaScript. I'd like to bypass a FunCaptcha on the Outlook registration page.

I wrote a Node.js app that requests the 2captcha API, but it's too slow, so the captcha gets reloaded and the account isn't created.

Any ideas?

KR3KZ avatar May 31 '21 13:05 KR3KZ

Add longer wait times for the timeout. For the form at the end of the block, submit the form element:

						const form = await page.$(`form`);
						await form.evaluate(form => form.submit());
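
For the "longer wait times" part, one option is to poll res.php in a loop rather than retrying once. The sketch below is an assumption about how that could look, not tested against a live site: pollForToken and fetchResult are hypothetical names, and it expects the json=0 answers used earlier in the thread ("CAPCHA_NOT_READY" while pending, "OK|<token>" when solved).

```javascript
// Hypothetical polling helper. `fetchResult` is any async function that
// returns the raw text of one res.php poll (e.g. a wrapper around axios.get).
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function pollForToken(fetchResult, { intervalMs = 5000, timeoutMs = 180000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const body = String(await fetchResult()).trim();
    if (body.startsWith("OK|")) {
      return body.slice(3); // strip the "OK|" prefix, keep the whole token
    }
    if (body !== "CAPCHA_NOT_READY") {
      throw new Error(`Unexpected solver response: ${body}`);
    }
    await sleep(intervalMs); // not ready yet, wait before the next poll
  }
  throw new Error("Timed out waiting for the captcha solution");
}

// Usage sketch (captcha_solved_url as built in the earlier snippet):
// const token = await pollForToken(
//   async () => (await axios.get(captcha_solved_url)).data,
//   { intervalMs: 5000, timeoutMs: 180000 }
// );
```

The interval and timeout values are placeholders; pick them based on how quickly the page reloads the challenge.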

a10kiloham avatar May 31 '21 13:05 a10kiloham

The Outlook registration page doesn't accept a simple submit. When I submit the form with your line of code, the page simply reloads.

To work, it requires JavaScript processing and function calls that validate the data and submit the form. This is where I'm stuck.

KR3KZ avatar May 31 '21 16:05 KR3KZ

Send a click event on the button and it should work OK. Just fix the intercept so it inserts the code in the right place.

a10kiloham avatar May 31 '21 16:05 a10kiloham

The button is in a canvas, which is in an iframe.

So for debugging purposes, and because I was too lazy to find out how to click this button, I just modify the request with the intercept. Before sending it, I manually click the button to send the init state, then wait a few seconds, and the request is sent.

But it just brings me back to the home page without creating the account. I must be missing something.

Did you achieve it yourself?

KR3KZ avatar May 31 '21 19:05 KR3KZ

Yep, worked fine for me. Try inspecting the request when you click submit on a PC in the Chrome debug console so you can replace the auth token in the right place. I guess each site does it differently, and this is how LinkedIn implements it.

a10kiloham avatar May 31 '21 19:05 a10kiloham

Yup I did that.

1: a request is sent when FunCaptcha has loaded
2: a request is sent when the user clicks the verify button
3: a request is made each time the user validates an image rotation

So my guess is that the server checks that the steps happen in the right order before validating the account creation, and that it checks whether the images have been rotated by the user.

But as 2captcha sends me back only one token for the FunCaptcha, I can't simulate the validation of each image (plus the format is different).

So I have no idea how to get it to work. Will wait for a ppt update lol

KR3KZ avatar May 31 '21 19:05 KR3KZ

Yes, that's how it always works. Request interception should work fine if you manually inspect the requests and see what the parameter format is.

a10kiloham avatar May 31 '21 19:05 a10kiloham

I got the idea, but I lack information. How would I go about sending this request? (screenshot: asqs)

I don't know what these params are or where I can find them. This is a screenshot from the LinkedIn network tab. I get the same thing on Outlook.

In the snippet you provided above, you only used the intercept to modify the request to /verify, but did you do the same for all of these /ca requests?

KR3KZ avatar May 31 '21 21:05 KR3KZ

Do the same thing on your outlook page and see what requests follow immediately after you solve the captcha

a10kiloham avatar May 31 '21 21:05 a10kiloham

There are 2 requests that follow immediately after the captcha is solved.

ReportClientEvent 3

{"pageApiId":201040,"clientDetails":[],"country":"FR","userAction":"Action_CompleteEnforcement,Action_ClientSideTelemetry","source":"UserAction","clientTelemetryData":{"category":"UserAction","pageName":"201040","eventInfo":{"timestamp":1622497035916,"enforcementSessionToken":"74960b556f7c8fb44.5167399705|r: eu-west-1|metabgclr=#ffffff|maintxtclr=#1B1B1B|mainbgclr=#ffffff|guitextcolor=#747474|metaiconclr=#757575|meta=7|lang=en|pk=B7D8911C-5CC8-A9A3-35B0-554ACEE604DA|at=40|ag=101|cdn_url=https://client-api.arkoselabs.com/cdn/fc|lurl=https://audio-eu-west-1.arkoselabs.com|surl=https://client-api.arkoselabs.com","appVersion":null,"networkType":null}},"cxhFunctionRes":null,"uiflvr":1001,"uaid":"79060eed16bc4b4391d2ee4985d72065","scid":100118,"hpgid":201040}

Then CreateAccount 4

{"RequestTimeStamp":"2021-05-31T21:37:15.918Z","MemberName":"[email protected]","CheckAvailStateMap":["[email protected]:undefined"],"EvictionWarningShown":[],"UpgradeFlowToken":{},"FirstName":"ds65ds4","LastName":"sd6f54sdf","MemberNameChangeCount":1,"MemberNameAvailableCount":1,"MemberNameUnavailableCount":0,"CipherValue":"MKQQ7 3zqGNxsELMuCwR5AaXxjU4PU6MkQxguIn3i 6MUFbUIdo2uW3yxYre46r7PifKvClJ148trs6LiW KOuujJ8irwzcvCT6bijAJR96kaSnuMYLw2FWmS8vSYOtkVQbu40JugPC8SJrUHzKOxN HzhzM826mEG7MWDXNa/F37XMqFRRJ6VIjjP8S 1ggnChwZD/9smMvIABr0RuSEVC Rrwf/VzW6TJ9vgvZnzraTEZLELoxWxHaoQ55bGcCvy5VBksMHQxuyfYamKhTpfWRSMEjOVUOvFhfcByM5h4v937vIStpCo7eHiXCeTbNf1rUbbDGcZJ/xQeYAMo9CA: =","SKI":"4B8F32B06B3633468A617C4D5781E6B301099447","BirthDate":"16:02:1996","Country":"FR","IsOptOutEmailDefault":false,"IsOptOutEmailShown":true,"IsOptOutEmail":false,"LW":true,"SiteId":"68692","IsRDM":0,"WReply":null,"ReturnUrl":null,"SignupReturnUrl":null,"uiflvr":1001,"uaid":"79060eed16bc4b4391d2ee4985d72065","SuggestedAccountType":"EASI","SuggestionType":"Prefer","HFId":"13937213e7214b07800293f984599bbe","HType":"enforcement","HSol":"74960b556f7c8fb44.5167399705|r=eu-west-1|metabgclr=#ffffff|maintxtclr=#1B1B1B|mainbgclr=#ffffff|guitextcolor=#747474|metaiconclr=#757575|meta=7|lang=en|pk=B7D8911C-5CC8-A9A3-35B0-554ACEE604DA|at=40|ag=101|cdn_url=https://client-api.arkoselabs.com/cdn/fc|lurl=https://audio-eu-west-1.arkoselabs.com|surl=https://client-api.arkoselabs.com","HPId":"B7D8911C-5CC8-A9A3-35B0-554ACEE604DA","encAttemptToken":"fbYjOD7o3MSRVXSMKQxccbZMO8QbURZ4EFUy8jwqcNBORq9XxrwDT/GCqvF1mdRnlUrJRL/Uiq7K2N2z9fIPZW4iemBuKlSIFzSydUKoxkFuQYjcVVpyNOdFXH25oohODeiZ2FUhPhaHL87NDvvlnhip ylLfstnT7FkosSH880=:2:3","dfpRequestId":"","scid":100118,"hpgid":201040}

KR3KZ avatar May 31 '21 21:05 KR3KZ

looks like you're pretty close. assuming you changed all the URL paths from the linkedin example you can use the intercept to set HSol to the returned value.

i think the postData is returned as string, so you'd have to convert to Json then change and reconvert.

let postData = interceptedRequest.postData();
// postData() returns a string, so parse it, set HSol, then serialize again.
postData = JSON.parse(postData);
postData.HSol = captcha_solved.data;
postData = JSON.stringify(postData);
// or try setting HSol without the JSON round trip, as converting may not be needed
console.log(`postData ${postData}`);
interceptedRequest.continue({ postData: postData });

Fiddle around with it a bit but you should be fine if you're using the right URL paths. It should intercept the form submit and plug in the solved data as expected.

a10kiloham avatar Jun 01 '21 12:06 a10kiloham

The request trigger should be if (interceptedRequest.url().match(/CreateAccount/)), or possibly ReportClientEvent. See which request originally puts the HSol line in its form data.
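
Putting the trigger and the body rewrite together, a rough sketch could look like the following. rewriteCreateAccountBody is a hypothetical helper, and it assumes the CreateAccount body is JSON with an HSol field, as in the request dump posted above.

```javascript
// Hypothetical helper: return a rewritten POST body for the CreateAccount
// request, or null when the request should pass through untouched.
function rewriteCreateAccountBody(url, postData, solvedToken) {
  if (!/CreateAccount/.test(url) || typeof postData !== "string") {
    return null;
  }
  let body;
  try {
    body = JSON.parse(postData);
  } catch (err) {
    return null; // body was not JSON, leave it alone
  }
  if (body === null || typeof body !== "object" || !("HSol" in body)) {
    return null; // nothing to replace
  }
  body.HSol = solvedToken;
  return JSON.stringify(body);
}

// Usage sketch inside the request handler (token comes from the solver):
// page.on("request", (req) => {
//   const body = rewriteCreateAccountBody(req.url(), req.postData(), token);
//   req.continue(body === null ? {} : { postData: body });
// });
```

Keeping the rewrite in a plain function makes it easy to try against a captured request body before wiring it into the interceptor.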

a10kiloham avatar Jun 01 '21 12:06 a10kiloham

I did so many tests.

It is the CreateAccount request that has the HSol line, BUT the CreateAccount request is not called until ReportClientEvent is successful.

And ReportClientEvent can be successful only if all of the ca/ requests were successful:

session_token: 7260b643ddab7b67.2188331005
sid: eu-west-1
game_token: 5460b643e0ba6161.5604614205
guess: {"ct":"kq9q9vfyRC2DYdk37BXYgkQPyy9DNS/Av2et28sWBCc=","iv":"c33289e7e2b62a4331e720b057972b49","s":"90c1cf06430cc3b8"}
analytics_tier: 40

I'm out of ideas now. Some people have done it, but they're all trying to sell their scripts.

I am close, it is frustrating lmao

KR3KZ avatar Jun 01 '21 14:06 KR3KZ

keep tracing all the events until you see where the captcha solution is added in and then dynamically inject it there. the ca events will just happen once the form is triggered, so at some point when the solution is added in as an element then inject it there. If the ca events aren't successful then debug and see why they aren't succeeding. they should automatically succeed unless they need the solution dropped in.

a10kiloham avatar Jun 01 '21 14:06 a10kiloham

I noticed that the answer from the 2captcha service does not always have the same format. Sometimes I get 3084f4a302b176cd7.96368058|r=ap-southeast-1|guitextcolor=%23FDD531|metabgclr=%23FFFFFF|metaiconclr=%23202122|meta=3|lang=en|pk=12AB34CD-56F7-AB8C-9D01-2EF3456789A0|cdn_url=https://cdn.funcaptcha.com/fc|surl=https://funcaptcha.com which is the basic format.

And sometimes I get this one, which corresponds to the format of the ca/ request.

scr1{"ct":"JzxnqzANziSCZaKHetlMq0u7KVIE0MNPXNUBHbWXHyJzwto+N71+toKyeSfSb2ClMWPSWGXY1pbYTbmvV9qcwmQKbmBHw/Ai+a9Su7seKL0jYLQN2ZxSw/ZuEDRHuz0M+xWFaCvHMGESGaLta8jfqxyYCM+SncAKKkR7Ihi/CzAC8MRHxLVPdMbrPC+Jv4EcTzuyoV3m1vZS12eGK8aZ8R9AkCcwUNUZsVTk8yph/55hIpbrWc36KhCM1IHjlxvSNsfRrEWSslucE9M9pW44pxBxaLQkwarfNDr12OVCegAn0MfHlupU/HN9U/r5zuwzeC8+L6qlIN3Dvy1iw5cxUQDR6HdMNQK1p2Jxy6DhGtg222Zi4nKAFnMJc2fvTpT5B7OsIFSrbBhpASSGCnjCwVZLrKm9lHrQiyr9elPveo6knB4LGTc/DFzb8IMpNKqAoU9H/AWwOnYgwtK+vlDy9AWlpS3kQd8TY59kq5Zdg7vcRt0J9HkVn43VLhwcVVTcDwqzWlBnpxs0D2kA47CBwQ==","iv":"6d8ac2ca4750e7cdcdd6d0f76aa9d63b","s":"9a264251c341365d"}
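
To tell the two answer formats apart before injecting anything, a small check like the one below might help. It is only a sketch based on the two samples above: classifySolverAnswer is a hypothetical name, and the labels "session-token" and "encrypted-guess" are made up for illustration.

```javascript
// Hypothetical helper: guess which of the two observed formats an answer has.
function classifySolverAnswer(answer) {
  const text = String(answer).trim();
  // Pipe-delimited session token, e.g. "<id>|r=...|pk=...|surl=..."
  if (text.includes("|pk=")) {
    return "session-token";
  }
  // JSON object with ct/iv/s fields, possibly preceded by a short prefix.
  const start = text.indexOf("{");
  if (start !== -1) {
    try {
      const obj = JSON.parse(text.slice(start));
      if (obj && typeof obj === "object" && "ct" in obj && "iv" in obj && "s" in obj) {
        return "encrypted-guess";
      }
    } catch (err) {
      // not valid JSON, fall through to "unknown"
    }
  }
  return "unknown";
}
```

Only the "session-token" form looks usable as an HSol value; the other form could be logged and the solve retried.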

I tracked the events of a legitimate registration. Here's the normal process: (screenshots 1 2 3 4 5 6)

Hope this helps someone better than me solve this case.

KR3KZ avatar Jun 01 '21 15:06 KR3KZ

that's weird - the CDN url doesn't match between the servers (one is arkoselabs the other is funcaptcha)

the request is definitely here https://signup.live.com/API/CreateAccount?lcid=1033&wa=wsignin1.0&rpsnv=13&ct=1622563376&rver=7.0.6737.0&wp=MBI_SSL&wreply=https%3a%2f%2foutlook.live.com%2fowa%2f%3fnlp%3d1%26signup%3d1%26RpsCsrfState%3d5b4eccdc-6610-aa60-47ed-196e48c17b32&id=292841&CBCXT=out&lw=1&fl=dob%2cflname%2cwld&cobrandid=90015&lic=1&uaid=7c00af9193944251be4660bec8373c78

you can see the request POST adds in the HSol. I think you're possibly passing the wrong data to 2captcha. You need to parse out the surl and input that to 2captcha

let captcha_url = `http://2captcha.com/in.php?key=${captcha_key}&method=funcaptcha&json=1&publickey=${arkose_pk}&surl=${arkose_surl}&${arkose_return_url}&pageurl=https://signup.live.com/API/CreateAccount`;

a10kiloham avatar Jun 01 '21 16:06 a10kiloham

I pasted the first format from the 2captcha docs.

But I didn't have the &pageurl set to https://signup.live.com/API/CreateAccount

Still not working, though. I will try more things and get back to you x_x

KR3KZ avatar Jun 01 '21 16:06 KR3KZ

your surl and such seemed to not match the screen caps, so worth a look. the /ca calls are to arkose and you don't need to do anything w/ them b/c you're bypassing arkose. it's just that one page. it's possible they're adding a checksum or something in the cipher of that post, but unclear to me at a glance. good luck!

a10kiloham avatar Jun 01 '21 16:06 a10kiloham

@KR3KZ How did you solve your problem with Outlook? I have the exact same problem: I don't know how to submit the captcha when the page does not have a proper button for it.

SrMilton avatar Feb 24 '22 18:02 SrMilton

did you find out?

savaonepunch avatar May 01 '22 17:05 savaonepunch

The submit should still be via a form element and just triggered by the Javascript. Use chrome dev tools to see the flow of what's called

a10kiloham avatar May 01 '22 17:05 a10kiloham

Submit form

I found that iframe#enforcementFrame initializes the FunCaptcha and calls this JavaScript after the captcha is solved:

parent.postMessage(JSON.stringify({
    eventId: "challenge-complete",
    payload: {
        sessionToken: $token,
    }
}), "*")

That code triggers the form submission and the calls to the two APIs, "ReportClientEvent" and "CreateAccount".

In addition to the "challenge-complete" event, iframe#enforcementFrame also sends the "challenge-loaded", "challenge-shown" and "challenge-iframeSize" events. I don't know what they mean.
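
If someone wants to try replaying that message from Puppeteer, a sketch could look like this. It is untested against the live page and assumes the parent page accepts the message when it is posted from inside the iframe; buildChallengeCompleteMessage is a hypothetical helper name.

```javascript
// Hypothetical helper: build the same JSON string the enforcement frame posts.
function buildChallengeCompleteMessage(sessionToken) {
  return JSON.stringify({
    eventId: "challenge-complete",
    payload: { sessionToken },
  });
}

// Usage sketch with Puppeteer (assumes `page` and a solved `token` exist):
// const handle = await page.$("iframe#enforcementFrame");
// const frame = await handle.contentFrame();
// await frame.evaluate(
//   (msg) => window.parent.postMessage(msg, "*"),
//   buildChallengeCompleteMessage(token)
// );
```

Posting from inside the frame keeps the message source the same as in the real flow, which may matter if the parent page checks where the event came from.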

The difference between tokens

571627c849851fc77.6194423204|r=ap-southeast-1|metabgclr=%23ffffff|maintxtclr=%231B1B1B|mainbgclr=%23ffffff|guitextcolor=%23747474|metaiconclr=%23757575|meta_height=325|meta=7|lang=en|pk=B7D8911C-5CC8-A9A3-35B0-554ACEE604DA|at=40|ag=101|cdn_url=https%3A%2F%2Fclient-api.arkoselabs.com%2Fcdn%2Ffc|lurl=https%3A%2F%2Faudio-ap-southeast-1.arkoselabs.com|surl=https%3A%2F%2Fclient-api.arkoselabs.com
649627c813eab5402.9868448504|r=ap-southeast-1|metabgclr=%23ffffff|maintxtclr=%231B1B1B|mainbgclr=%23ffffff|guitextcolor=%23747474|metaiconclr=%23757575|meta_height=325|meta=7|pk=B7D8911C-5CC8-A9A3-35B0-554ACEE604DA|at=40|ht=1|atp=2|cdn_url=https%3A%2F%2Fclient-api.arkoselabs.com%2Fcdn%2Ffc|lurl=https%3A%2F%2Faudio-ap-southeast-1.arkoselabs.com|surl=https%3A%2F%2Fclient-api.arkoselabs.com

Also, when comparing the token of a valid request (line 1) with the token returned by 2captcha (line 2), I see a few differences:

  • The token of a valid request has the attributes lang=en and ag=101; the 2captcha token does not.
  • The 2captcha token has ht=1|atp=2; the valid token does not.
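
The comparison above can be repeated for other token pairs with a small script. This is a sketch that assumes the token is a session id followed by pipe-separated key=value pairs, as in the two samples; parseToken and diffTokenKeys are hypothetical names.

```javascript
// Hypothetical helper: split "<session>|k1=v1|k2=v2" into its parts.
function parseToken(token) {
  const [session, ...pairs] = String(token).split("|");
  const attrs = {};
  for (const pair of pairs) {
    const eq = pair.indexOf("=");
    if (eq === -1) continue; // skip fragments without a key=value shape
    attrs[pair.slice(0, eq)] = pair.slice(eq + 1);
  }
  return { session, attrs };
}

// Hypothetical helper: list attribute names present in only one of two tokens.
function diffTokenKeys(first, second) {
  const keysA = Object.keys(parseToken(first).attrs);
  const keysB = Object.keys(parseToken(second).attrs);
  return {
    onlyInFirst: keysA.filter((k) => !keysB.includes(k)),
    onlyInSecond: keysB.filter((k) => !keysA.includes(k)),
  };
}

// Shortened, made-up tokens that mirror the differences listed above.
console.log(
  diffTokenKeys(
    "111.222|r=x|lang=en|pk=P|at=40|ag=101|surl=s",
    "333.444|r=x|pk=P|at=40|ht=1|atp=2|surl=s"
  )
);
```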

Hope these help us to find a solution soon.

imhuytq avatar May 12 '22 04:05 imhuytq