data-callback attribute on recaptcha element causing an error

Open bclougherty opened this issue 4 years ago • 10 comments

I'm trying to enter a captcha on a page that has this attr on the recaptcha element:

data-callback="imNotARobot"

This is causing an error in RecaptchaContentScript.enterRecaptchaSolutions when it calls eval(client.callback).call(), because on some page loads the function isn't defined. There should probably be a check after the eval and before the call, to make sure it's calling a real function.
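A minimal sketch of the kind of check suggested above, assuming the callback is either a function or the name of a global function. The helper names (`resolveCallback`, `invokeCallback`) and the explicit `scope` parameter are hypothetical and not part of the plugin; in the page context the scope would presumably be `window`.

```typescript
// Hypothetical helpers, not part of puppeteer-extra-plugin-recaptcha.
type CaptchaCallback = (token: string) => void

// Resolve a data-callback value to a callable, or undefined if none exists.
// A string is looked up by name on the given scope instead of being eval'd.
function resolveCallback(
  callback: unknown,
  scope: Record<string, unknown>
): CaptchaCallback | undefined {
  if (typeof callback === 'function') {
    return callback as CaptchaCallback
  }
  if (typeof callback === 'string') {
    const candidate = scope[callback]
    if (typeof candidate === 'function') {
      return candidate as CaptchaCallback
    }
  }
  return undefined
}

// Call the callback only if it resolved to a real function.
// Returns true if the callback ran, false if it was missing or invalid.
function invokeCallback(
  callback: unknown,
  token: string,
  scope: Record<string, unknown>
): boolean {
  const fn = resolveCallback(callback, scope)
  if (!fn) {
    return false
  }
  fn.call(scope, token)
  return true
}
```

With this shape, a missing `imNotARobot` makes `invokeCallback` return `false` instead of throwing. One trade-off: a plain name lookup does not resolve dotted paths such as `ns.fn`, which `eval` would.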

Jan 02 '20 18:01 bclougherty

On further investigation, it seems to fail in a few different ways on different runs, but they all seem to be related to the RecaptchaContentScript.enterRecaptchaSolutions method. The code seems to be "swallowing" the error, so I don't even get anything useful back. The error that's thrown has a message that just says "Error: [object Object]".

Jan 02 '20 20:01 bclougherty

Interesting. But wouldn't that mean that solving the reCAPTCHA manually triggers the same issue?

I can definitely make that part (calling the defined callback function) more robust (and stringify the error object so it's readable).
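One way the error could be made readable is sketched below. It relies on the fact that `Error` instances have no enumerable own properties, so `JSON.stringify(new Error('x'))` yields `{}`. Whether that is exactly how the plugin loses the error details is an assumption; `serializeError` and `SerializedError` are hypothetical names, not plugin API.

```typescript
// Hypothetical plain-object form of an error that survives JSON serialization.
interface SerializedError {
  name: string
  message: string
  stack?: string | undefined
}

// Convert any thrown value into a plain object with a readable message.
function serializeError(error: unknown): SerializedError {
  if (error instanceof Error) {
    // Copy the non-enumerable fields explicitly so they are not dropped.
    return { name: error.name, message: error.message, stack: error.stack }
  }
  if (typeof error === 'string') {
    return { name: 'Error', message: error }
  }
  let message: string
  try {
    // JSON.stringify returns undefined for values such as undefined or functions.
    const json = JSON.stringify(error) as string | undefined
    message = json === undefined ? String(error) : json
  } catch {
    // Circular structures and similar cases fall back to String().
    message = String(error)
  }
  return { name: 'Error', message }
}
```

In the catch block quoted later in this thread, this would correspond to storing something like `serializeError(error)` rather than the raw `error` object.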

Jan 05 '20 20:01 berstend

It's possible that that's not what's actually happening - I spent a long time digging into this, but because of the way that code is run, it's hard to get much information out of it about exactly what's going on. It's possible that some JavaScript isn't loading correctly when I interact with the site through puppeteer, which might be the issue.

Jan 06 '20 14:01 bclougherty

Did you try running the code with debug output (explained in the readme)? Would be great if you can put a quick test case on plnkr.co or similar, then I'm better able to look into it as well (otherwise it's pretty time-consuming).

Jan 06 '20 15:01 berstend

I was having some issues getting debug to work correctly - I'll try again tomorrow.

Try this: http://run.plnkr.co/plunks/f9mTPb87V85EeLJ1dlzD/

Jan 06 '20 23:01 bclougherty

Thanks for providing the demo code :)

Regular reCAPTCHA is checking that those callbacks are correct:

      if (H = H.value,
      DI(window[H]))
        window[H]();
      else
        DI(H) ? H() : H && console.log("reCAPTCHA couldn't find user-provided function: " + H)

The plugin is currently also defensive, wrapping that part in a try/catch block:

          // Enter solution in optional callback
          if (client.callback) {
            try {
              if (typeof client.callback === 'function') {
                client.callback.call(window, solution.text)
              } else {
                eval(client.callback).call(window, solution.text) // tslint:disable-line
              }
              solved.responseCallback = true
            } catch (error) {
              solved.error = error
            }
          }

https://github.com/berstend/puppeteer-extra/blob/master/packages/puppeteer-extra-plugin-recaptcha/src/content.ts#L259-L268

So just looking at the code, I wouldn't expect a broken callback function to cause issues (we've already placed the solution in the hidden input field by that point). 🤔 We can still improve this, though, and not return an error if the callback function is invalid or cannot be found.

What is the actual issue you're seeing? Is the captcha not being solved? :)

Jan 07 '20 09:01 berstend

In terms of debug output, you'd want to set the DEBUG environment variable before starting your code, e.g.:

DEBUG=puppeteer-extra,puppeteer-extra-plugin:* node myscript.js

https://www.npmjs.com/package/debug

Jan 07 '20 09:01 berstend

The issue is that it's failing to fill the captcha solution - I get a solution back from the solver, but the plugin is throwing an error inside enterRecaptchaSolutions, which comes back to my code as "Error: [object Object]". I think the important thing is just to ensure that a usable error comes back to the client.

Here's my understanding of what's happening.

It looked like this call:

      this._generateContentScript('enterRecaptchaSolutions', {
        solutions
      })

was returning an array containing a single object with the captcha solution, but that object had an error property that contained an empty object. Because an empty object is truthy in JavaScript, the find call in this line:

response.error = response.error || response.solved.find(s => !!s.error)

assigns that entire solution object to response.error, which is then converted to the string "[object Object]" when it is thrown.
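The behaviour described above can be reproduced in isolation. The sketch below uses hypothetical types (`SolvedEntry`, `EnterResponse`, with an assumed `id` field) rather than the plugin's real ones, and `firstErrorMessage` is only one possible way to produce a readable message instead of the whole solution object.

```typescript
// Hypothetical shapes, loosely modelled on the response described above.
interface SolvedEntry {
  id: string
  error?: unknown
}
interface EnterResponse {
  solved: SolvedEntry[]
  error?: unknown
}

// Mirrors the quoted line: find returns the whole entry, not its error.
function currentBehaviour(response: EnterResponse): unknown {
  return response.error || response.solved.find(s => !!s.error)
}

// Turn an arbitrary error value into a string without losing it to
// "[object Object]".
function describe(error: unknown): string {
  if (error instanceof Error) return error.message
  if (typeof error === 'string') return error
  try {
    const json = JSON.stringify(error) as string | undefined
    return json === undefined ? String(error) : json
  } catch {
    return String(error)
  }
}

// Alternative sketch: report the first error as a readable message.
function firstErrorMessage(response: EnterResponse): string | undefined {
  if (response.error) return describe(response.error)
  const failed = response.solved.find(
    s => s.error !== undefined && s.error !== null
  )
  return failed ? `solution ${failed.id}: ${describe(failed.error)}` : undefined
}
```

For a response whose single solved entry has `error: {}`, `currentBehaviour` returns that entry itself, while `firstErrorMessage` returns a string that at least names the failing solution, even though the original error details are already gone at that point.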

Jan 07 '20 15:01 bclougherty

@berstend Hey, I just wanted to follow up on this, it's been a couple of weeks since I've heard anything - have you had any time to investigate this issue?

Jan 24 '20 14:01 bclougherty

@berstend Can I get an update on this ticket?

Mar 23 '20 14:03 bclougherty

Hi @berstend is there any update on this? It is still happening.

Nov 28 '22 20:11 AnthonyLzq