python-anticaptcha
Passing Token
I am trying to use the library to solve the captchas that Google Scholar serves when I fetch the papers citing a given source. A typical URL looks like
https://scholar.google.com/scholar?cites=12685256029779217548&as_sdt=2005&sciodt=0,5&hl=en
which, when fetched with Python, sometimes produces a captcha. The HTML of the captcha page contains the following tags, which seem relevant for using the anticaptcha library:
<script>
function gs_captcha_cb(){grecaptcha.render("gs_captcha_c",{"sitekey":"6LfFDwUTAAAAAIyC8IeC3aGLqVpvrB6ZpkfmAibj","callback":function(){document.getElementById("gs_captcha_f").submit()}});};
</script>
<form method="get" id="gs_captcha_f">
<h1>Please show you're not a robot</h1>
<div id="gs_captcha_c"></div>
<script src="//www.google.com/recaptcha/api.js?onload=gs_captcha_cb&render=explicit&hl=en" async defer></script>
<input type=hidden name="hl" value="en">
<input type=hidden name="as_sdt" value="0,5">
<input type=hidden name="sciodt" value="0,5">
<input type=hidden name="cites" value="12685256029779217548">
<input type=hidden name="scipsc" value="">
</form>
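For reference, the two pieces of this page that a solver needs (the sitekey passed to grecaptcha.render and the hidden form fields) can be pulled out of the HTML with the standard library alone. The following is only a sketch: the names parse_captcha_page and HiddenInputParser are made up for illustration, and the regular expression assumes the sitekey appears as a JSON-style "sitekey":"..." pair, as it does in the snippet above.

```python
import re
from html.parser import HTMLParser

# Assumes the sitekey appears as "sitekey":"..." inside an inline script.
SITEKEY_RE = re.compile(r'"sitekey"\s*:\s*"([^"]+)"')


class HiddenInputParser(HTMLParser):
    """Collects name/value pairs of all <input type=hidden> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and attrs.get("type") == "hidden" and "name" in attrs:
            # An empty or missing value attribute is stored as "".
            self.fields[attrs["name"]] = attrs.get("value") or ""


def parse_captcha_page(html):
    """Return (sitekey or None, dict of hidden form fields)."""
    match = SITEKEY_RE.search(html)
    sitekey = match.group(1) if match else None
    parser = HiddenInputParser()
    parser.feed(html)
    return sitekey, parser.fields
```

With Selenium, the input would be driver.page_source. Reading the sitekey from the page that was actually served avoids hard-coding a key that may differ between captcha variants.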
I had a look at recaptcha_selenium.py. However, the above HTML code does not contain the function onSuccess()
and my attempts to construct another function call such as
driver.execute_script("document.getElementById('gs_captcha_f').submit({})';".format(token))
did not yield anything.
Is there a way to deal with the situation above using the anticaptcha library?
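One thing that might be worth trying, since the form uses method="get" and the callback only calls submit(): the reCAPTCHA widget normally adds a field named g-recaptcha-response to the form it is rendered in, so the submission is an ordinary GET request carrying the hidden fields plus that token. The sketch below rebuilds such a URL. It rests on two assumptions that are not confirmed anywhere in this thread: that the token travels under the name g-recaptcha-response, and that Google Scholar accepts a token submitted this way. The function name build_captcha_submit_url is made up for illustration.

```python
from urllib.parse import urlencode


def build_captcha_submit_url(page_url, hidden_fields, token):
    """Rebuild the GET request that submitting the captcha form would send.

    Assumption: the solved token is sent as 'g-recaptcha-response'.
    """
    # A GET form without an action submits to the page URL, with the
    # existing query string replaced by the form fields.
    base = page_url.split("?", 1)[0]
    params = dict(hidden_fields)
    params["g-recaptcha-response"] = token
    return base + "?" + urlencode(params)
```

The resulting URL could then be opened with driver.get(...) in the same browser session that received the captcha page.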
I cannot reproduce the captcha challenge. Could you check the result when you adapt the callback sniffer (see https://github.com/ad-m/python-anticaptcha/blob/master/examples/recaptcha_selenium_callback.py )?
> I cannot reproduce the captcha challenge.
The problem is that the captcha only appears after several requests of the above type. Hence, it is hard to reproduce.
> Could you check the result when you adapt the callback sniffer (see https://github.com/ad-m/python-anticaptcha/blob/master/examples/recaptcha_selenium_callback.py )?
I am not sure how to adapt the example. If I interpret the code correctly, you are passing the token twice. The first time by setting the content of g-recaptcha-response
in
driver.execute_script("document.getElementById('g-recaptcha-response').innerHTML='{}';".format(token))
and the second time by calling
driver.execute_script("grecaptcha.recaptchaCallback[0]('{}')".format(token))
The problem is that the page that I am getting has no element g-recaptcha-response
and when I execute the second line I get the error
selenium.common.exceptions.JavascriptException: Message: javascript error: Cannot read property '0' of undefined
I guess the object grecaptcha is named differently in my case?
If I just execute the first command (setting the response) and then submit the form by calling
driver.execute_script("document.getElementById('gs_captcha_f').submit()';")
I get the error
selenium.common.exceptions.JavascriptException: Message: javascript error: Invalid or unexpected token
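"Invalid or unexpected token" is a JavaScript syntax error, and the script string above does contain a stray ' right before the final semicolon, which leaves an unterminated string literal. Quoting mistakes like this are easy to make when a token is pasted into a script with str.format. A minimal sketch of one way to avoid them is to let json.dumps produce the JavaScript string literals; the helper name js_call is made up for illustration.

```python
import json


def js_call(function_name, *args):
    """Return a JavaScript call statement with JSON-encoded arguments.

    json.dumps emits a double-quoted, escaped literal, so quotes or
    backslashes inside a token cannot break the script.
    """
    encoded = ", ".join(json.dumps(arg) for arg in args)
    return "{}({});".format(function_name, encoded)


# The form submission without the stray quote.
SUBMIT_FORM_JS = "document.getElementById('gs_captcha_f').submit();"
```

Either string can be passed to driver.execute_script(...). Selenium's execute_script also accepts extra positional arguments, which the script can read as arguments[0], arguments[1], and so on, so the token does not have to be embedded in the script text at all.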
I tried to get a reproducible captcha and came up with the following request
https://scholar.google.com/scholar?cites=12685256029779217548&as_sdt=2005&sciodt=0,5&hl=en&num=20
The argument num=20 produces a captcha on every call, embedded in a page with slightly different code than the captchas I was facing before. However, if I could solve this one, it might be a start.
I tried adapting the code from recaptcha_selenium_callback.py and ended up with the following code
from selenium.webdriver import Chrome
from selenium.webdriver.chrome.options import Options
from python_anticaptcha import AnticaptchaClient, NoCaptchaTaskProxylessTask

request = 'https://scholar.google.com/scholar?cites=12685256029779217548&as_sdt=2005&sciodt=0,5&hl=en&num=20'

# Open the page that shows the captcha.
options = Options()
driver = Chrome(options=options)
driver.get(request)

api_key = '...'
site_key = '6LfwuyUTAAAAAOAmoS0fdqijC2PbbdH4kjq62Y1b'

# Ask the anti-captcha service for a reCAPTCHA token.
client = AnticaptchaClient(api_key)
task = NoCaptchaTaskProxylessTask(request, site_key)
job = client.createTask(task)
job.join()
token = job.get_solution_response()

# Write the token into the response field, then call the page's callback.
driver.execute_script(
    "document.getElementById('g-recaptcha-response').innerHTML='{}';".format(token)
)
driver.execute_script("submitCallback('{}')".format(token))
result = driver.page_source
The code runs without any errors. However, the browser window does not change, and the variable result still contains the captcha page.
Where did I go wrong?
@davidwozabal, could you provide code that reproduces the captcha challenge? I do not receive the challenge at the address provided. With such code I will be able to analyze the problem more effectively.
The link
https://scholar.google.com/scholar?cites=12685256029779217548&as_sdt=2005&sciodt=0,5&hl=en&num=20
above produces a captcha challenge for me (even when I open it in a regular browser on different computers).
Please help me with this reCAPTCHA issue:
https://stackoverflow.com/questions/68877761/recaptcha-wasnt-solving-by-anticaptcha-plugin-in-selenium-python
Found the solution to the problem; see #92.