cloudflare-scrape
I can't break it. Help me PLZ
Before creating an issue, first upgrade cfscrape with pip install -U cfscrape and see if you're still experiencing the problem. Please also confirm your Node version (node --version or nodejs --version) is version 10 or higher.
Make sure the website you're having issues with is actually using anti-bot protection by Cloudflare and not a competitor like Imperva Incapsula or Sucuri. And if you're using an anonymizing proxy, a VPN, or Tor, Cloudflare often flags those IPs and may block you or present you with a captcha as a result.
Please confirm the following statements and check the boxes before creating an issue:
- [yes] I've upgraded cfscrape with pip install -U cfscrape
- [yes] I'm using Node version 10 or higher
- [yes] The site protection I'm having issues with is from Cloudflare
- [yes] I'm not using Tor, a VPN, or an anonymizing proxy
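One quick way to sanity-check the third box is to look at the response headers. The following is only a rough heuristic sketch, not cfscrape's own detection logic: `looks_like_cloudflare` is a hypothetical helper, and it assumes that a `Server` header starting with `cloudflare` or the presence of a `CF-RAY` header means Cloudflare sits in front of the site. It does not prove that the challenge page itself is Cloudflare's.

```python
def looks_like_cloudflare(headers):
    """Return True if the response headers suggest Cloudflare is in front of the site.

    Hypothetical helper for illustration; a heuristic, not cfscrape's check.
    """
    # HTTP header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    server = lowered.get("server", "").lower()
    return server.startswith("cloudflare") or "cf-ray" in lowered
```

With requests, this could be called as `looks_like_cloudflare(resp.headers)` on any response object.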
Python version number
Run python --version and paste the output below:
Python 3.7.4
cfscrape version number
Run pip show cfscrape and paste the output below:
Name: cfscrape
Version: 2.1.1
Summary: A simple Python module to bypass Cloudflare's anti-bot page. See https://github.com/Anorov/cloudflare-scrape for more information.
Home-page: https://github.com/Anorov/cloudflare-scrape
Author: Anorov
Author-email: [email protected]
License: UNKNOWN
Location: /Users/seven/opt/anaconda3/lib/python3.7/site-packages
Requires: requests
Required-by:
Code snippet involved with the issue
import cfscrape
scraper = cfscrape.create_scraper()
scraper = cfscrape.create_scraper(delay=10)
web_data = scraper.get("https://sci-hub.tf/10.1109/MCSE.2007.58").content
print(web_data)
Complete exception and traceback
(If the problem doesn't involve an exception being raised, leave this blank)
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
~/opt/anaconda3/lib/python3.7/site-packages/cfscrape/__init__.py in solve_challenge(self, body, domain)
254 r"(?:[^{<>]*},\s*(\d{4,}))?",
--> 255 javascript, flags=re.S
256 ).groups()
AttributeError: 'NoneType' object has no attribute 'groups'
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
<ipython-input-4-0e4d26f3df49> in <module>
3 scraper = cfscrape.create_scraper()
4 scraper = cfscrape.create_scraper(delay=10)
----> 5 web_data = scraper.get("https://sci-hub.tf/10.1109/MCSE.2007.58").content
6 print(web_data)
~/opt/anaconda3/lib/python3.7/site-packages/requests/sessions.py in get(self, url, **kwargs)
544
545 kwargs.setdefault('allow_redirects', True)
--> 546 return self.request('GET', url, **kwargs)
547
548 def options(self, url, **kwargs):
~/opt/anaconda3/lib/python3.7/site-packages/cfscrape/__init__.py in request(self, method, url, *args, **kwargs)
127 # Check if Cloudflare anti-bot "I'm Under Attack Mode" is enabled
128 if self.is_cloudflare_iuam_challenge(resp):
--> 129 resp = self.solve_cf_challenge(resp, **kwargs)
130
131 return resp
~/opt/anaconda3/lib/python3.7/site-packages/cfscrape/__init__.py in solve_cf_challenge(self, resp, **original_kwargs)
202
203 # Solve the Javascript challenge
--> 204 answer, delay = self.solve_challenge(body, domain)
205 if method == 'POST':
206 cloudflare_kwargs["data"]["jschl_answer"] = answer
~/opt/anaconda3/lib/python3.7/site-packages/cfscrape/__init__.py in solve_challenge(self, body, domain)
290 raise ValueError(
291 "Unable to identify Cloudflare IUAM Javascript on website. %s"
--> 292 % BUG_REPORT
293 )
294
ValueError: Unable to identify Cloudflare IUAM Javascript on website. Cloudflare may have changed their technique, or there may be a bug in the script.
Please read https://github.com/Anorov/cloudflare-scrape#updates, then file a bug report at https://github.com/Anorov/cloudflare-scrape/issues."
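For readers wondering why two exceptions appear: the first one is the usual symptom of a regex that no longer matches the page. `re.search` returns `None`, calling `.groups()` on it raises `AttributeError`, and the library then reports that as a `ValueError`. A minimal sketch of that failure mode follows. The pattern, the `extract_challenge` helper, and both page bodies are made up for illustration and are not cfscrape's actual regex or Cloudflare's actual markup.

```python
import re

# Hypothetical, simplified stand-in for a challenge regex (not cfscrape's real pattern).
CHALLENGE_RE = re.compile(r"setTimeout\(function\(\)\{(.*?)\},\s*(\d{4,})\)", re.S)


def extract_challenge(body):
    """Return (script, delay) from the page body, or raise ValueError if absent."""
    match = CHALLENGE_RE.search(body)
    if match is None:
        # Checking for None explicitly avoids the AttributeError seen in the traceback.
        raise ValueError("Unable to identify challenge JavaScript in page body")
    return match.groups()


# Made-up page bodies: one in the format the regex expects, one in a changed format.
old_page = "setTimeout(function(){var a = 1;}, 4000)"
new_page = "<html>a different challenge format</html>"

print(extract_challenge(old_page))

try:
    # Same mistake as in the traceback: .groups() on a failed search.
    CHALLENGE_RE.search(new_page).groups()
except AttributeError as exc:
    print("AttributeError:", exc)
```

If the page format changes, no delay setting on the client side will make such a pattern match again; the pattern itself has to be updated.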
URL of the Cloudflare-protected page
[https://sci-hub.tf/10.1109/MCSE.2007.58]
URL of Pastebin/Gist with HTML source of protected page
[https://sci-hub.tf/10.1109/MCSE.2007.58]
This project is abandoned and no longer functional; see #406.