cloudflare-scrape
ValueError: Unable to parse Cloudflare anti-bot IUAM page: list index out of range Cloudflare may have changed their technique, or there may be a bug in the script.
Before creating an issue, first upgrade cfscrape with pip install -U cfscrape and see if you're still experiencing the problem. Please also confirm your Node version (node --version or nodejs --version) is version 10 or higher.
Make sure the website you're having issues with is actually using anti-bot protection by Cloudflare and not a competitor like Imperva Incapsula or Sucuri. And if you're using an anonymizing proxy, a VPN, or Tor, Cloudflare often flags those IPs and may block you or present you with a captcha as a result.
Please confirm the following statements and check the boxes before creating an issue:
- [ ] I've upgraded cfscrape with pip install -U cfscrape
- [ ] I'm using Node version 10 or higher
- [ ] The site protection I'm having issues with is from Cloudflare
- [ ] I'm not using Tor, a VPN, or an anonymizing proxy
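The Python and Node parts of this checklist can be gathered in one go. The following is only a minimal sketch: the helper names (node_version, major_version, node_is_supported) are made up for illustration and are not part of cfscrape, and it assumes Node prints its version in the usual "v10.16.0" form.

```python
import shutil
import subprocess
import sys


def node_version():
    """Return the output of `node --version` (or `nodejs --version`), or None."""
    for exe in ("node", "nodejs"):
        path = shutil.which(exe)
        if path is None:
            continue
        result = subprocess.run(
            [path, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
        )
        if result.returncode == 0:
            return result.stdout.strip()
    return None


def major_version(version):
    """Extract the major number from a string such as 'v10.16.0'."""
    return int(version.strip().lstrip("v").split(".")[0])


def node_is_supported(version, minimum=10):
    """True if a Node version string was found and its major number is high enough."""
    return version is not None and major_version(version) >= minimum


# Print the values the issue template asks for.
print("Python:", sys.version.split()[0])
found = node_version()
print("Node:", found or "not found")
print("Node >= 10:", node_is_supported(found))
```

The cfscrape version itself still comes from pip show cfscrape, as requested in the template.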
Python version number
Run python --version and paste the output below:
Python 3.7.0
cfscrape version number
Run pip show cfscrape and paste the output below:
Name: cfscrape
Version: 2.1.1
Summary: A simple Python module to bypass Cloudflare's anti-bot page. See https://github.com/Anorov/cloudflare-scrape for more information.
Home-page: https://github.com/Anorov/cloudflare-scrape
Author: Anorov
Author-email: [email protected]
License: UNKNOWN
Location: c:\python3\venv3\lib\site-packages
Requires: requests
Required-by:
Code snippet involved with the issue
import cfscrape

proxies = {
    "http": "http://127.0.0.1:10809",
    "https": "https://127.0.0.1:10809",
}
scraper = cfscrape.create_scraper()
web_data = scraper.get("http://www.javlibrary.com/cn/vl_genre.php?list&g=ki&mode=2&page=4", proxies=proxies).content
print(web_data)
Complete exception and traceback
(If the problem doesn't involve an exception being raised, leave this blank)
C:\python3\venv3\Scripts\python.exe C:/python3/venv3/javlibrary-spider-master/cfs.py
Traceback (most recent call last):
  File "C:\python3\venv3\lib\site-packages\cfscrape\__init__.py", line 174, in solve_cf_challenge
    cloudflare_kwargs["params"].update({param.split('=')[0]:param.split('=')[1]})
IndexError: list index out of range
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/python3/venv3/javlibrary-spider-master/cfs.py", line 12, in
Please read https://github.com/Anorov/cloudflare-scrape#updates, then file a bug report at https://github.com/Anorov/cloudflare-scrape/issues."
Process finished with exit code 1
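A possible explanation for the IndexError, offered as a guess rather than a confirmed diagnosis: the requested URL contains a bare query flag (list) with no = sign, and the line in the traceback indexes [1] after splitting each parameter on =. The sketch below reproduces that failure outside cfscrape and shows that the standard library's parse_qsl tolerates such flags. The function names naive_params and tolerant_params are hypothetical, and it is an assumption that the parameters cfscrape splits come from this query string.

```python
from urllib.parse import parse_qsl, urlsplit

url = "http://www.javlibrary.com/cn/vl_genre.php?list&g=ki&mode=2&page=4"
query = urlsplit(url).query  # "list&g=ki&mode=2&page=4"


def naive_params(query):
    """Mimic the split-based parsing shown in the traceback."""
    params = {}
    for param in query.split("&"):
        # "list".split("=") yields ["list"], so index 1 does not exist.
        params.update({param.split("=")[0]: param.split("=")[1]})
    return params


def tolerant_params(query):
    """Parse the query with the standard library, keeping value-less flags."""
    return dict(parse_qsl(query, keep_blank_values=True))


try:
    naive_params(query)
except IndexError as exc:
    print("naive parsing failed:", exc)

print(tolerant_params(query))
```

If this is indeed the cause, URLs whose query parameters all have the key=value form should not trigger this particular error.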
URL of the Cloudflare-protected page
[LINK GOES HERE] http://www.javlibrary.com/cn/vl_genre.php?list&g=ki&mode=2&page=4
URL of Pastebin/Gist with HTML source of protected page
[LINK GOES HERE]
I'm having the same problem.
I'm also trying to scrape an anime site protected by Cloudflare.
@Anorov Please help!
I have the same issue. On a Windows machine the code runs without problems, but on Linux (Ubuntu 20.10, Python 3.8) the error appears. It is most likely an issue with the web requests: I'm getting an HTTP 503 error from Cloudflare. Any idea what it could be? Perhaps the script is not waiting long enough and gets stuck on the Cloudflare landing page?
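One way to check whether a 503 is still the Cloudflare challenge page, rather than some other server error, is to inspect the response that came back. This is only a heuristic sketch: the function name looks_like_iuam_challenge is hypothetical, and the marker strings jschl_vc and jschl_answer are an assumption based on older "I'm Under Attack Mode" pages, so they may not match what Cloudflare serves today.

```python
def looks_like_iuam_challenge(status_code, headers, body):
    """Heuristically decide whether a response is a Cloudflare challenge page.

    status_code: HTTP status as an int
    headers: mapping of response header names to values
    body: response text
    """
    # Header names are case-insensitive, so normalise them before the lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    server = lowered.get("server", "")
    return (
        status_code in (503, 429)
        and server.lower().startswith("cloudflare")
        # Assumed markers of the old JavaScript challenge form.
        and "jschl_vc" in body
        and "jschl_answer" in body
    )


# Hand-written sample data, not captured from a real site.
sample_body = '<input name="jschl_vc" value="x"/><input name="jschl_answer"/>'
print(looks_like_iuam_challenge(503, {"Server": "cloudflare"}, sample_body))
print(looks_like_iuam_challenge(503, {"Server": "nginx"}, "Service Unavailable"))
```

With a requests-style response object, the three arguments would be response.status_code, response.headers, and response.text.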