Enhancement: Support for Modern Cloudflare Challenge Format
Summary
My API library for prosportstransactions.com stopped working because of Cloudflare protection. I integrated cloudscraper v3.0.0 to work around it, but the attempt was unsuccessful. This led me to research modern Cloudflare challenges in depth and to develop V3 handler improvements that could benefit the cloudscraper project.
Key Finding: My enhanced V3 handler successfully detects and parses new window._cf_chl_opt challenge structures, but prosportstransactions.com (and similar advanced sites) remain inaccessible due to transport-layer TLS fingerprint detection.
While my work did not solve the core TLS fingerprinting problem that blocks advanced Cloudflare protection sites like prosportstransactions.com, I identified significant improvements to challenge detection and handling that would enhance cloudscraper's compatibility with modern challenge formats.
Research Context
Target Site: prosportstransactions.com (advanced Cloudflare protection)
Research Branch: https://github.com/rsforbes/pro_sports_transactions/tree/feature/cloudflare-bypass-research
Cloudscraper Version: v3.0.0 (from tag, not PyPI)
Test Methodology: 12 systematic configurations including:
- Standard v3.0.0 base implementation
- PR #295 testing (session management improvements)
- PR #283 testing (additional headers and browser fingerprinting)
- Custom patched V3 handler for modern challenge detection
Installation Used:
cloudscraper = {git = "https://github.com/VeNoMouS/cloudscraper.git", rev = "refs/pull/295/head"}
Key Findings
New Challenge Format Discovered
Modern Cloudflare challenges now use a different JavaScript structure that current cloudscraper doesn't detect:
Traditional format (currently detected):
window._cf_chl_ctx = {...};
Modern format (not detected):
window._cf_chl_opt = {
cvId: '3',
cZone: 'prosportstransactions.com',
cType: 'managed',
cRay: '956ddeea7c0960a1',
cH: 'R7tFH...',
cUPMDTk: 'R7tFH...',
cFPWv: 'b',
cITimeS: '1751119949',
fa: '/cdn-cgi/challenge-platform/h/b/g/orchestrate/managed/v1?...',
md: 'MhKFh...',
mdrd: 'cOM9...'
};
Updated URL Patterns
Traditional: /cdn-cgi/challenge-platform/h/b/
Modern: /cdn-cgi/challenge-platform/h/b/jsd/r/{complex_identifier}/{ray_id}
Where {complex_identifier} is extracted from:
__CF$cv$params = {
r: '0.01313896590161113:1751120168:uuGQcGrMYKAbiU7S-5nWc8aWLxMzIT5mqxDn71u5s1Q',
t: 'MTc1MTEyMDkwNy4wMDAwMDA='
};
Payload Format Changes
Traditional: Form data (application/x-www-form-urlencoded)
Modern: JSON payload (text/plain;charset=UTF-8)
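The difference between the two body encodings can be illustrated with the standard library alone. This is a minimal sketch; the field names and values are placeholders taken from the sample object above, not a confirmed description of what Cloudflare expects in the request body:

```python
import json
from urllib.parse import urlencode

# Illustrative fields only; the real payload contents are an open question.
fields = {"cRay": "956ddeea7c0960a1", "cType": "managed"}

# Traditional: key=value pairs joined with "&"
form_headers = {"Content-Type": "application/x-www-form-urlencoded"}
form_body = urlencode(fields)

# Modern: JSON text, but labelled as text/plain rather than application/json
json_headers = {"Content-Type": "text/plain;charset=UTF-8"}
json_body = json.dumps(fields)

print(form_body)
print(json_body)
```

A handler that supports both formats would need to pick the encoder and the Content-Type together, since the two cannot be mixed.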
Proposed Enhancements
1. Enhanced Challenge Detection
Update is_V3_Challenge() to detect modern format:
@staticmethod
def is_V3_Challenge(resp):
    try:
        return (
            resp.headers.get("Server", "").startswith("cloudflare")
            and resp.status_code in [403, 429, 503]
            and (
                # Existing patterns...
                re.search(r"""cpo\.src\s*=\s*['\"]/cdn-cgi/challenge-platform/\S+orchestrate/jsch/v3""", resp.text, re.M | re.S)
                or re.search(r"window\._cf_chl_ctx\s*=", resp.text, re.M | re.S)
                or re.search(r'<form[^>]*id="challenge-form"[^>]*action="[^"]*__cf_chl_rt_tk=', resp.text, re.M | re.S)
                or
                # NEW: Modern challenge format detection
                (
                    "Just a moment" in resp.text
                    and "/challenge-platform/" in resp.text
                    and re.search(r"window\._cf_chl_opt\s*=", resp.text)
                    and resp.headers.get("cf-mitigated") == "challenge"
                )
            )
        )
    except AttributeError:
        pass
    return False
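The new branch can be exercised without network access. The following is a minimal sketch, assuming a stand-in response object built with `types.SimpleNamespace`; the helper name `looks_like_modern_challenge` and the trimmed HTML are illustrative and not part of cloudscraper:

```python
import re
from types import SimpleNamespace


def looks_like_modern_challenge(resp):
    """Hypothetical helper mirroring only the new `_cf_chl_opt` branch."""
    return bool(
        resp.headers.get("Server", "").startswith("cloudflare")
        and resp.status_code in (403, 429, 503)
        and "Just a moment" in resp.text
        and "/challenge-platform/" in resp.text
        and re.search(r"window\._cf_chl_opt\s*=", resp.text)
        and resp.headers.get("cf-mitigated") == "challenge"
    )


# Stand-ins for requests.Response objects; the HTML is a made-up, trimmed sample.
fake_challenge = SimpleNamespace(
    status_code=403,
    headers={"Server": "cloudflare", "cf-mitigated": "challenge"},
    text=(
        "<title>Just a moment...</title>"
        "<script>window._cf_chl_opt = {cvId: '3', "
        "fa: '/cdn-cgi/challenge-platform/h/b/g/orchestrate/managed/v1'};</script>"
    ),
)
fake_ok = SimpleNamespace(
    status_code=200,
    headers={"Server": "cloudflare"},
    text="<html>ok</html>",
)

print(looks_like_modern_challenge(fake_challenge))
print(looks_like_modern_challenge(fake_ok))
```

Note that a real `requests` response uses a case-insensitive header mapping, while the plain dicts here require exact key casing.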
2. JavaScript Object Parser
Many modern challenges use JavaScript object notation instead of JSON:
def parse_js_object_manually(self, js_obj_str):
    """Manually parse JavaScript object when JSON parsing fails"""
    try:
        data = {}
        keys = ["cvId", "cZone", "cType", "cRay", "cH", "cUPMDTk", "cFPWv", "cITimeS"]
        for key in keys:
            # Accept either quote style: the sample above uses single quotes
            # for every field, so hard-coding double quotes would miss some.
            match = re.search(rf"""\b{key}:\s*(['"])(.*?)\1""", js_obj_str)
            if match:
                data[key] = match.group(2)
        return data
    except Exception:
        return {}
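A more general variant avoids listing field names at all. This is a sketch under stated assumptions: the helper name `parse_chl_opt` is hypothetical, only quoted string values are handled, and a value containing `}` followed by `;` would truncate the match:

```python
import re

# Matches `key: 'value'` or `key: "value"`; the backreference \2 requires
# the closing quote to be the same character as the opening one.
_PAIR_RE = re.compile(r"""(\w+)\s*:\s*(['"])(.*?)\2""", re.DOTALL)


def parse_chl_opt(html):
    """Hypothetical helper: extract string fields from window._cf_chl_opt."""
    block = re.search(r"window\._cf_chl_opt\s*=\s*\{(.*?)\}\s*;", html, re.DOTALL)
    if not block:
        return {}
    return {key: value for key, _quote, value in _PAIR_RE.findall(block.group(1))}


# Trimmed sample with mixed quote styles, based on the structure shown earlier.
sample = """<script>window._cf_chl_opt = {
  cvId: '3',
  cZone: "prosportstransactions.com",
  cType: 'managed',
  cRay: '956ddeea7c0960a1'
};</script>"""

opt_data = parse_chl_opt(sample)
print(opt_data)
```

Because every key is captured, new fields added by Cloudflare would show up in the result without a code change, at the cost of less strict validation.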
3. Complex URL Construction
Support for modern challenge URL patterns:
# Extract complex identifier from __CF$cv$params
cf_params_match = re.search(
r'__CF\$cv\$params\s*=\s*\{.*?r:\s*[\'"]([^\'"]+)[\'"]',
resp.text,
re.DOTALL,
)
if cf_params_match and "cRay" in opt_data:
    r_param = cf_params_match.group(1)
    ray_id = opt_data["cRay"]
    form_action = f"/cdn-cgi/challenge-platform/h/b/jsd/r/{r_param}/{ray_id}"
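The same construction can be wrapped in a standalone function for testing. A minimal sketch, assuming the `r` value and Ray ID are combined exactly as in the pattern above; `build_modern_challenge_url` is a hypothetical name, and the sample HTML reuses the values shown earlier:

```python
import re
from urllib.parse import urljoin

# Captures the quoted value of the `r` key inside __CF$cv$params.
_R_PARAM_RE = re.compile(
    r"""__CF\$cv\$params\s*=\s*\{.*?\br\s*:\s*['"]([^'"]+)['"]""",
    re.DOTALL,
)


def build_modern_challenge_url(page_url, html, ray_id):
    """Hypothetical helper: return the absolute challenge URL, or None."""
    match = _R_PARAM_RE.search(html)
    if not match:
        return None
    path = f"/cdn-cgi/challenge-platform/h/b/jsd/r/{match.group(1)}/{ray_id}"
    # Resolve the absolute path against the page that served the challenge.
    return urljoin(page_url, path)


sample_html = """<script>__CF$cv$params = {
  r: '0.01313896590161113:1751120168:uuGQcGrMYKAbiU7S-5nWc8aWLxMzIT5mqxDn71u5s1Q',
  t: 'MTc1MTEyMDkwNy4wMDAwMDA='
};</script>"""

challenge_url = build_modern_challenge_url(
    "https://www.prosportstransactions.com/basketball/Search/SearchResults.php",
    sample_html,
    "956ddeea7c0960a1",
)
print(challenge_url)
```

Returning `None` when `__CF$cv$params` is missing lets the caller fall back to the traditional URL pattern.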
4. JSON Payload Support
Handle modern JSON payloads instead of form data:
# For modern challenges, send JSON payload
if challenge_data.get("is_modern", False):
    payload_data = {
        "chctx": opt_data,
        "answer": challenge_answer,
    }
    # Add specific fields from opt_data
    for key in ["cvId", "cRay", "cType", "cZone", "cUPMDTk", "cFPWv", "cITimeS"]:
        if key in opt_data:
            payload_data[key] = opt_data[key]
    # The JSON body is sent with a text/plain content type
    headers.update({
        "Content-Type": "text/plain;charset=UTF-8",
        "Accept": "*/*",
        "Sec-Fetch-Dest": "empty",
        "Sec-Fetch-Mode": "cors",
        "Sec-Fetch-Site": "same-origin",
    })
    return json.dumps(payload_data)
Implementation Reference
My complete implementation is available in the research branch:
- Enhanced V3 Handler: temp_cloudscraper/cloudscraper/cloudflare_v3_patched.py (495 lines)
- Integration Code: src/pro_sports_transactions/search.py (CloudscraperConfig class)
- Test Results: docs/cloudscraper/cloudscraper-testing.md (12 systematic tests)
- Technical Analysis: docs/cloudscraper/TECHNICAL_ANALYSIS.md
Test Results
I tested across multiple configurations with cloudscraper v3.0.0:
| Configuration | Base Version | Result | Key Finding |
|---|---|---|---|
| Standard v3.0.0 | Tag release | TIMEOUT | Basic implementation insufficient |
| PR #295 (Session) | Pull request | FAIL | Session improvements help but insufficient |
| PR #283 (Headers) | Pull request | TIMEOUT | Additional headers don't solve core issue |
| Custom V3 Patched | My enhancement | FAIL | Successfully detects modern challenges but blocked by TLS fingerprinting |
Despite these improvements, advanced sites like prosportstransactions.com still block requests due to TLS fingerprinting - the fundamental limitation that requires browser-level solutions. However, these enhancements would improve cloudscraper's compatibility with sites using modern challenge formats.
Key Discovery: My patched V3 handler successfully detected and parsed the modern challenge format, proving the enhancement works - the failure occurs at the TLS transport layer, not the challenge handling layer.
Benefits to Cloudscraper
- Broader Compatibility: Support sites using modern challenge format
- Future-Proofing: Handle evolving Cloudflare challenge patterns
- Better Parsing: Robust JavaScript object handling
- Modern Standards: JSON payload support for current challenges
Implementation Priority
- High Priority: Challenge detection improvements (#1)
- Medium Priority: JavaScript object parser (#2) and URL construction (#3)
- Low Priority: JSON payload support (#4) - fewer sites use this format
Notes
- Implementation maintains backward compatibility with existing challenges
- TLS fingerprinting remains the primary obstacle for advanced protection sites - this is the core unsolved problem
Collaboration
While I haven't solved the TLS fingerprinting challenge, these improvements would benefit cloudscraper's compatibility with modern challenge formats. For my specific use case with prosportstransactions.com, I'll be exploring Playwright and curl_cffi to see if I can bypass the current TLS fingerprinting obstacle. If there's interest in addressing TLS fingerprinting within cloudscraper itself, I'd be happy to submit a PR for the above work.
Research Branch: https://github.com/rsforbes/pro_sports_transactions/tree/feature/cloudflare-bypass-research
Documentation: docs/cloudscraper/ directory contains complete technical analysis and test results
How about using https://github.com/lexiforest/curl_cffi to bypass the TLS fingerprinting?
@xAffan - I had issues with curl_cffi as well.
This appears to be a JA3/JA4 fingerprinting issue. Based on the example output and the response headers:
- 403 Forbidden with cf-mitigated: challenge - This shows Cloudflare detected the request as suspicious
- Accept-CH headers requesting browser characteristics - The response includes several Client Hints headers (Sec-CH-UA-*, UA-*) which are part of modern browser fingerprinting
- Cloudflare challenge page - The HTML response starts with "Just a moment..." which is Cloudflare's challenge page
JA3/JA4 fingerprinting analyzes the TLS handshake characteristics to identify the client. Cloudscraper uses older TLS fingerprinting evasion techniques that work with JA3, but JA4 is more sophisticated and includes:
- TLS cipher suite ordering
- TLS extensions
- HTTP/2 ALPN negotiation
- Client Hello packet structure
- Additional entropy from newer TLS 1.3 features
That prosportstransactions.com blocks cloudscraper immediately (403 status) rather than serving a JavaScript challenge suggests they're using JA4 or similar advanced fingerprinting to detect that the TLS handshake doesn't match a real browser, regardless of the HTTP headers cloudscraper sends.
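To make the fingerprinting idea concrete, here is a minimal sketch of how a JA3 hash is derived: five ClientHello fields are joined into one string and hashed with MD5. The numeric values below are illustrative and do not correspond to any particular browser or to cloudscraper's actual handshake; JA4 uses a different format that is not shown here:

```python
import hashlib


def ja3_hash(tls_version, ciphers, extensions, curves, point_formats):
    """Return (ja3_string, md5_hex) for the given ClientHello values."""
    # Values inside a field are joined with "-", fields are joined with ",".
    fields = [str(tls_version)] + [
        "-".join(str(v) for v in group)
        for group in (ciphers, extensions, curves, point_formats)
    ]
    ja3_string = ",".join(fields)
    return ja3_string, hashlib.md5(ja3_string.encode("ascii")).hexdigest()


# Illustrative ClientHello values (decimal IANA identifiers).
ja3_string, digest = ja3_hash(
    771, [4865, 4866, 4867], [0, 23, 65281, 10, 11], [29, 23, 24], [0]
)

# Same cipher suites offered in a different order: a different fingerprint.
_, reordered_digest = ja3_hash(
    771, [4867, 4866, 4865], [0, 23, 65281, 10, 11], [29, 23, 24], [0]
)

print(ja3_string)
print(digest != reordered_digest)
```

Since the hash depends on what the TLS library puts in the handshake, changing HTTP headers such as the User-Agent has no effect on it.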
Runnable Example:
#!/usr/bin/env python3
"""
Minimal example demonstrating cloudscraper with prosportstransactions.com

This example shows the current state of cloudscraper's ability to handle
the site's JA3/JA4 fingerprinting and Cloudflare challenges.
"""
import sys

import cloudscraper


def test_basic_request():
    """Test basic GET request to prosportstransactions.com"""
    print("Testing cloudscraper with prosportstransactions.com...")
    print("-" * 60)
    # Create scraper instance
    scraper = cloudscraper.create_scraper()
    # Target URL
    url = (
        "https://www.prosportstransactions.com/basketball/Search/"
        "SearchResults.php?Player=&Team=&BeginDate=&EndDate="
        "&PlayerMovementChkBx=yes&submit=Search"
    )
    try:
        # Make request
        print(f"Requesting: {url}")
        response = scraper.get(url, timeout=30)
        # Print response details
        print(f"\nStatus Code: {response.status_code}")
        print(f"Headers: {dict(response.headers)}")
        print(f"\nContent Length: {len(response.content)} bytes")
        print("Content Preview (first 500 chars):")
        print(response.text[:500])
        # Check if we got blocked
        if "Checking your browser" in response.text or response.status_code == 403:
            print("\n[BLOCKED] Cloudflare challenge detected!")
            return False
        print("\n[SUCCESS] Request completed successfully!")
        return True
    except Exception as e:
        # Exception already covers cloudscraper's own exception classes,
        # so they do not need to be listed separately.
        print(f"\n[ERROR] Request failed: {type(e).__name__}: {e}")
        return False


def test_with_browser_params():
    """Test with explicit browser parameters"""
    print("\n\nTesting with browser parameters...")
    print("-" * 60)
    # Create scraper with browser params
    scraper = cloudscraper.create_scraper(
        browser={"browser": "chrome", "platform": "windows", "desktop": True}
    )
    url = "https://www.prosportstransactions.com/"
    try:
        print(f"Requesting: {url}")
        response = scraper.get(url, timeout=30)
        print(f"Status Code: {response.status_code}")
        if response.status_code == 200 and "Checking your browser" not in response.text:
            print("[SUCCESS] Homepage accessible!")
            return True
        print("[BLOCKED] Still getting challenged")
        return False
    except Exception as e:
        print(f"[ERROR] {type(e).__name__}: {e}")
        return False


if __name__ == "__main__":
    # Run tests
    basic_success = test_basic_request()
    browser_success = test_with_browser_params()
    # Summary
    print("\n" + "=" * 60)
    print("SUMMARY:")
    print(f"Basic request: {'PASSED' if basic_success else 'FAILED'}")
    print(f"Browser params: {'PASSED' if browser_success else 'FAILED'}")
    # Exit with appropriate code
    sys.exit(0 if basic_success or browser_success else 1)
Results:
Testing cloudscraper with prosportstransactions.com...
------------------------------------------------------------
Requesting: https://www.prosportstransactions.com/basketball/Search/SearchResults.php?Player=&Team=&BeginDate=&EndDate=&PlayerMovementChkBx=yes&submit=Search
Status Code: 403
Headers: {'Date': 'Sun, 06 Jul 2025 13:19:15 GMT', 'Content-Type': 'text/html; charset=UTF-8', 'Transfer-Encoding': 'chunked', 'Connection': 'close', 'accept-ch': 'Sec-CH-UA-Bitness, Sec-CH-UA-Arch, Sec-CH-UA-Full-Version, Sec-CH-UA-Mobile, Sec-CH-UA-Model, Sec-CH-UA-Platform-Version, Sec-CH-UA-Full-Version-List, Sec-CH-UA-Platform, Sec-CH-UA, UA-Bitness, UA-Arch, UA-Full-Version, UA-Mobile, UA-Model, UA-Platform-Version, UA-Platform, UA', 'cf-mitigated': 'challenge', 'critical-ch': 'Sec-CH-UA-Bitness, Sec-CH-UA-Arch, Sec-CH-UA-Full-Version, Sec-CH-UA-Mobile, Sec-CH-UA-Model, Sec-CH-UA-Platform-Version, Sec-CH-UA-Full-Version-List, Sec-CH-UA-Platform, Sec-CH-UA, UA-Bitness, UA-Arch, UA-Full-Version, UA-Mobile, UA-Model, UA-Platform-Version, UA-Platform, UA', 'cross-origin-embedder-policy': 'require-corp', 'cross-origin-opener-policy': 'same-origin', 'cross-origin-resource-policy': 'same-origin', 'origin-agent-cluster': '?1', 'permissions-policy': 'accelerometer=(),autoplay=(),browsing-topics=(),camera=(),clipboard-read=(),clipboard-write=(),geolocation=(),gyroscope=(),hid=(),interest-cohort=(),magnetometer=(),microphone=(),payment=(),publickey-credentials-get=(),screen-wake-lock=(),serial=(),sync-xhr=(),usb=()', 'referrer-policy': 'same-origin', 'server-timing': 'chlray;desc="95af6489f9d98fc2", cfL4;desc="?proto=TCP&rtt=13467&min_rtt=13374&rtt_var=5082&sent=3&recv=5&lost=0&retrans=0&sent_bytes=2914&recv_bytes=1255&delivery_rate=218334&cwnd=251&unsent_bytes=0&cid=16606c71f1323bf6&ts=34&x=0"', 'x-content-type-options': 'nosniff', 'x-frame-options': 'SAMEORIGIN', 'Cache-Control': 'private, max-age=0, no-store, no-cache, must-revalidate, post-check=0, pre-check=0', 'Expires': 'Thu, 01 Jan 1970 00:00:01 GMT', 'Report-To': '{"endpoints":[{"url":"https:\\/\\/a.nel.cloudflare.com\\/report\\/v4?s=hSqyK83uZuFpVDLIC7qtgep4OK%2Bx1ruPA10XjiBTUbgMFZS%2BRi5YtnDjnRCbyKKpgJpxXtsXyjMVC%2BTFd1rQ1BXOEkqtt1Y%2FNGI8aMOLehlRx%2B9HMA0uMMsnRTHDA4ex80RmKV9mYkXasO5gYZRwtQ%3D%3D"}],"group":"cf-nel","max_age":604800}', 'NEL': '{"success_fraction":0,"report_to":"cf-nel","max_age":604800}', 'Vary': 'Accept-Encoding', 'Server': 'cloudflare', 'CF-RAY': '95af6489f9d98fc2-ORD', 'Content-Encoding': 'br', 'alt-svc': 'h3=":443"; ma=86400'}
Content Length: 7629 bytes
Content Preview (first 500 chars):
<!DOCTYPE html><html lang="en-US"><head><title>Just a moment...</title><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta http-equiv="X-UA-Compatible" content="IE=Edge"><meta name="robots" content="noindex,nofollow"><meta name="viewport" content="width=device-width,initial-scale=1"><style>*{box-sizing:border-box;margin:0;padding:0}html{line-height:1.15;-webkit-text-size-adjust:100%;color:#313131;font-family:system-ui,-apple-system,BlinkMacSystemFont,Segoe UI,Roboto,Helvetic
[BLOCKED] Cloudflare challenge detected!
Testing with browser parameters...
------------------------------------------------------------
Requesting: https://www.prosportstransactions.com/
Status Code: 403
[BLOCKED] Still getting challenged
============================================================
SUMMARY:
Basic request: FAILED
Browser params: FAILED
The website you sent always serves cloudflare challenge - even with a normal browser. It is probably configured like that. What I don't understand is the problem..? Isn't this library meant to solve these challenges using JS?
@xAffan - Yes. That is my understanding of the library; however, the version of Cloudflare being served may be beyond the tested versions of this library:
README.md...
📊 Test Results All features tested with 100% success rate for core functionality:
✅ Basic requests: 100% pass rate
✅ User agent handling: 100% pass rate
✅ Cloudflare v1 challenges: 100% pass rate
✅ Cloudflare v2 challenges: 100% pass rate
✅ Cloudflare v3 challenges: 100% pass rate
✅ Stealth mode: 100% pass rate
> The website you sent always serves cloudflare challenge - even with a normal browser. It is probably configured like that. What I don't understand is the problem..? Isn't this library meant to solve these challenges using JS?
due to the nature and complexity of some of the checks in the challenges now, solving this with basic JS engines will fail, that is why i am rewriting this library and the cluster fuck of a PR that was sent that i merged, and now regret
I will provide a couple options for solving, but i also have a life outside of this repo as well as a day job, so if there is a delay in me sitting down to complete said work... then so be it..
@VeNoMouS Please update the repo, bro. You are number 1 on GitHub, you can do this.
@VeNoMouS - Same boat as an open source maintainer myself. Totally understand the need for balance.
@VeNoMouS Come on, waiting for your good news
@rsforbes
Hi, do I understand correctly that you're trying to bypass the initial Cloudflare protection page on prosportstransactions.com using Cloudscraper? If so, a better approach might be to use Unflare, which is specifically designed for bypassing the initial protection page. Once it bypasses the page, it'll give you the desired headers, so you can pass them to your Cloudscraper service and make direct http requests to the target website.
> Hi, do I understand correctly that you're trying to bypass the initial Cloudflare protection page on prosportstransactions.com using Cloudscraper? If so, a better approach might be to use Unflare, which is specifically designed for bypassing the initial protection page. Once it bypasses the page, it'll give you the desired headers, so you can pass them to your Cloudscraper service and make direct http requests to the target website.
Sweet! I'll check it out. Thanks!
@rsforbes Btw, wanted to ask you what makes you think that Cloudflare detects you due to TLS fingerprinting?
@rsforbes You can change the ja3/ja4 fingerprint by creating your own proxy server. However, the code structure is quite weak, and the JavaScript emulation libs cannot fully simulate a complete browser environment. The code may work for 1-2 days, but it will not work on the third day. https://github.com/VeNoMouS/cloudscraper/blob/9ea528a8675f1bebd49ff853d142e94988a95178/cloudscraper/cloudflare_v3.py#L154-L198