[0.2.58/59] Yahoo may return bad crumb which is not detected properly
Describe bug
From time to time I'm getting:
requests.exceptions.HTTPError: 429 Client Error: Too Many Requests for url: https://query1.finance.yahoo.com/v7/finance/quote?symbols=INGR&formatted=false&crumb=Too+Many+Requests%0D%0A
Note crumb=Too+Many+Requests%0D%0A: this specific value should be treated as though no crumb is available.
Simple code that reproduces your problem
Any yfinance query.
Debug log from yf.enable_debug_mode()
DEBUG get_raw_json(): https://query2.finance.yahoo.com/v10/finance/quoteSummary/GOOG
DEBUG:yfinance:get_raw_json(): https://query2.finance.yahoo.com/v10/finance/quoteSummary/GOOG
DEBUG Entering get()
DEBUG:yfinance:Entering get()
DEBUG Entering _make_request()
DEBUG:yfinance: Entering _make_request()
DEBUG url=https://query2.finance.yahoo.com/v10/finance/quoteSummary/GOOG
DEBUG:yfinance: url=https://query2.finance.yahoo.com/v10/finance/quoteSummary/GOOG
DEBUG params={'modules': 'financialData,quoteType,defaultKeyStatistics,assetProfile,summaryDetail', 'corsDomain': 'finance.yahoo.com', 'formatted': 'false', 'symbol': 'GOOG'}
DEBUG:yfinance: params={'modules': 'financialData,quoteType,defaultKeyStatistics,assetProfile,summaryDetail', 'corsDomain': 'finance.yahoo.com', 'formatted': 'false', 'symbol': 'GOOG'}
DEBUG Entering _get_cookie_and_crumb()
DEBUG:yfinance: Entering _get_cookie_and_crumb()
DEBUG cookie_mode = 'basic'
DEBUG:yfinance: cookie_mode = 'basic'
DEBUG Entering _get_cookie_and_crumb_basic()
DEBUG:yfinance: Entering _get_cookie_and_crumb_basic()
DEBUG crumb = 'Too Many Requests
'
DEBUG:yfinance: crumb = 'Too Many Requests
'
DEBUG Exiting _get_cookie_and_crumb_basic()
DEBUG:yfinance: Exiting _get_cookie_and_crumb_basic()
DEBUG Exiting _get_cookie_and_crumb()
DEBUG:yfinance: Exiting _get_cookie_and_crumb()
WARNING:requests_cache.session:Request for URL https://query2.finance.yahoo.com/v10/finance/quoteSummary/GOOG?corsDomain=finance.yahoo.com&crumb=REDACTED&formatted=false&modules=financialData%2CquoteType%2CdefaultKeyStatistics%2CassetProfile%2CsummaryDetail&symbol=GOOG failed; using cached response
Traceback (most recent call last):
File "/home/vsukhoml/.venv/lib/python3.12/site-packages/requests_cache/session.py", line 291, in _resend
response.raise_for_status()
File "/home/vsukhoml/.venv/lib/python3.12/site-packages/requests/models.py", line 1024, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 429 Client Error: Too Many Requests for url: https://query2.finance.yahoo.com/v10/finance/quoteSummary/GOOG?modules=financialData%2CquoteType%2CdefaultKeyStatistics%2CassetProfile%2CsummaryDetail&corsDomain=finance.yahoo.com&formatted=false&symbol=GOOG&crumb=Too+Many+Requests%0D%0A
DEBUG response code=200
DEBUG:yfinance: response code=200
DEBUG Exiting _make_request()
DEBUG:yfinance: Exiting _make_request()
DEBUG Exiting get()
DEBUG:yfinance:Exiting get()
DEBUG get_raw_json(): https://query1.finance.yahoo.com/v7/finance/quote?
DEBUG:yfinance:get_raw_json(): https://query1.finance.yahoo.com/v7/finance/quote?
DEBUG Entering get()
DEBUG:yfinance:Entering get()
DEBUG Entering _make_request()
DEBUG:yfinance: Entering _make_request()
DEBUG url=https://query1.finance.yahoo.com/v7/finance/quote?
DEBUG:yfinance: url=https://query1.finance.yahoo.com/v7/finance/quote?
DEBUG params={'symbols': 'GOOG', 'formatted': 'false'}
DEBUG:yfinance: params={'symbols': 'GOOG', 'formatted': 'false'}
DEBUG Entering _get_cookie_and_crumb()
DEBUG:yfinance: Entering _get_cookie_and_crumb()
DEBUG cookie_mode = 'basic'
DEBUG:yfinance: cookie_mode = 'basic'
DEBUG Entering _get_cookie_and_crumb_basic()
DEBUG:yfinance: Entering _get_cookie_and_crumb_basic()
DEBUG reusing crumb
DEBUG:yfinance: reusing crumb
DEBUG Exiting _get_cookie_and_crumb_basic()
DEBUG:yfinance: Exiting _get_cookie_and_crumb_basic()
DEBUG Exiting _get_cookie_and_crumb()
DEBUG:yfinance: Exiting _get_cookie_and_crumb()
WARNING:requests_cache.session:Request for URL https://query1.finance.yahoo.com/v7/finance/quote?crumb=REDACTED&formatted=false&symbols=GOOG failed; using cached response
Traceback (most recent call last):
File "/home/vsukhoml/.venv/lib/python3.12/site-packages/requests_cache/session.py", line 291, in _resend
response.raise_for_status()
File "/home/vsukhoml/.venv/lib/python3.12/site-packages/requests/models.py", line 1024, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 429 Client Error: Too Many Requests for url: https://query1.finance.yahoo.com/v7/finance/quote?symbols=GOOG&formatted=false&crumb=Too+Many+Requests%0D%0A
DEBUG response code=200
DEBUG:yfinance: response code=200
DEBUG Exiting _make_request()
DEBUG:yfinance: Exiting _make_request()
DEBUG Exiting get()
DEBUG:yfinance:Exiting get()
DEBUG Entering get()
DEBUG:yfinance:Entering get()
DEBUG Entering _make_request()
DEBUG:yfinance: Entering _make_request()
DEBUG url=https://query1.finance.yahoo.com/ws/fundamentals-timeseries/v1/finance/timeseries/GOOG?symbol=GOOG&type=trailingPegRatio&period1=1730678400&period2=1746489600
DEBUG:yfinance: url=https://query1.finance.yahoo.com/ws/fundamentals-timeseries/v1/finance/timeseries/GOOG?symbol=GOOG&type=trailingPegRatio&period1=1730678400&period2=1746489600
DEBUG params=None
DEBUG:yfinance: params=None
DEBUG Entering _get_cookie_and_crumb()
DEBUG:yfinance: Entering _get_cookie_and_crumb()
DEBUG cookie_mode = 'basic'
DEBUG:yfinance: cookie_mode = 'basic'
DEBUG Entering _get_cookie_and_crumb_basic()
DEBUG:yfinance: Entering _get_cookie_and_crumb_basic()
DEBUG reusing crumb
DEBUG:yfinance: reusing crumb
DEBUG Exiting _get_cookie_and_crumb_basic()
DEBUG:yfinance: Exiting _get_cookie_and_crumb_basic()
DEBUG Exiting _get_cookie_and_crumb()
DEBUG:yfinance: Exiting _get_cookie_and_crumb()
DEBUG response code=429
DEBUG:yfinance: response code=429
DEBUG toggling cookie strategy basic -> csrf
DEBUG:yfinance: toggling cookie strategy basic -> csrf
DEBUG Entering _get_cookie_and_crumb()
DEBUG:yfinance: Entering _get_cookie_and_crumb()
DEBUG cookie_mode = 'csrf'
DEBUG:yfinance: cookie_mode = 'csrf'
DEBUG Entering _get_crumb_csrf()
DEBUG:yfinance: Entering _get_crumb_csrf()
DEBUG Failed to find "csrfToken" in response
DEBUG:yfinance: Failed to find "csrfToken" in response
DEBUG Exiting _get_crumb_csrf()
DEBUG:yfinance: Exiting _get_crumb_csrf()
DEBUG toggling cookie strategy csrf -> basic
DEBUG:yfinance: toggling cookie strategy csrf -> basic
DEBUG Entering _get_cookie_and_crumb_basic()
DEBUG:yfinance: Entering _get_cookie_and_crumb_basic()
DEBUG crumb = 'Too Many Requests
'
DEBUG:yfinance: crumb = 'Too Many Requests
'
DEBUG Exiting _get_cookie_and_crumb_basic()
DEBUG:yfinance: Exiting _get_cookie_and_crumb_basic()
DEBUG Exiting _get_cookie_and_crumb()
DEBUG:yfinance: Exiting _get_cookie_and_crumb()
DEBUG response code=429
DEBUG:yfinance: response code=429
Bad data proof
No response
yfinance version
0.2.58
Python version
3.12
Operating system
Linux
check @2422 for fix
I experimented with adding proper handling for this, roughly:
if self._crumb is not None and 'Too Many Requests' not in self._crumb:
    utils.get_yf_logger().debug(f'reusing crumb {self._crumb}')
    return self._crumb
and
if self._crumb is None or '<html>' in self._crumb or 'Too Many Requests' in self._crumb:
    utils.get_yf_logger().debug("Didn't receive crumb")
in both the basic and csrf strategies. This detects the bad crumb correctly, but I then get stuck in this state, and a session and cookie reset is needed to recover.
I also added a workaround, importing the crumb from my browser, which partially helps, although resetting the session and cookie (and the cache, if one is used) might still be needed.
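The detection and recovery described above could be combined roughly as follows. This is a minimal sketch, not yfinance's actual code: CrumbState, looks_like_crumb, and the fetch_crumb callable are hypothetical names, and the rejected values are an assumption based only on the responses seen in this log.

```python
from http.cookiejar import CookieJar
from typing import Callable, Optional


def looks_like_crumb(value: Optional[str]) -> bool:
    """Return True only if value plausibly is a crumb rather than an error body."""
    if not value:
        return False
    text = value.strip()
    # Assumption: a real crumb is a short token without whitespace or HTML.
    # The error bodies seen in this issue are 'Too Many Requests\r\n' and HTML pages.
    if not text or "<html>" in text.lower() or "Too Many Requests" in text:
        return False
    return not any(ch.isspace() for ch in text)


class CrumbState:
    """Hypothetical holder for the cookie jar and crumb (not a yfinance class)."""

    def __init__(self, fetch_crumb: Callable[[], Optional[str]],
                 cookies: Optional[CookieJar] = None) -> None:
        self._fetch_crumb = fetch_crumb
        # An empty CookieJar is falsy, so compare against None explicitly.
        self.cookies = cookies if cookies is not None else CookieJar()
        self.crumb: Optional[str] = None

    def reset(self) -> None:
        """Drop cookies and crumb so the next attempt starts from scratch."""
        self.cookies.clear()
        self.crumb = None

    def get_crumb(self, max_attempts: int = 2) -> Optional[str]:
        """Reuse a valid crumb, otherwise fetch one; reset state on a bad value."""
        if looks_like_crumb(self.crumb):
            return self.crumb
        for _ in range(max_attempts):
            candidate = self._fetch_crumb()
            if looks_like_crumb(candidate):
                self.crumb = candidate.strip()
                return self.crumb
            # A bad value is never stored; wipe state before retrying.
            self.reset()
        return None
```

The point of the bounded loop is that a bad crumb is never kept for reuse, so a later call can recover once Yahoo stops rate limiting. A real fix would also have to clear any persisted cookie cache in reset().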
Interestingly, I hit this on the first run after about a week of no use, but I do have cached sessions. Could the two be interacting?
I tried disabling the cache when updating the crumb for a cached session, but I'm not sure it makes a difference on its own. It may be that the cache should also be disabled when updating the cookie:
if self._session_is_caching:
    get_args['expire_after'] = self._expire_after
    with self._session.cache_disabled():
        crumb_response = self._session.get(**get_args)
else:
    crumb_response = self._session.get(**get_args)
It definitely looks like the cookie, csrf, and crumb updates have to be done with session caching disabled whenever caching is enabled.
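One compact way to express "always bypass the cache for credential requests" is sketched below. fetch_uncached is a hypothetical helper, not part of yfinance. It relies only on cache_disabled(), the context manager that requests_cache sessions provide, and falls back to a no-op for plain sessions.

```python
from contextlib import nullcontext


def fetch_uncached(session, **get_args):
    """Issue session.get(**get_args), bypassing the cache if the session has one.

    Hypothetical helper: caching sessions from requests_cache expose
    cache_disabled(); a plain requests.Session does not, so a no-op
    context is used for it instead.
    """
    cache_disabled = getattr(session, "cache_disabled", None)
    context = cache_disabled() if callable(cache_disabled) else nullcontext()
    with context:
        return session.get(**get_args)
```

Routing the cookie, csrf, and crumb requests through one helper like this would avoid repeating the if/else on self._session_is_caching at each call site.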
I'm using:
self.session = CachedLimiterSession(
    limiter=limiter,
    # cache_control=True,
    expire_after=timedelta(days=1),
    cache_name=cache_requests,
    stale_if_error=True,
    backend="filesystem",
    bucket_class=MemoryQueueBucket,
    ignored_parameters=["sessionId", "crumb"],
    serializer="pickle",
)
This gives pretty good caching and error handling (stale_if_error=True), but it may interfere with the handling of these critical parameters, since I can't easily add exceptions for certain URLs.
Caching the cookie should be fine (until it expires), but the crumb certainly should not be cached. That is what the self._session_is_caching logic tries to prevent, but maybe recent updates broke it?
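On adding exceptions for certain URLs: requests_cache sessions accept a urls_expire_after mapping of URL patterns to expiration values, which might cover this case. The sketch below only builds such a mapping. The getcrumb endpoints are my assumption about what yfinance requests and should be checked against its source, and credential_cache_rules is a hypothetical helper.

```python
# Assumed crumb endpoints (patterns omit the scheme); verify against yfinance.
UNCACHED_URL_PATTERNS = (
    "query1.finance.yahoo.com/v1/test/getcrumb",
    "query2.finance.yahoo.com/v1/test/getcrumb",
)


def credential_cache_rules(default_expire_after, do_not_cache):
    """Build a urls_expire_after mapping that never caches crumb requests.

    do_not_cache is the sentinel meaning "skip the cache"; with
    requests_cache that would be requests_cache.DO_NOT_CACHE.
    """
    rules = {pattern: do_not_cache for pattern in UNCACHED_URL_PATTERNS}
    # The catch-all goes last so the specific patterns above are matched first.
    rules["*"] = default_expire_after
    return rules
```

The result would be passed as urls_expire_after=credential_cache_rules(timedelta(days=1), DO_NOT_CACHE) next to the other session arguments. I have not tested whether this alone resolves the stuck state.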