Getting 429 error (rate-limit) when in loop, suggestion to add try-catch if fails

Open melgazar9 opened this issue 1 year ago • 11 comments

Describe bug

When running a loop over multiple tickers, yahoo flags me even when using a VPN and switching servers multiple times per day. I found that there are many yahoo data sources where this happens, and I'll be happy to go through all of them and share them here if this is approved to be implemented.

Simple code that reproduces your problem

It's a bit tricky to reproduce this error because you would have to run numerous requests over a period of time in order for yahoo to flag you. However, if it returns 429 error we can fix it with the following code block:

import uuid
import requests

data = requests.get(url)
if data.status_code == 429:  # requests does not raise on 429, so check the status code
    # Retry with a User-agent that practically never repeats (a random hex string is one option)
    data = requests.get(url, headers={'User-agent': uuid.uuid4().hex})

example where it fails for me

>>> import yfinance as yf
>>> import requests
>>> url = 'https://query2.finance.yahoo.com/v10/finance/quoteSummary//013A.F?modules=institutionOwnership%2CfundOwnership%2CmajorDirectHolders%2CmajorHoldersBreakdown%2CinsiderTransactions%2CinsiderHolders%2CnetSharePurchaseActivity&corsDomain=finance.yahoo.com&'
>>> data = requests.get(url)
>>> data.text
'Too Many Requests\r\n'
>>> data.json
<bound method Response.json of <Response [429]>>

>>> data
<Response [429]>
>>> data2 = requests.get(url, headers={'User-agent': 'some bot generate new value'})
>>> data2
<Response [200]>

Debug log

Bad data proof

No response

yfinance version

0.2.38

Python version

3.10.12

Operating system

using both linux and MacOS

melgazar9 avatar Nov 14 '24 02:11 melgazar9

I'm running into this issue too, for the same reason, and would love a fix. I also noticed that 404s don't appear to be catchable either: I see the error printed to the screen, but any attempt to catch the '404' with an exception handler fails.

escchr avatar Nov 14 '24 07:11 escchr

same reason

bwzheng2010 avatar Nov 14 '24 11:11 bwzheng2010

yfinance sends cookie-crumb with requests so Yahoo could track you despite VPN hopping (which is against their terms btw). Do less spam? https://github.com/ValueRaider/yfinance-cache

ValueRaider avatar Nov 14 '24 21:11 ValueRaider

I've been using this library for a while without an issue. Just 2 days ago I noticed that I too get flagged for too many requests and the server blocks me. It's not about spamming the server: my strategy as an active trader requires that I iterate over a thousand or two thousand symbols throughout the day, so I really don't know what to do except look for an alternative source. I'm not sure why they decided to do this now, because it used to work 100% fine. Could it be temporary? Is there a workaround?

underOATH777 avatar Nov 14 '24 23:11 underOATH777

@ValueRaider I'm not spamming just because I'm requesting data on a daily basis. The yfinance-cache fork does not contain as many endpoints as the main yfinance library, so that's not a viable solution. It also does not work with my meltano tap here: https://github.com/melgazar9/tap-yfinance due to referencing a cached file.

This issue can be resolved by passing a User-agent in the request headers. Is this something that can be implemented? I'm not sure whether this library uses the requests library to get the data. If so, you can just pass a header and problem solved :)

melgazar9 avatar Nov 14 '24 23:11 melgazar9

> I've been using this library for a while without an issue. Just 2 days ago I noticed that I too get flagged for too many requests and the server blocks me. It's not about spamming the server: my strategy as an active trader requires that I iterate over a thousand or two thousand symbols throughout the day, so I really don't know what to do except look for an alternative source. I'm not sure why they decided to do this now, because it used to work 100% fine. Could it be temporary? Is there a workaround?

So I'm not losing my mind. I had just noticed it a couple of days ago as well and thought I broke my code somehow. I'm sure most of the info can be found at Nasdaq if you can figure out their weird naming scheme for the datasets.

escchr avatar Nov 15 '24 00:11 escchr

Hitting the 429 error less than 100 tickers in; this also started in the past day or two. Seems like Yahoo has some new rule?

The-Milad-A avatar Nov 15 '24 01:11 The-Milad-A

Adding this while loop right after the call to Ticker() seemed to get me limping along.

index = yf.Ticker(ticker)
# ADDED THE FOLLOWING
retry = True
delay = 60
skip = False
while retry:
    try:
        # force the exception here if the return data sucks
        print(f'''INDEX: '{type(index.info)}' ''')
        retry = False
    except Exception as e:
        if 'Expecting value:' in str(e):  # Error 429
            print(f'E: {e}')
            retry = True
            print(f'Error: 429 - Sleeping for 60 seconds')
            time.sleep(delay)
        else:  # Possibly 404 error
            skip = True
            retry = False

escchr avatar Nov 15 '24 01:11 escchr

YF should already be setting the user-agent https://github.com/ranaroussi/yfinance/blob/main/yfinance/data.py#L59-L60

Does a 1-second rate limiter help?

from requests_ratelimiter import LimiterSession, RequestRate, Limiter, Duration
history_rate = RequestRate(1, Duration.SECOND)
# history_rate = RequestRate(1, Duration.SECOND*1.1)
limiter = Limiter(history_rate)
session = LimiterSession(limiter=limiter)
session.headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36'}

yfinance PIP downloads are up last 2 weeks, maybe getting too popular? https://pypistats.org/packages/yfinance

ValueRaider avatar Nov 15 '24 09:11 ValueRaider

I had this problem until yesterday, does anyone have a solution? I'm willing to pay whatever it costs. Thank you.

nike576 avatar Nov 15 '24 22:11 nike576

> YF should already be setting the user-agent https://github.com/ranaroussi/yfinance/blob/main/yfinance/data.py#L59-L60
>
> Does a 1-second rate limiter help?

Yes, seems so, I was able to fetch info for 320 tickers using small code changes:

from requests_ratelimiter import LimiterSession, RequestRate, Limiter, Duration

history_rate = RequestRate(1, Duration.SECOND)
limiter = Limiter(history_rate)
session = LimiterSession(limiter=limiter)

session.headers['User-agent'] = 'tickerpicker/1.0'

isins = [item['isin'] for item in watchlist]
tickers = yf.Tickers(isins, session=session)
...
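
For readers who want to see what a one-request-per-second limit amounts to without the extra dependency, here is a minimal standard-library sketch. The class name `MinIntervalThrottle` and its `clock`/`sleep` parameters are made up for illustration; this is not how requests_ratelimiter is implemented, only the same idea in its simplest form.

```python
import time


class MinIntervalThrottle:
    """Block so that consecutive wait() calls are at least `interval` seconds apart."""

    def __init__(self, interval, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock      # injectable so the class can be tried without real waiting
        self._sleep = sleep
        self._last = None        # time of the previous call; None before the first one

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                # Too soon after the previous call: sleep off the difference.
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `throttle.wait()` immediately before each request would then space the requests out; whether one second is enough for Yahoo is not something this sketch can promise.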

> I had this problem until yesterday, does anyone have a solution? I'm willing to pay whatever it costs. Thank you.

@nike576 there is a Patreon link on the repo's main page: https://patreon.com/ranaroussi You could send a little donation to the developers of this useful module.

skoenig avatar Nov 16 '24 14:11 skoenig

Yeah, I'm having some issues and concerns over possible rate limiting on their end. I'm retrieving daily data for fewer than 100 tickers, and yet I'm not able to retrieve any data (or it only works 1 or 2 times per day).

BambooOwl avatar Nov 18 '24 03:11 BambooOwl

Just to clarify, it doesn't happen for all endpoints

melgazar9 avatar Nov 18 '24 21:11 melgazar9

from requests_ratelimiter import LimiterSession

prices = yf.download(tickers, period='5d', interval='1d', session=LimiterSession(per_second=3))

daymiani avatar Nov 21 '24 23:11 daymiani

Hey @daymiani the issue is not particularly an issue for me on a regular basis when downloading price history. I notice it's primarily happening for other methods in the library like income_stmt, insider_roster_holders, or upgrades_downgrades

melgazar9 avatar Nov 21 '24 23:11 melgazar9

I seem to be facing a similar issue and the error I get is JSONDecodeError('Expecting value: line 1 column 1 (char 0)')

example: suppose I run yf.download(['JPM'], start='2024-11-16', end='2024-11-18') or even yf.download(['JPM', 'AMZN', 'TSLA'], start='2024-11-16', end='2024-11-18') at the start of the day for some exploratory analyses -- these would initially run successfully

however, my use case requires me to fetch a batch of end-of-day data per day for 1000+ symbols. I break up my requests into batches of 100 symbols, and I would run yf.download(batch_symbols_list, start='2024-11-16', end='2024-11-18') -- and this also ran successfully about 2 weeks ago

but now, it seems to download the first few symbols and then returns JSONDecodeError('Expecting value: line 1 column 1 (char 0)') per failed symbol. then, if I try to request a single symbol using yf.download(['JPM'], start='2024-11-16', end='2024-11-18'), that also fails

the number of successful symbols varies on each try

any suggestions?
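
The batching described above (1000+ symbols split into batches of 100) could be sketched roughly like this. The helper name `batches` is hypothetical, and the sketch only covers the splitting, not the download itself.

```python
def batches(symbols, size=100):
    """Yield consecutive slices of `symbols`, each holding at most `size` items."""
    for start in range(0, len(symbols), size):
        yield symbols[start:start + size]
```

Each slice could then be passed to yf.download(...) in turn, probably with a pause between batches. How long that pause needs to be is an open question in this thread.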

trisxcj1 avatar Nov 26 '24 00:11 trisxcj1

It works for some users and not for others. In my case it worked yesterday but not today. I decided to switch to EODHD; if someone wants the yfinance experience, they can try the following engine: https://github.com/Gerard9199/EODHD_API (the main API class in the EODHD GitHub is somewhat confusing)

Gerard9199 avatar Nov 26 '24 05:11 Gerard9199

I'm wondering if the randomness of this is because all of us are hitting their server trying to do the same thing.

escchr avatar Nov 26 '24 21:11 escchr

@bwzheng2010 Redistributing Yahoo's data is against their terms of use.

ValueRaider avatar Nov 30 '24 14:11 ValueRaider

Having problems when "Too many requests" are made. It would be great if an exception was raised as soon as it is detected, so that dev can catch and handle with throttle and retry.
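
To make the "catch and handle with throttle and retry" idea concrete, here is a rough sketch of what calling code might look like if such an exception existed. `RateLimited` is a hypothetical stand-in defined here for illustration, not an exception this thread confirms the library raises, and the doubling delay is just one common choice.

```python
import time


class RateLimited(Exception):
    """Hypothetical stand-in for whatever error a 429 response would raise."""


def call_with_backoff(func, retries=4, base_delay=1.0, sleep=time.sleep):
    """Call func(); on RateLimited, wait base_delay * 2**attempt and try again."""
    for attempt in range(retries):
        try:
            return func()
        except RateLimited:
            if attempt == retries - 1:
                raise  # out of retries: let the caller see the error
            sleep(base_delay * 2 ** attempt)
```

With the defaults this waits 1, 2, then 4 seconds between the four attempts before giving up.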

dsidlo avatar Dec 10 '24 22:12 dsidlo

Here is my suggestion to make it more clear when being rate limited #2180

dhruvan2006 avatar Dec 20 '24 16:12 dhruvan2006

Any further resolutions here?

I'm bumping up against rate limits when downloading data in batches (by stock sector) and sub-batches (splitting each individual sector into batches of size 500).

Ideally I'd like a workaround where I can download all of this data at once. Maybe persisting the downloaded data would be the way forward?
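
One way to read "persisting the downloaded data" is a small on-disk cache, so that a symbol already fetched is never requested again. The sketch below assumes the data is JSON-serializable and that the symbol is safe to use as a file name; `cached_fetch` and its `fetch` callback are hypothetical names, and there is no expiry logic.

```python
import json
from pathlib import Path


def cached_fetch(symbol, fetch, cache_dir):
    """Return data for `symbol` from cache_dir if present, else call fetch(symbol) and store it."""
    path = Path(cache_dir) / f"{symbol}.json"
    if path.exists():
        return json.loads(path.read_text())
    data = fetch(symbol)  # only reached on a cache miss
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data))
    return data
```

A second run over the same symbols would then read from disk instead of hitting Yahoo at all.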

(moderation: Was that giant log necessary?)

bastonoxford avatar Jan 07 '25 10:01 bastonoxford

Having the same issue at about 300-400 tickers - this is new in the last few weeks. I was able to download data for many symbols without any rate limiting before that. Any solutions yet?

parthnatekar avatar Jan 27 '25 01:01 parthnatekar

Having the same issue: I'm hitting API rate limits while downloading just 30 tickers, though I'm fetching data from around 2010 up to January 2025.

v4lt4ru5 avatar Jan 27 '25 14:01 v4lt4ru5

The rate limits are imposed by Yahoo! Finance. There is nothing we can do.

R5dan avatar Jan 27 '25 18:01 R5dan

Yeah, saw that. In another issue about yfinance I described how I changed the data.py file in the yfinance library to use a rate limit of 1 second. It's taking a lot of time, but that's okay; I'm just following the FinRL_PortfolioAllocation_NeurIPS_2020.ipynb tutorial notebook and trying to get the same shape of data.

The df shape in the tutorial is like nearly (100k, 17) and i was getting first (3k,15) and then (30k,16) or something like that. Let's see now if i can get close to the tutorial shape

The issue i commented on was this one for more context: https://github.com/AI4Finance-Foundation/FinRL/issues/1237#issuecomment-2611200592

I accept any suggestions if you see something wrong here, I'm just finding work arounds to learn how to use finrl with the tutorials until i can tweak my own notebook.

Cheers! Having a blast with this, been some time since i didn't enjoy learning new stuff like this.

v4lt4ru5 avatar Jan 28 '25 08:01 v4lt4ru5

This is a bit weird because when I use my own libraries in python/golang I don't get this rate limit error...

TapeReaderJoe avatar Jan 29 '25 06:01 TapeReaderJoe

I don't know, honestly. It's surely taking a lot of time to download now (2008 to 2025 daily, 30 tickers) since I applied that limit; it just hit the 18-hour mark and is still going. I would gladly pay monthly for unlimited requests, but first I'm trying things out to see what works for me and fulfills my needs.

v4lt4ru5 avatar Jan 29 '25 06:01 v4lt4ru5

this solved it for me:

stock_data_htf = yf.download('^NSEI', period='5d', interval='15m')
stock_data_htf = stock_data_htf.xs('^NSEI', axis=1, level='Ticker')

maynardjameskeenan700L avatar Feb 19 '25 09:02 maynardjameskeenan700L

Confirmed fixed in 0.2.54

It took a couple of mins to realize that

curl https://query2.finance.yahoo.com/v8/test/getcrumb
Edge: Too Many Requests%

However it was not breaking in some of the browsers, so

curl -A "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36" https://query2.finance.yahoo.com/v8/test/getcrumb

would sometimes work, and sometimes not.

Good fix by adding random user agent selector, I wonder what they will do next :(
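
For anyone curious what a random user agent selector looks like in principle, here is a minimal sketch. It is not the code from the fix: the list below simply reuses the two browser strings quoted earlier in this thread, and `random_headers` is a made-up name.

```python
import random

# Illustrative values only, copied from earlier comments in this thread.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36",
]


def random_headers(rng=random):
    """Return request headers with a User-Agent picked at random from USER_AGENTS."""
    return {"User-Agent": rng.choice(USER_AGENTS)}
```

The resulting dict can be passed as `headers=` to a requests call or assigned into a session's headers.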

dimacus avatar Feb 19 '25 13:02 dimacus