yfinance
Fix threads implementation
I invested some time fixing the threads implementation and also adding other improvements such as:
- Retries with backoff
- Normalize output for single tickers
- Remove the shared dfs, which is prone to race conditions
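For readers who want a picture of the "retries with backoff" item, here is a minimal sketch of the general technique. It is not the code from this PR: `retry_with_backoff`, its parameters, and the injectable `sleep` argument are hypothetical names chosen for illustration.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    retries: int = 3,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    exceptions: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func(); on failure wait base_delay * 2**attempt (capped) and retry."""
    if retries < 0:
        raise ValueError("retries must be >= 0")
    for attempt in range(retries + 1):
        try:
            return func()
        except exceptions:
            if attempt == retries:
                # Out of attempts: let the caller see the last error.
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Up to 10% jitter so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 10))
    raise AssertionError("unreachable")
```

A per-ticker request could be wrapped as `retry_with_backoff(lambda: fetch(ticker))`, where `fetch` stands for whatever function performs the request.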
I'd appreciate it if you can provide some feedback.
@jrdi Although I agree with your new implementation, it would be great if you could provide (at least) 1 example which was failing without your fix and now works with it.
@fredrik-corneliusson I believe you are a heavy user of download(). Is this PR worth resolving and merging in?
Yes, I think the threads implementation is in need of improvement: any exception raised in a thread results in the whole download hanging.
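One common way to avoid that kind of hang, sketched here with the standard library rather than taken from the PR, is to submit each ticker to a `ThreadPoolExecutor` and read results from the returned futures, so an exception in a worker surfaces in the caller instead of being lost. `fetch_all` and `fetch_one` are hypothetical names; `fetch_one` stands for any per-ticker download function.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Any, Callable, Dict, Iterable, Tuple


def fetch_all(
    tickers: Iterable[str],
    fetch_one: Callable[[str], Any],
    max_workers: int = 8,
) -> Tuple[Dict[str, Any], Dict[str, Exception]]:
    """Run fetch_one per ticker in a bounded pool; return (results, errors)."""
    results: Dict[str, Any] = {}
    errors: Dict[str, Exception] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its ticker; workers share no mutable state.
        futures = {pool.submit(fetch_one, t): t for t in tickers}
        for future in as_completed(futures):
            ticker = futures[future]
            try:
                # result() re-raises any exception raised inside the worker.
                results[ticker] = future.result()
            except Exception as exc:
                errors[ticker] = exc
    return results, errors
```

Because only the calling thread writes to `results` and `errors`, no lock is needed, and a failing ticker is reported in `errors` rather than blocking the rest of the download.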
> @jrdi Although I agree with your new implementation, it would be great if you could provide (at least) 1 example which was failing without your fix and now works with it.
I'm not sure if this issue is local to the machine I was working with (M1 MacBook Pro, macOS 13.5 Ventura, Python 3.11.4), but for large downloads (7000+ tickers over 2 years), I get RuntimeError: can't start new thread. This doesn't stop the download execution immediately, but it consistently hangs midway through. The specific list of tickers I used to cause the issue is from NASDAQ's API.
I fixed the issue by implementing a yf_download_batches() function that splits large download() calls into smaller ones and combines the data using pandas.concat(). My function only saves adjusted close data, but I think the changes made by @jrdi would also fix the issue I was having by handling the RuntimeError.
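The batching workaround described above was not posted in the thread, so the following is only a rough sketch of the idea. The `chunked` helper, the `batch_size` and `download_fn` parameters, and the choice to concatenate along columns are all assumptions, and the column layout returned by `yf.download` can differ between yfinance versions.

```python
from typing import Any, Callable, Iterator, List, Optional, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of items with at most size elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def yf_download_batches(
    tickers: Sequence[str],
    batch_size: int = 500,
    download_fn: Optional[Callable[..., Any]] = None,
    **kwargs: Any,
):
    """Download tickers in smaller batches and join the frames column-wise."""
    # Imported lazily so chunked() stays usable without these packages.
    import pandas as pd

    if download_fn is None:
        import yfinance as yf

        download_fn = yf.download
    # Each call handles at most batch_size tickers, limiting threads per call.
    frames = [download_fn(batch, **kwargs) for batch in chunked(tickers, batch_size)]
    # Assumption: every frame is indexed by date, so axis=1 aligns on dates.
    return pd.concat(frames, axis=1)
```

A call such as `yf_download_batches(tickers, batch_size=500, period="2y")` would then issue several smaller `download()` calls instead of one call with 7000+ tickers.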
Anyone can submit a pull request, just saying.