yFinance download API works differently on Linux vs Windows system

Open mihir-sampat-adaptive opened this issue 1 year ago • 5 comments

Describe bug

Issue Description

I encountered an issue where the yf.download API behaves differently on Linux and Windows systems. Specifically, when using the yf.download API with the period set to max, the API returns a proper DataFrame with the expected output on Windows without any errors, as shown below:

[*********************100%**********************]  1 of 1 completed
                    Open          High           Low         Close     Adj Close    Volume
Date
2024-08-05    174.445999    181.694901    174.353607    178.951599    178.951599    0

The response is similar even with a random start date.

However, when I run the exact same code on an Amazon EC2 Linux instance, the download API throws the following error:

YFInvalidPeriodError("%ticker%: Period 'max' is invalid, must be one of ['1d', '5d']")

This discrepancy suggests that the yf.download API behaves differently on Linux compared to Windows.

Steps to Reproduce

  • Use the yf.download API with the period set to max on a Windows machine.
  • Observe that the API returns the expected DataFrame without errors.
  • Run the same code on an Amazon EC2 Linux instance.
  • Observe the YFInvalidPeriodError being thrown.

Example ticker: ^XND

Expected Behavior

The yf.download API should return a DataFrame with the expected data without throwing errors, regardless of the operating system.

Actual Behavior

  • Windows: API works as expected, returns a DataFrame with the data.
  • Linux: API throws a YFInvalidPeriodError when the period is set to max.

Environment Details

  • Windows Machine:
    • OS: Windows 11
    • yFinance Version: 0.2.41
    • Python Version: 3.11
  • Amazon EC2 Linux Instance:
    • OS: Amazon Linux 2023
    • yFinance Version: 0.2.41
    • Python Version: 3.11
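
When comparing two machines like this, it can help to capture the environment programmatically, including the versions of dependencies that are not listed above. The following is a minimal sketch using only the standard library; the function name `describe_environment` and the default package list are illustrative choices, not part of yfinance.

```python
import platform
import sys
from importlib import metadata


def describe_environment(packages=("yfinance", "pytz", "pandas")):
    """Return a dict with the OS, Python version and installed package versions."""
    info = {
        "os": platform.platform(),
        "python": sys.version.split()[0],
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is simply absent on this machine.
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running this on both the Windows machine and the EC2 instance and diffing the output would show whether anything other than the operating system differs between them.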

Additional Information

The issue persists even when a random start date is provided.

This behavior suggests a potential discrepancy in the yFinance API implementation or configuration for different operating systems.

Request for Insight

Any insight into why this discrepancy occurs and how to resolve it would be very helpful. Is there a known issue with yFinance on Linux systems, or is there a workaround to make the behavior consistent across different operating systems?

Thank you for your assistance.

Simple code that reproduces your problem

Code

from yfinance import download

download(tickers=['^XND'], period='max')

Result on Windows

[*********************100%**********************]  1 of 1 completed
                    Open          High           Low         Close     Adj Close    Volume
Date
2024-08-05    174.445999    181.694901    174.353607    178.951599    178.951599    0

Result on Linux

[*********************100%%**********************]  1 of 1 completed

1 Failed download:
['^XND']: YFInvalidPeriodError("%ticker%: Period 'max' is invalid, must be one of ['1d', '5d']")
Empty DataFrame
Columns: [Open, High, Low, Close, Adj Close, Volume]
Index: []

However, when the same query was run with a period of 1d or 5d, it worked as expected.

Debug log

[*********************100%%**********************]  1 of 1 completed

1 Failed download:
['^XND']: YFInvalidPeriodError("%ticker%: Period 'max' is invalid, must be one of ['1d', '5d']")
Empty DataFrame
Columns: [Open, High, Low, Close, Adj Close, Volume]
Index: []

Bad data proof

No response

yfinance version

0.2.41

Python version

3.11

Operating system

Windows 11, Amazon Linux 2023

Aug 06 '24 10:08 mihir-sampat-adaptive

That's not the debug log.

Aug 06 '24 12:08 ValueRaider
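
For reference, the output pasted under "Debug log" is the normal console output rather than debug-level logging. The sketch below shows one way to turn on verbose logging before re-running the reproduction. It assumes that yfinance emits its messages through a standard-library logger named "yfinance"; the helper name `enable_yfinance_debug_logging` is hypothetical. Recent yfinance releases also appear to ship a `yf.enable_debug_mode()` helper, which is likely the simpler route if it is available in the installed version.

```python
import logging


def enable_yfinance_debug_logging():
    """Set the (assumed) 'yfinance' logger to DEBUG and attach a console handler."""
    logger = logging.getLogger("yfinance")  # assumption: yfinance logs under this name
    logger.setLevel(logging.DEBUG)
    if not logger.handlers:
        # Only attach a handler once, so repeated calls do not duplicate output.
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(levelname)-8s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    enable_yfinance_debug_logging()
    # Re-run the failing call from the issue after this point, e.g.
    # download(tickers=['^XND'], period='max')
```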

One thing you can do is try running the code in WSL (Windows Subsystem for Linux) and see if the error is still there.

Aug 13 '24 06:08 Inder782

I can confirm that the error also occurs on Windows Subsystem for Linux (Windows 10).

Aug 23 '24 16:08 cgmike

This has to do with pytz not being able to handle dates past the year 2038. When you use max, it adds 99 years to the current date, which goes past 2038. https://github.com/stub42/pytz/issues/31

if start or period is None or period.lower() == "max":
    # Check can get TZ. Fail => probably delisted
    tz = self.tz
    if tz is None:
        # Every valid ticker has a timezone. A missing timezone is a problem.
        _exception = YFTzMissingError(self.ticker)
        err_msg = str(_exception)
        shared._DFS[self.ticker] = utils.empty_df()
        shared._ERRORS[self.ticker] = err_msg.split(': ', 1)[1]
        if raise_errors:
            raise _exception
        else:
            logger.error(err_msg)
        return utils.empty_df()
    if end is None:
        end = int(_time.time())
    else:
        end = utils._parse_user_dt(end, tz)
    if start is None:
        if interval == "1m":
            start = end - 604800   # 7 days
        elif interval in ("5m", "15m", "30m", "90m"):
            start = end - 5184000  # 60 days
        elif interval in ("1h", '60m'):
            start = end - 63072000  # 730 days
        else:
            start = end - 3122064000  # 99 years
    else:
        start = utils._parse_user_dt(start, tz)
    params = {"period1": start, "period2": end}
else:
    period = period.lower()
    params = {"range": period}

Aug 26 '24 17:08 WoxxyG
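
As a worked check of the arithmetic in the snippet quoted in the comment above: in the default branch, the code sets start = end - 3122064000, that is, it subtracts 99 years of 365 days from end. The sketch below reproduces only that calculation with the standard library. The `default_start` helper and the fixed example date are illustrative, and whether this offset is what actually triggers the error reported on Linux is not confirmed here.

```python
from datetime import datetime, timedelta, timezone

# Same constant as in the quoted snippet: 99 years of 365 days, in seconds.
NINETY_NINE_YEARS = 99 * 365 * 24 * 60 * 60

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def default_start(end_seconds):
    """Mirror the quoted default branch: start = end - 99 years."""
    return end_seconds - NINETY_NINE_YEARS


# Illustrative end time: the day the issue was opened, at midnight UTC.
end = int(datetime(2024, 8, 6, tzinfo=timezone.utc).timestamp())
start = default_start(end)

# Add a timedelta to the epoch instead of calling fromtimestamp(), which
# can fail for negative values on some platforms.
start_date = EPOCH + timedelta(seconds=start)

print("end:  ", end)
print("start:", start, "->", start_date.date())
```

With this end date, start comes out as a negative number of epoch seconds, corresponding to a date in 1925. So the quoted branch produces a timestamp before 1970 rather than one in the future.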

I found that when running the following code on EC2:

import yfinance as yf

stockNames = ['A', 'AAA', 'AAPL', 'NVDA', 'CNQ', 'SNA', 'META']
for stockName in stockNames:
    Ticker = yf.Ticker(stockName)
    # Get dividend and split information
    actions_data = Ticker.actions

I only get data from 2022 and earlier, and cannot obtain the latest data. However, the code works fine and retrieves the latest data when run on a local Windows machine.

Sep 04 '24 08:09 ww-hub-user