Difference in historical data between Yahoo Finance webpage and yfinance output.

Open desolator-x opened this issue 10 months ago • 4 comments

Describe bug

Why is there a substantial difference between the historical stock prices (open/high/low/close) shown on the Yahoo Finance webpage and the output of yfinance? Only the "Adj Close" and "Volume" columns on the webpage seem to match the "Close" and "Volume" columns in the yfinance output.

For example, take a look at AAPL in January 2024:

https://finance.yahoo.com/quote/AAPL/history/?period1=1704067200&period2=1706745600

Compare those prices to the output of yfinance:


                                 Open        High         Low       Close    Volume
Date                                                                               
2024-01-02 00:00:00-05:00  186.033072  187.315382  182.792533  184.532089  82488700
2024-01-03 00:00:00-05:00  183.120556  184.770652  182.335262  183.150375  58414500
2024-01-04 00:00:00-05:00  181.062914  181.997307  179.800504  180.824356  71983600
2024-01-05 00:00:00-05:00  180.903872  181.669266  179.094727  180.098694  62303300
2024-01-08 00:00:00-05:00  181.003268  184.492330  180.416793  184.452560  59144500
2024-01-09 00:00:00-05:00  182.822345  184.045000  181.639444  184.035065  42841800
2024-01-10 00:00:00-05:00  183.249781  185.287535  182.822340  185.078796  46792900
2024-01-11 00:00:00-05:00  185.426703  185.933669  182.524132  184.482376  49128400
2024-01-12 00:00:00-05:00  184.949573  185.625523  184.084771  184.810410  40444700
2024-01-16 00:00:00-05:00  181.072860  183.160318  179.850190  182.534088  65603000
2024-01-17 00:00:00-05:00  180.188179  181.838260  179.223966  181.589752  47317400
2024-01-18 00:00:00-05:00  184.979407  188.011208  184.720965  187.504257  78005800
2024-01-19 00:00:00-05:00  188.200061  190.804420  187.693110  190.416748  68741000
2024-01-22 00:00:00-05:00  191.152327  194.164242  191.112557  192.732834  60133900
2024-01-23 00:00:00-05:00  193.856118  194.581757  192.673218  194.015152  42355600
2024-01-24 00:00:00-05:00  194.253710  195.207988  193.180154  193.339203  53631300
2024-01-25 00:00:00-05:00  194.054921  195.098658  191.957513  193.011185  54822100
2024-01-26 00:00:00-05:00  193.110602  193.597668  190.794506  191.271637  44594000
2024-01-29 00:00:00-05:00  190.864067  191.052935  188.448577  190.585739  47145600
2024-01-30 00:00:00-05:00  189.800457  190.655325  186.351165  186.917755  55859400
2024-01-31 00:00:00-05:00  185.923728  185.983383  183.249795  183.299484  55467800

Am I missing something here?

Simple code that reproduces your problem

import yfinance as yf
yf.enable_debug_mode()
ticker = yf.Ticker('AAPL')
historical_data = ticker.history(start='2024-01-01', end='2024-02-01')
print(historical_data[['Open', 'High', 'Low', 'Close', 'Volume']])

Debug log

DEBUG    Entering history()
DEBUG     Entering history()
DEBUG      AAPL: Yahoo GET parameters: {'period1': '2024-01-01 00:00:00-05:00', 'period2': '2024-02-01 00:00:00-05:00', 'interval': '1d', 'includePrePost': False, 'events': 'div,splits,capitalGains'}
DEBUG      Entering get()
DEBUG       Entering _make_request()
DEBUG        url=https://query2.finance.yahoo.com/v8/finance/chart/AAPL
DEBUG        params=frozendict.frozendict({'period1': 1704085200, 'period2': 1706763600, 'interval': '1d', 'includePrePost': False, 'events': 'div,splits,capitalGains'})
DEBUG        Entering _get_cookie_and_crumb()
DEBUG         cookie_mode = 'basic'
DEBUG         Entering _get_cookie_and_crumb_basic()
DEBUG          loaded persistent cookie
DEBUG          reusing cookie
DEBUG          crumb = 'b5nkRpIrG9d'
DEBUG         Exiting _get_cookie_and_crumb_basic()
DEBUG        Exiting _get_cookie_and_crumb()
DEBUG        response code=200
DEBUG       Exiting _make_request()
DEBUG      Exiting get()
DEBUG      AAPL: yfinance received OHLC data: 2024-01-02 14:30:00 -> 2024-01-31 14:30:00
DEBUG      AAPL: OHLC after cleaning: 2024-01-02 09:30:00-05:00 -> 2024-01-31 09:30:00-05:00
DEBUG      AAPL: OHLC after combining events: 2024-01-02 00:00:00-05:00 -> 2024-01-31 00:00:00-05:00
DEBUG      AAPL: yfinance returning OHLC: 2024-01-02 00:00:00-05:00 -> 2024-01-31 00:00:00-05:00
DEBUG     Exiting history()
DEBUG    Exiting history()
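
A side note on the timestamps in the log above: the `period1`/`period2` values differ from the ones in the webpage URL by exactly five hours, which is consistent with yfinance using midnight in the exchange timezone while the URL uses midnight UTC. The following is a minimal sketch to check that arithmetic. The helper name `to_epoch` is made up for illustration, and the fixed UTC-5 offset is taken from the log (it assumes no daylight saving time in this date range).

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-5 offset as shown in the debug log (assumption: no DST in January).
EST = timezone(timedelta(hours=-5))


def to_epoch(date_str, tz):
    """Return the Unix timestamp of midnight on date_str in timezone tz."""
    midnight = datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=tz)
    return int(midnight.timestamp())


print(to_epoch("2024-01-01", EST))           # 1704085200, period1 in the log
print(to_epoch("2024-02-01", EST))           # 1706763600, period2 in the log
print(to_epoch("2024-01-01", timezone.utc))  # 1704067200, period1 in the webpage URL
```

Both requests cover the same trading days, so this offset does not explain the price difference.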

Bad data proof

Image

yfinance version

0.2.54

Python version

3.12.7

Operating system

Ubuntu 24.10 6.11.0-18-generic

desolator-x avatar Feb 20 '25 10:02 desolator-x

I can confirm this. For example, here is the data for BIL, which does not match what the web interface shows:

[*********************100%***********************]  1 of 1 completed
Price           Close       High        Low       Open     Volume
Ticker            BIL        BIL        BIL        BIL        BIL
Date                                                             
2007-05-01  74.576759  74.576759  74.495345  74.560479      12750
2007-06-01  74.902405  74.902405  74.593028  74.593028     145050
2007-07-01  74.869835  74.902402  74.527891  74.593025     446600
2007-08-01  75.218086  75.316214  74.809203  74.890979    1195200
2007-09-01  75.374123  75.489072  75.160643  75.226330     823300
...               ...        ...        ...        ...        ...
2024-08-01  89.277412  89.287138  88.878770  88.878770  192025600
2024-09-01  89.660309  89.670077  89.298977  89.621245  156932000
2024-10-01  90.038872  90.038872  89.695580  89.705391  140940000
2024-11-01  90.384491  90.404191  90.079207  90.089057  150196500
2024-12-01  90.393311  90.630587  90.254899  90.432858  155048200

andrewmed avatar Feb 20 '25 21:02 andrewmed

Only the "Adj Close" and "Volume" columns on the webpage seem to be identical to the "Close" and "Volume" columns in the yfinance output.

Maybe there's a clue there?

https://ranaroussi.github.io/yfinance/reference/index.html

ValueRaider avatar Feb 20 '25 22:02 ValueRaider

It turns out that setting auto_adjust=False returns OHLC data identical to what the Yahoo Finance webpage shows:

import yfinance as yf
yf.enable_debug_mode()
ticker = yf.Ticker('AAPL')
historical_data = ticker.history(start='2024-01-01', end='2024-02-01', auto_adjust=False)
print(historical_data[['Open', 'High', 'Low', 'Close', 'Adj Close', 'Volume']])

Why does yfinance adjust all Open/High/Low/Close data by default, while Yahoo Finance only adjusts Close data?

desolator-x avatar Feb 21 '25 09:02 desolator-x
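
For readers wondering how the two outputs relate: the observation that the default "Close" equals the webpage's "Adj Close" suggests that each price column is scaled by the ratio Adj Close / Close. Below is a rough sketch of that idea in plain Python. It is an assumption about the general technique, not a copy of yfinance's internals; the function name `auto_adjust_row` and the numbers are hypothetical and not real quotes.

```python
def auto_adjust_row(row):
    """Scale Open/High/Low/Close by the ratio Adj Close / Close.

    Sketch only: assumes the adjustment is a single per-row ratio.
    """
    ratio = row["Adj Close"] / row["Close"]
    adjusted = {col: row[col] * ratio for col in ("Open", "High", "Low", "Close")}
    adjusted["Volume"] = row["Volume"]  # volume is left unchanged in this sketch
    return adjusted


# Hypothetical unadjusted row (illustrative numbers only).
raw = {"Open": 101.0, "High": 102.0, "Low": 99.0,
       "Close": 100.0, "Adj Close": 99.5, "Volume": 1000}

adj = auto_adjust_row(raw)
print(adj)  # the adjusted Close now equals the original Adj Close
```

With a DataFrame returned by `history(auto_adjust=False)`, the same ratio could be applied column-wise to compare against the default output.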

Adjusted data is safer: it has no artificial price drops on ex-dividend dates.

ValueRaider avatar Feb 21 '25 10:02 ValueRaider
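
To illustrate the ex-dividend point, here is a small sketch with made-up numbers. It assumes a common back-adjustment convention in which prices before the ex-dividend date are multiplied by 1 - dividend / previous close; whether Yahoo or yfinance use exactly this formula is not confirmed in this thread.

```python
def simple_return(prev_close, close):
    """One-period simple return between two closing prices."""
    return close / prev_close - 1.0


# Hypothetical scenario: the stock closes at 100.00, then goes ex-dividend
# with a 1.00 payout and closes at 99.00 the next day.
prev_close, dividend, ex_close = 100.0, 1.0, 99.0

# Unadjusted prices show a drop of about 1% that is only the dividend leaving the price.
raw_return = simple_return(prev_close, ex_close)

# Assumed back-adjustment factor applied to prices before the ex-dividend date.
factor = 1.0 - dividend / prev_close
adj_prev_close = prev_close * factor

# With the earlier price scaled down, the return across the ex-date is about zero.
adj_return = simple_return(adj_prev_close, ex_close)

print(f"raw return: {raw_return:.4f}, adjusted return: {adj_return:.4f}")
```

This is why return calculations or backtests run on unadjusted OHLC data can register a loss on ex-dividend dates that a shareholder did not actually incur.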