Incorrect labeling of dates while fetching data

Open jazarija opened this issue 4 years ago • 2 comments

I am trying to get the price data for CCL.AX for the last few days in the following way:

hsp = yf.get_historical_price_data('2019-10-16', '2019-10-18', 'daily')
{'CCL.AX': {'eventsData': {},
 'firstTradeDate': {'formatted_date': '1988-01-28', 'date': 570398400},
 'currency': 'AUD',
 'instrumentType': 'EQUITY',
 'timeZone': {'gmtOffset': 39600},
 'prices': [{'date': 1571180400,
   'high': 10.680000305175781,
   'low': 10.460000038146973,
   'open': 10.5600004196167,
   'close': 10.630000114440918,
   'volume': 2166314,
   'adjclose': 10.630000114440918,
   'formatted_date': '2019-10-15'},
  {'date': 1571266800,
   'high': 10.619999885559082,
   'low': 10.34000015258789,
   'open': 10.609999656677246,
   'close': 10.359999656677246,
   'volume': 4474727,
   'adjclose': 10.359999656677246,
   'formatted_date': '2019-10-16'},
  {'date': 1571375407,
   'high': 10.569999694824219,
   'low': 10.350000381469727,
   'open': 10.40999984741211,
   'close': 10.390000343322754,
   'volume': 2578999,
   'adjclose': 10.390000343322754,
   'formatted_date': '2019-10-18'}]}}

Notice how the price for Oct 17 is missing and the first entry listed is dated Oct 15. In fact, the price data labeled Oct 16 is the data corresponding to Oct 17!

What is causing this issue? Ideally, I'd like trade days to be dated consistently with the dates on which they occurred on the underlying exchange.

jazarija, Oct 18 '19 08:10

@jazarija

That datetime value comes from the Yahoo Finance API itself.

If you look at the raw date value of the middle item in your list, 1571266800, it translates to 2019-10-16 23:00:00 UTC. On my end, I standardize all timestamps to UTC, as you can see in lines 112 through 129 of yahoofinancials/__init__.py (functions format_date and _convert_to_utc).

Even if the root issue were the converter mistaking some UTC timestamps for EST timestamps, 4 hours would simply be added to the misidentified values, which would make the 'formatted_date' value for that item '2019-10-17 03:00:00' instead.
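To make the arithmetic above concrete, here is a minimal sketch using only the Python standard library. It is not the library's own converter (format_date and _convert_to_utc are not reproduced here); it only checks the two datetimes quoted above.

```python
from datetime import datetime, timedelta, timezone

raw_date = 1571266800  # 'date' of the middle item in the list above

# Interpret the epoch seconds as UTC.
as_utc = datetime.fromtimestamp(raw_date, tz=timezone.utc)
print(as_utc.strftime('%Y-%m-%d %H:%M:%S'))  # 2019-10-16 23:00:00

# Hypothetical worst case: the value is mistaken for EST and 4 hours are added.
shifted = as_utc + timedelta(hours=4)
print(shifted.strftime('%Y-%m-%d %H:%M:%S'))  # 2019-10-17 03:00:00
```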

My guess is that this is a timezone issue on Yahoo Finance's side. I can look into it more when I have time. However, when I ran the same code on my local machine, the following was returned:

{'CCL.AX': {'eventsData': {},
    'prices': [
        {
             'date': 1571180400, 
             'open': 10.5600004196167, 
             'high': 10.680000305175781, 
             'low': 10.460000038146973, 
             'close': 10.630000114440918, 
             'formatted_date': '2019-10-15',
             'volume': 2166314, 
             'adjclose': 10.630000114440918}, 
        {
            'date': 1571266800,
            'open': 10.609999656677246, 
            'high': 10.619999885559082,
            'low': 10.34000015258789, 
            'close': 10.359999656677246, 
            'formatted_date': '2019-10-16', 
            'volume': 4474727, 
            'adjclose': 10.359999656677246
        }, 
        {
            'date': 1571353200, 
            'open': 10.40999984741211, 
            'high': 10.569999694824219, 
            'low': 10.350000381469727, 
            'close': 10.390000343322754, 
            'formatted_date': '2019-10-17', 
            'volume': 2918973,
            'adjclose': 10.390000343322754
        }],
    'timeZone': {'gmtOffset': 39600},
    'currency': 'AUD',
    'firstTradeDate': {'date': 570398400, 'formatted_date': '1988-01-28'},
    'instrumentType': 'EQUITY'}}

The dates from my machine seem to line up with your expectations. Additionally, we both get the same gmtOffset value back from Yahoo, 39600.
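As a quick way to test the timezone guess, here is a minimal sketch that applies the returned gmtOffset to the raw 'date' values and compares the UTC calendar date with the exchange-local one. The helper exchange_local_date is hypothetical (it is not part of yahoofinancials), and the sketch assumes gmtOffset is the exchange's offset from UTC in seconds.

```python
from datetime import datetime, timedelta, timezone


def exchange_local_date(epoch_seconds, gmt_offset_seconds):
    """Return the calendar date (YYYY-MM-DD) in the exchange's local time.

    Hypothetical helper, not part of yahoofinancials. Assumes the offset
    is the exchange's offset from UTC, in seconds.
    """
    local_tz = timezone(timedelta(seconds=gmt_offset_seconds))
    return datetime.fromtimestamp(epoch_seconds, tz=local_tz).strftime('%Y-%m-%d')


gmt_offset = 39600  # from 'timeZone': {'gmtOffset': 39600}
for raw_date in (1571180400, 1571266800, 1571353200):
    utc_date = datetime.fromtimestamp(raw_date, tz=timezone.utc).strftime('%Y-%m-%d')
    # Print the UTC date next to the exchange-local date for the same timestamp.
    print(utc_date, '->', exchange_local_date(raw_date, gmt_offset))
```

If that assumption holds, each of these timestamps falls one calendar day later in exchange-local time than in UTC, which could explain a one-day shift in the labels for this ticker.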

I'd be curious to know whether you get the same data as I did if you run that function again now. This discrepancy may be caused by Yahoo returning an immature data record. Based on the date of your issue, I assume you ran this function ~4 days ago (prior to 2019-10-19 00:00:00 UTC, I am guessing). In that case, the record for 10/18/2019 may not yet have been complete on Yahoo's end, resulting in a pre-revision, incomplete observation being returned.

JECSand, Oct 22 '19 11:10

I can confirm that I also cannot replicate this specific issue right now. However, there are many spooky inconsistencies that seem to be coming from Yahoo's financial data. For example, it's quite common to get duplicated entries such as:

In [30]: yf.get_historical_price_data('1900-01-01', '2019-10-26', 'daily')['SKIS']['prices'][-2:]
Out[30]:
[{'date': 1569331800,
  'high': 11.0,
  'low': 10.989999771118164,
  'open': 10.989999771118164,
  'close': 11.0,
  'volume': 7500,
  'adjclose': 11.0,
  'formatted_date': '2019-09-24'},
 {'date': 1569355201,
  'high': 11.0,
  'low': 10.989999771118164,
  'open': 10.989999771118164,
  'close': 11.0,
  'volume': 7536,
  'adjclose': 11.0,
  'formatted_date': '2019-09-24'}]

I am fetching data for a ton of tickers, and such inconsistencies (i.e., duplicated rows) are everywhere. Any idea how to handle this sensibly? That is, is there a time when Yahoo's financial data is supposed to be in sync?
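A minimal sketch of one possible workaround: collapse rows that share a 'formatted_date' and keep the one with the latest raw 'date'. The helper dedupe_prices is hypothetical (not part of yahoofinancials), and treating the later row as the more complete one is an assumption, not something Yahoo documents. The sample rows are the SKIS entries above, trimmed to a few fields.

```python
def dedupe_prices(prices):
    """Keep one row per 'formatted_date', preferring the latest 'date'.

    Hypothetical helper, not part of yahoofinancials. Assumes the row with
    the later timestamp is the more complete observation.
    """
    latest = {}
    for row in prices:
        key = row['formatted_date']
        if key not in latest or row['date'] > latest[key]['date']:
            latest[key] = row
    # Return the surviving rows in chronological order.
    return sorted(latest.values(), key=lambda row: row['date'])


rows = [
    {'date': 1569331800, 'close': 11.0, 'volume': 7500, 'formatted_date': '2019-09-24'},
    {'date': 1569355201, 'close': 11.0, 'volume': 7536, 'formatted_date': '2019-09-24'},
]
print(dedupe_prices(rows))
```

This only hides the symptom; it does not say which of the two rows Yahoo considers final.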

jazarija, Oct 26 '19 07:10