yahooquery
BUG: Ticker.formatted=False formats some dates as strings
Describe the bug
Documentation indicates that using Ticker(formatted=False) or Ticker.formatted = False provides raw data. However, some timestamps are formatted into strings instead of being provided as raw data.
To Reproduce
Steps to reproduce the behavior:
import yahooquery

stock = 'ZM'
yqt = yahooquery.Ticker([stock], formatted=False)
price_data = yqt.price
# Print the type and value of each timestamp field that is present
for key in ['regularMarketTime', 'postMarketTime', 'preMarketTime']:
    if key in price_data[stock]:
        print(f"{key: <17s} is type {type(price_data[stock][key])} with value: {price_data[stock][key]}")
Output of the above code:
regularMarketTime is type <class 'str'> value: 2021-07-26 14:00:02
postMarketTime is type <class 'int'> value: 1627331433
preMarketTime is type <class 'str'> value: 2021-07-26 07:29:58
(Note that sometimes one of the above timestamps will be missing, which appears to be a normal effect of the Yahoo API and not related to this library.)
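The mixed types shown in the output above can also be detected programmatically. The following is a minimal sketch: the helper name `non_raw_timestamps` and the `sample` dictionary are illustrative assumptions built from the output above, not part of yahooquery.

```python
# Timestamp fields examined in this report.
TIME_KEYS = ("regularMarketTime", "postMarketTime", "preMarketTime")


def non_raw_timestamps(record: dict) -> dict:
    """Return the timestamp fields in `record` whose values are not raw ints.

    Missing keys are skipped, since the API sometimes omits a field.
    """
    return {
        key: type(record[key]).__name__
        for key in TIME_KEYS
        if key in record and not isinstance(record[key], int)
    }


# Sample record copied from the output above; a hypothetical stand-in
# for price_data[stock].
sample = {
    "regularMarketTime": "2021-07-26 14:00:02",
    "postMarketTime": 1627331433,
    "preMarketTime": "2021-07-26 07:29:58",
}

print(non_raw_timestamps(sample))
# {'regularMarketTime': 'str', 'preMarketTime': 'str'}
```

An empty result would mean every timestamp present in the record is a raw integer.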
Expected behavior
I would expect that with formatted=False, all timestamps are int data types (raw data) not a mixture of int (raw) and str (formatted) data.
A string timestamp is useful for display (formatted data), but is of little use to Python code that examines date-time values. An integer timestamp is expected in this case, as it can be used directly or easily converted into a Python datetime or pandas Timestamp object. The fact that some dates remain integers while other dates are formatted as strings makes this bug particularly irritating to work with.
The formatted strings are particularly irritating because they have been converted into the local timezone, not a timezone representative of the market in question or UTC.
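To illustrate why a raw integer is easier to work with, here is a small sketch that converts the postMarketTime value from the output above into a timezone-aware datetime, using only the standard library. The function name `epoch_to_utc` is an illustrative assumption, and the sketch assumes the integer is a Unix timestamp in seconds.

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds: int) -> datetime:
    """Convert a Unix timestamp (seconds) to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# postMarketTime from the output above.
stamp = epoch_to_utc(1627331433)
print(stamp.isoformat())  # 2021-07-26T20:30:33+00:00
```

Because the result carries its timezone, it can be converted unambiguously to any other zone, for example with `stamp.astimezone(zoneinfo.ZoneInfo("America/New_York"))` on Python 3.9+. In pandas, `pd.Timestamp(1627331433, unit="s", tz="UTC")` should give the equivalent value. A string that was already rendered in the local timezone offers no such guarantee.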
Environment
- OS: macOS Big Sur (11.2.1)
- Python: 3.9.5
- yahooquery: 2.2.15
Additional Context
URL generated for the above example of this bug:
https://query2.finance.yahoo.com/v10/finance/quoteSummary/ZM?modules=price&formatted=false&lang=en-US&region=US&corsDomain=finance.yahoo.com
Note that all time values in the above URL are integer values.
For anyone who also needs a workaround for this bug, here is the fix I have used in my code, which relies on pandas, tzlocal and pytz:
from __future__ import annotations

import pandas as pd
import pytz
import tzlocal


def convert_datetime(value: str | int) -> pd.Timestamp:
    """Convert string/integer timestamps to pandas Timestamp objects."""
    # Use US/Eastern so the result represents the market's time and not local time
    market = pytz.timezone("US/Eastern")
    if isinstance(value, str):
        # String timestamps were formatted in the local timezone
        local = tzlocal.get_localzone()
        dt = pd.Timestamp(value, tz=local).tz_convert(market)
    elif isinstance(value, int):
        # Integer timestamps are seconds since the Unix epoch (UTC)
        dt = pd.Timestamp(value, unit="s", tz="UTC").tz_convert(market)
    else:
        # Just in case another possibility appears, return the provided value
        dt = value
    # Return the timestamp expressed in the market's timezone
    return dt
It is not ideal that the target timezone is hard-coded into the function; however, my project only deals with US markets...
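One way to avoid the hard-coded zone is to pass the target timezone as a parameter. The sketch below is a standard-library variant of the workaround above, not the code from my project: the names `normalize_timestamp` and `STRING_FORMAT` are hypothetical, and it assumes that string values use the format shown in the output above and were rendered in the system's local timezone.

```python
from datetime import datetime, timezone, tzinfo
from typing import Union

# Format of the string timestamps seen in the output above.
STRING_FORMAT = "%Y-%m-%d %H:%M:%S"


def normalize_timestamp(
    value: Union[str, int], target: tzinfo = timezone.utc
) -> datetime:
    """Return `value` as a timezone-aware datetime in the `target` zone."""
    if isinstance(value, str):
        # Assumption: the string was formatted in the system's local timezone.
        naive = datetime.strptime(value, STRING_FORMAT)
        # astimezone() treats a naive datetime as system local time.
        return naive.astimezone(target)
    if isinstance(value, int):
        # Raw values are taken to be seconds since the Unix epoch.
        return datetime.fromtimestamp(value, tz=target)
    raise TypeError(f"unsupported timestamp type: {type(value).__name__}")


print(normalize_timestamp(1627331433).isoformat())  # 2021-07-26T20:30:33+00:00
```

Passing `target=zoneinfo.ZoneInfo("America/New_York")` (Python 3.9+) would give US market time. Unlike the original workaround, this sketch raises on unexpected types rather than returning the value unchanged, so a new format does not pass through unnoticed.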
I only solved part of this issue with #117, as I need to think more about how to deal with timestamps here (the library probably shouldn't be using the local timezone).