stockdex
Add date headers to financial statements
It would be helpful to add date headers to the financial statements returned in the Ticker class.
A crude and inelegant fix:
import re

import pandas as pd
from bs4 import BeautifulSoup

# TickerAPI and JustETF are stockdex's own base classes


class Ticker(TickerAPI, JustETF):
    """
    Class for the Ticker
    """

    ...

    @property
    def income_stmt(self) -> pd.DataFrame:
        """
        Get income statement for the ticker

        Returns:
        pd.DataFrame: A pandas DataFrame including the income statement
        visible in the Yahoo Finance statistics page for the ticker
        """
        # URL of the website to scrape
        url = f"https://finance.yahoo.com/quote/{self.ticker}/financials"
        response = self.get_response(url)

        # Parse the HTML content of the website
        soup = BeautifulSoup(response.content, "html.parser")

        # <div class="" data-test="fin-row">
        raw_data = soup.find_all("div", {"data-test": "fin-row"})

        data_df = pd.DataFrame()
        dates = []

        for item in raw_data:
            if not dates:
                # collect date headers for data_df (only once, on the first row)
                # e.g., ['12/31/2023', '12/31/2022', '12/31/2021', '12/31/2020']
                dates = re.findall(
                    pattern=r"\d+/\d+/\d{4}", string=item.parent.parent.text
                )
                assert (
                    len(dates) == 4
                ), f"len(dates)={len(dates)}, should be 4 (dates={', '.join(dates)})"
                # column for the most recent period gets collected twice below
                # i.e., ['12/31/2023', '12/31/2023', '12/31/2022', '12/31/2021', '12/31/2020']
                dates = [dates[0]] + dates

            # get criteria. e.g. "Total Revenue"
            major_div = item.find_all("div", {"class": True})[0]
            criteria = major_div.find_all("div", {"class": True})[0].find("span").text

            # get data. e.g. "274515000000"
            minor_div = major_div.find_all("div", {"class": "Ta(c)"})
            data_list = []
            for div in minor_div:
                data_list.append(div.text)

            data_df[criteria] = data_list

        data_df = data_df.T
        # update columns to reflect the date headers
        data_df.columns = dates

        return data_df
I should have noted that the second column for the most recent period repeats the first with adjustments, if any. Typically, although not always, these are minor GAAP- or audit-related adjustments.
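The header handling in the patch above can be pulled out into a small standalone helper, which makes it easier to test without hitting Yahoo Finance. The following is only a sketch: `build_date_headers`, its `expected` parameter, and the `" (adjusted)"` suffix are hypothetical names chosen for illustration, not part of stockdex. It raises `ValueError` instead of using `assert` (assertions are skipped when Python runs with `-O`), and it gives the extra column its own label so the column names stay unique.

```python
import re

# Same idea as the pattern in the patch: dates such as 12/31/2023
DATE_PATTERN = re.compile(r"\d{1,2}/\d{1,2}/\d{4}")


def build_date_headers(header_text: str, expected: int = 4) -> list[str]:
    """Hypothetical helper: turn the scraped header text into column labels."""
    dates = DATE_PATTERN.findall(header_text)
    if len(dates) != expected:
        raise ValueError(
            f"expected {expected} date headers, found {len(dates)}: {dates}"
        )
    # The most recent period is collected twice; the second copy carries the
    # adjustments, so it gets a distinct (assumed) label instead of a duplicate.
    return [dates[0], f"{dates[0]} (adjusted)"] + dates[1:]


headers = build_date_headers("Breakdown 12/31/2023 12/31/2022 12/31/2021 12/31/2020")
print(headers)
```

Unique labels matter because selecting a duplicated column label from a pandas DataFrame returns a DataFrame rather than a single Series, which tends to surprise downstream code.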
The date is set as the index of the DataFrames returned by the Yahoo modules. You can simply transpose the returned DataFrame to get the dates as column headers instead of as the index. Check out the following example:
from stockdex import Ticker
ticker = Ticker('AAPL')
print(ticker.yahoo_api_income_statement().T)
The result will look like this:
                                                    2020-09-30  2021-09-30  2022-09-30  2023-09-30
annualNetIncomeContinuousOperations                     57.41B      94.68B      99.80B      97.00B
annualTaxEffectOfUnusualItems                             0.00        0.00        0.00        0.00
annualReconciledCostOfRevenue                          169.56B     212.98B     223.55B     214.14B
annualNetIncomeFromContinuingOperationNetMinori...      57.41B      94.68B      99.80B      97.00B
annualTotalOperatingIncomeAsReported                    66.29B     108.95B     119.44B     114.30B
annualTaxRateForCalcs                                     0.14        0.13        0.16        0.15
annualBasicAverageShares                                17.35B      16.70B      16.22B      15.74B
annualReconciledDepreciation                            11.06B      11.28B      11.10B      11.52B
annualNetIncomeCommonStockholders                       57.41B      94.68B      99.80B      97.00B
annualDilutedAverageShares                              17.53B      16.86B      16.33B      15.81B
annualEBIT                                              69.96B     111.85B     122.03B     117.67B
annualNetIncome                                         57.41B      94.68B      99.80B      97.00B
annualNormalizedIncome                                  57.41B      94.68B      99.80B      97.00B
annualOperatingExpense                                  38.67B      43.89B      51.34B      54.85B
annualTotalRevenue                                     274.51B     365.82B     394.33B     383.29B
annualResearchAndDevelopment                            18.75B      21.91B      26.25B      29.91B
annualCostOfRevenue                                    169.56B     212.98B     223.55B     214.14B
annualPretaxIncome                                      67.09B     109.21B     119.10B     113.74B
annualInterestExpense                                    2.87B       2.65B       2.93B       3.93B
annualDilutedNIAvailtoComStockholders                   57.41B      94.68B      99.80B      97.00B
annualTaxProvision                                       9.68B      14.53B      19.30B      16.74B
annualNetIncomeIncludingNoncontrollingInterests         57.41B      94.68B      99.80B      97.00B
annualInterestIncomeNonOperating                         3.76B       2.84B       2.83B       3.75B
annualNormalizedEBITDA                                  81.02B     123.14B     133.14B     129.19B
annualGrossProfit                                      104.96B     152.84B     170.78B     169.15B
annualBasicEPS                                            3.31        5.67        6.15        6.16
annualOtherNonOperatingIncomeExpenses                  -87.00M      60.00M    -228.00M    -382.00M
annualSellingGeneralAndAdministration                   19.92B      21.97B      25.09B      24.93B
annualOtherIncomeExpense                               -87.00M      60.00M    -228.00M    -382.00M
annualOperatingIncome                                   66.29B     108.95B     119.44B     114.30B
annualNetIncomeFromContinuingAndDiscontinuedOpe...      57.41B      94.68B      99.80B      97.00B
annualNetInterestIncome                                890.00M     198.00M    -106.00M    -183.00M
annualOperatingRevenue                                 274.51B     365.82B     394.33B     383.29B
annualEBITDA                                            81.02B     123.14B     133.14B     129.19B
annualInterestIncome                                     3.76B       2.84B       2.83B       3.75B
annualTotalExpenses                                    208.23B     256.87B     274.89B     268.98B
annualInterestExpenseNonOperating                        2.87B       2.65B       2.93B       3.93B
annualDilutedEPS                                          3.28        5.61        6.11        6.13
annualNetNonOperatingInterestIncomeExpense             890.00M     198.00M    -106.00M    -183.00M
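For anyone who wants the transposed frame to resemble the layout on the Yahoo Finance page (most recent period first, `MM/DD/YYYY` headers), here is a rough sketch on a toy DataFrame. The toy frame stands in for the real return value and reuses two rows from the output above; it assumes the index is a pandas `DatetimeIndex`, which may not match what the library actually returns, and the variable names are made up for illustration.

```python
import pandas as pd

# Toy stand-in for the DataFrame returned by the Yahoo modules:
# dates as the index, one column per line item (assumed layout).
income = pd.DataFrame(
    {
        "annualTotalRevenue": ["274.51B", "365.82B", "394.33B", "383.29B"],
        "annualNetIncome": ["57.41B", "94.68B", "99.80B", "97.00B"],
    },
    index=pd.to_datetime(["2020-09-30", "2021-09-30", "2022-09-30", "2023-09-30"]),
)

# Transpose so line items become rows and dates become column headers
by_item = income.T

# Optional: render the headers as MM/DD/YYYY strings
by_item.columns = by_item.columns.strftime("%m/%d/%Y")

# Optional: reverse the column order so the most recent period comes first
latest_first = by_item.iloc[:, ::-1]

print(latest_first)
```

If the real index turns out to hold plain strings rather than timestamps, the `strftime` step would need a `pd.to_datetime` conversion first.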