Add date headers to financial statements

Open sgreen7979 opened this issue 1 year ago • 1 comments

It would be helpful to add date headers to the financial statements returned in the Ticker class.

A crude and inelegant fix:


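import re

import pandas as pd
from bs4 import BeautifulSoup

# TickerAPI and JustETF are the existing stockdex base classes (imports omitted)
class Ticker(TickerAPI, JustETF):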
    """
    Class for the Ticker
    """
    ...

    @property
    def income_stmt(self) -> pd.DataFrame:
        """
        Get income statement for the ticker

        Returns:
        pd.DataFrame: A pandas DataFrame including the income statement
        visible in the Yahoo Finance statistics page for the ticker
        """

        # URL of the website to scrape
        url = f"https://finance.yahoo.com/quote/{self.ticker}/financials"
        response = self.get_response(url)

        # Parse the HTML content of the website
        soup = BeautifulSoup(response.content, "html.parser")

        # <div class="" data-test="fin-row">
        raw_data = soup.find_all("div", {"data-test": "fin-row"})

        data_df = pd.DataFrame()
        dates = []
        for item in raw_data:

            if not dates:
                # collect date headers for data_df
                # e.g., ['12/31/2023', '12/31/2022', '12/31/2021', '12/31/2020']
                dates = re.findall(
                    pattern=r"\d+/\d+/\d{4}", string=item.parent.parent.text
                )
                assert (
                    len(dates) == 4
                ), f"len(dates)={len(dates)}, should be 4 (dates={', '.join(dates)})"

                # column for the most recent period gets collected twice below
                # i.e., ['12/31/2023', '12/31/2023',  '12/31/2022', '12/31/2021', '12/31/2020']
                dates = [dates[0]] + dates

            # get criteria. e.g. "Total Revenue"
            major_div = item.find_all("div", {"class": True})[0]
            criteria = major_div.find_all("div", {"class": True})[0].find("span").text

            # get data. e.g. "274515000000"
            minor_div = major_div.find_all("div", {"class": "Ta(c)"})
            data_list = []
            for div in minor_div:
                data_list.append(div.text)

            data_df[criteria] = data_list

        data_df = data_df.T
        
        # update columns to reflect the date headers
        data_df.columns = dates

        return data_df
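
The header-extraction step can be tried in isolation. Below is a minimal sketch, assuming the header row's text looks like the hypothetical `header_text` string (the real Yahoo Finance markup may differ). `extract_date_headers` is an illustrative helper name, not part of stockdex. It uses the same regex as the snippet above, but raises an error when no dates are found instead of asserting a fixed count of four. Because the most recent period is collected twice, the second copy gets a made-up ` (2)` suffix so the column labels stay unique.

```python
import re

DATE_PATTERN = re.compile(r"\d+/\d+/\d{4}")


def extract_date_headers(header_text: str, duplicate_first: bool = True) -> list[str]:
    """Illustrative helper (not part of stockdex): pull m/d/yyyy dates from header text."""
    dates = DATE_PATTERN.findall(header_text)
    if not dates:
        raise ValueError("no date headers found")
    if duplicate_first:
        # The most recent period is collected twice; suffix the second copy
        # (hypothetical label) so the column names stay unique.
        dates = [dates[0], f"{dates[0]} (2)"] + dates[1:]
    return dates


# Hypothetical header text; the real page markup may differ.
header_text = "Breakdown 12/31/2023 12/31/2022 12/31/2021 12/31/2020"
headers = extract_date_headers(header_text)
print(headers)
```

Unique labels matter because selecting a duplicated column name in pandas returns a DataFrame rather than a single Series, which tends to surprise downstream code.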

sgreen7979 avatar Apr 03 '24 23:04 sgreen7979

I should have noted that the second column for the most recent period repeats the first with adjustments applied, if any. Typically, though not always, these are minor GAAP- or audit-related adjustments.

sgreen7979 avatar Apr 05 '24 23:04 sgreen7979

Dates are set as the index of the DataFrames returned by the Yahoo modules. You can transpose the returned DataFrame to get the dates as column headers instead of as the index. Check out the following example:

from stockdex import Ticker

ticker = Ticker('AAPL')

print(ticker.yahoo_api_income_statement().T)

The result will look like this:

                                                   2020-09-30 2021-09-30 2022-09-30 2023-09-30
annualNetIncomeContinuousOperations                    57.41B     94.68B     99.80B     97.00B
annualTaxEffectOfUnusualItems                            0.00       0.00       0.00       0.00
annualReconciledCostOfRevenue                         169.56B    212.98B    223.55B    214.14B
annualNetIncomeFromContinuingOperationNetMinori...     57.41B     94.68B     99.80B     97.00B
annualTotalOperatingIncomeAsReported                   66.29B    108.95B    119.44B    114.30B
annualTaxRateForCalcs                                    0.14       0.13       0.16       0.15
annualBasicAverageShares                               17.35B     16.70B     16.22B     15.74B
annualReconciledDepreciation                           11.06B     11.28B     11.10B     11.52B
annualNetIncomeCommonStockholders                      57.41B     94.68B     99.80B     97.00B
annualDilutedAverageShares                             17.53B     16.86B     16.33B     15.81B
annualEBIT                                             69.96B    111.85B    122.03B    117.67B
annualNetIncome                                        57.41B     94.68B     99.80B     97.00B
annualNormalizedIncome                                 57.41B     94.68B     99.80B     97.00B
annualOperatingExpense                                 38.67B     43.89B     51.34B     54.85B
annualTotalRevenue                                    274.51B    365.82B    394.33B    383.29B
annualResearchAndDevelopment                           18.75B     21.91B     26.25B     29.91B
annualCostOfRevenue                                   169.56B    212.98B    223.55B    214.14B
annualPretaxIncome                                     67.09B    109.21B    119.10B    113.74B
annualInterestExpense                                   2.87B      2.65B      2.93B      3.93B
annualDilutedNIAvailtoComStockholders                  57.41B     94.68B     99.80B     97.00B
annualTaxProvision                                      9.68B     14.53B     19.30B     16.74B
annualNetIncomeIncludingNoncontrollingInterests        57.41B     94.68B     99.80B     97.00B
annualInterestIncomeNonOperating                        3.76B      2.84B      2.83B      3.75B
annualNormalizedEBITDA                                 81.02B    123.14B    133.14B    129.19B
annualGrossProfit                                     104.96B    152.84B    170.78B    169.15B
annualBasicEPS                                           3.31       5.67       6.15       6.16
annualOtherNonOperatingIncomeExpenses                 -87.00M     60.00M   -228.00M   -382.00M
annualSellingGeneralAndAdministration                  19.92B     21.97B     25.09B     24.93B
annualOtherIncomeExpense                              -87.00M     60.00M   -228.00M   -382.00M
annualOperatingIncome                                  66.29B    108.95B    119.44B    114.30B
annualNetIncomeFromContinuingAndDiscontinuedOpe...     57.41B     94.68B     99.80B     97.00B
annualNetInterestIncome                               890.00M    198.00M   -106.00M   -183.00M
annualOperatingRevenue                                274.51B    365.82B    394.33B    383.29B
annualEBITDA                                           81.02B    123.14B    133.14B    129.19B
annualInterestIncome                                    3.76B      2.84B      2.83B      3.75B
annualTotalExpenses                                   208.23B    256.87B    274.89B    268.98B
annualInterestExpenseNonOperating                       2.87B      2.65B      2.93B      3.93B
annualDilutedEPS                                         3.28       5.61       6.11       6.13
annualNetNonOperatingInterestIncomeExpense            890.00M    198.00M   -106.00M   -183.00M
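
To try the transpose without a network call, here is a small sketch using a hand-built frame with two of the rows from the output above. It assumes the returned index is a `DatetimeIndex`; if the library returns plain strings, the `strftime` step would not apply. The m/d/Y format and newest-first ordering are only illustrative choices that mirror the date style requested in the issue.

```python
import pandas as pd

# Hand-built stand-in for the returned frame: dates as the index, one column
# per line item (values copied from the output above). The DatetimeIndex is
# an assumption about the library's return type.
df = pd.DataFrame(
    {
        "annualTotalRevenue": ["274.51B", "365.82B", "394.33B", "383.29B"],
        "annualNetIncome": ["57.41B", "94.68B", "99.80B", "97.00B"],
    },
    index=pd.to_datetime(["2020-09-30", "2021-09-30", "2022-09-30", "2023-09-30"]),
)

# Line items become rows, dates become column headers
transposed = df.T

# Optional: render the headers as m/d/Y strings and put the newest period first
transposed.columns = transposed.columns.strftime("%m/%d/%Y")
transposed = transposed[transposed.columns[::-1]]
print(transposed)
```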

ahnazary avatar Aug 20 '24 14:08 ahnazary