Missing Data

Open JeremyWhittaker opened this issue 1 year ago • 3 comments

I'm pulling financial data for the symbol LEN. My DataFrame is below, along with a chart of revenues. Chunks of data are missing. For small companies this might be normal, since they may not have released updated quarterly financials, but this company did release the data. I verified that against Yahoo Finance (screenshot also below). Are there going to be a lot of companies where your data isn't accurate like this? It makes analyzing aggregate data incredibly difficult.

The Python function I used to fetch this data is here:

import logging
import time

from polygon import RESTClient


def fetch_fundamental_data(api_key, stock_ticker, filing_date_gte, license_type="free"):
    """
    Fetch financial statement data for a given stock ticker using Polygon API.

    Parameters:
        api_key (str): Your Polygon API key.
        stock_ticker (str): Stock ticker symbol.
        filing_date_gte (str): Start date for the filing date filter.
        license_type (str): Type of license ("free" or "paid")

    Returns:
        list: A list containing the financial statement data, or None if nothing was returned.
    """
    # Create a REST client and authenticate with the API key
    client = RESTClient(api_key)

    while True:
        # Start from an empty list on every attempt so a retry does not duplicate records
        data = []
        request_count = 0
        try:
            # Fetch the financial statement data and add it to the list
            for t in client.vx.list_stock_financials(ticker=stock_ticker, filing_date_gte=filing_date_gte, limit=100):
                logging.info(f'{stock_ticker}: Got data from API {t}')
                data.append(t)

                # Note: this counts records yielded by the iterator, not HTTP requests
                request_count += 1

                # Log the filing date of the record just fetched
                latest_filing_date = t.filing_date
                logging.info(f"Latest filing date fetched: {latest_filing_date}")

                if license_type == "free" and request_count >= 5:  # Check if the rate limit is reached
                    time.sleep(60)  # Pause for 60 seconds
                    request_count = 0  # Reset the request count

            break  # Exit the while loop if successful
        # Note: PolygonAPIError is not imported in this snippet; it is assumed to be defined elsewhere
        except PolygonAPIError as e:
            if "maximum requests per minute" in str(e):
                time.sleep(60)  # Pause for 60 seconds
            else:
                raise  # Re-raise the exception if it's not a rate-limit error

    # Check if the fetched data is empty (this must run after the loop, not inside it)
    if not data:
        logging.warning(f"No data found for stock ticker {stock_ticker}. It may be an invalid symbol.")
        return None

    return data
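
The retry logic in the function above can also be separated from the fetching itself. The following is a minimal sketch of that idea, not part of the Polygon client: `call_with_rate_limit_retry`, `fetch`, and `is_rate_limit_error` are hypothetical names, and the 60-second pause is taken from the function above. Each attempt rebuilds the result list from scratch, so a retry cannot leave duplicate records behind.

```python
import time


def call_with_rate_limit_retry(fetch, is_rate_limit_error, wait_seconds=60,
                               max_attempts=5, sleep=time.sleep):
    """Call fetch() and retry after a pause when a rate-limit error is raised.

    Hypothetical helper: fetch is any zero-argument callable returning an
    iterable of records; is_rate_limit_error decides whether to retry.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            # Materialize inside the try block so each attempt starts from scratch
            return list(fetch())
        except Exception as exc:
            # Give up on non-rate-limit errors and on the final attempt
            if not is_rate_limit_error(exc) or attempt == max_attempts:
                raise
            sleep(wait_seconds)
```

With the Polygon client, `fetch` might be a lambda wrapping the `list_stock_financials` call, and `is_rate_limit_error` the same string check used above.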

image

image

     cik       company_name    end_date filing_date fiscal_period  \

0 0000920760 LENNAR CORP /NEW/ 2023-08-31 2023-09-29 Q3
1 0000920760 LENNAR CORP /NEW/ 2023-05-31 2023-06-30 Q2
2 0000920760 LENNAR CORP /NEW/ 2023-02-28 2023-04-04 Q1
3 0000920760 LENNAR CORP /NEW/ 2022-11-30 2023-01-26 FY
4 0000920760 LENNAR CORP /NEW/ 2022-08-31 2022-10-04 Q3
5 0000920760 LENNAR CORP /NEW/ 2022-05-31 2022-07-01 Q2
6 0000920760 LENNAR CORP /NEW/ 2022-02-28 2022-04-01 Q1
7 0000920760 LENNAR CORP /NEW/ 2021-11-30 2022-01-28 FY
8 0000920760 LENNAR CORP /NEW/ 2021-08-31 2021-10-01 Q3
9 0000920760 LENNAR CORP /NEW/ 2021-05-31 2021-07-02 Q2
10 0000920760 LENNAR CORP /NEW/ 2021-02-28 2021-04-01 Q1
11 0000920760 LENNAR CORP /NEW/ 2020-11-30 2021-01-22 FY
12 0000920760 LENNAR CORP /NEW/ 2020-08-31 2020-10-01 Q3
13 0000920760 LENNAR CORP /NEW/ 2020-05-31 2020-07-06 Q2
14 0000920760 LENNAR CORP /NEW/ 2020-02-29 2020-04-07 Q1

fiscal_year current_liabilities equity_attributable_to_parent
0 2023 1.164958e+10 2.565662e+10
1 2023 -2.516112e+10 2.501514e+10
2 2023 -2.455529e+10 2.441826e+10
3 2022 1.374393e+10 2.410050e+10
4 2022 1.221234e+10 2.297728e+10
5 2022 -2.178977e+10 2.159826e+10
6 2022 -2.084743e+10 2.067906e+10
7 2021 1.221150e+10 2.081642e+10
8 2021 1.196465e+10 2.065019e+10
9 2021 -1.970210e+10 1.957611e+10
10 2021 -1.901745e+10 1.889625e+10
11 2020 1.183578e+10 1.799486e+10
12 2020 1.203494e+10 1.717210e+10
13 2020 -1.663262e+10 1.654270e+10
14 2020 -1.619338e+10 1.604460e+10

noncurrent_assets  noncurrent_liabilities  ...  \

0 0 0 ...
1 0 0 ...
2 0 0 ...
3 0 0 ...
4 0 0 ...
5 0 0 ...
6 0 0 ...
7 0 0 ...
8 0 0 ...
9 0 0 ...
10 0 0 ...
11 0 0 ...
12 0 0 ...
13 0 0 ...
14 0 0 ...

net_cash_flow_from_financing_activities  comprehensive_income_loss  \

0 -1.109284e+09 1.117160e+09
1 -5.745670e+08 8.783150e+08
2 -1.483463e+09 6.001590e+08
3 -1.277279e+09 4.652250e+09
4 -4.342220e+08 1.473036e+09
5 -1.322300e+08 1.322620e+09
6 -1.257886e+09 5.123420e+08
7 -2.404735e+09 4.456013e+09
8 -4.840860e+08 1.409349e+09
9 -1.881450e+08 8.370880e+08
10 -6.296880e+08 1.015967e+09
11 -2.446575e+09 2.466250e+09
12 -9.174200e+08 6.664180e+08
13 -1.587660e+08 5.174060e+08
14 -7.898040e+08 3.984520e+08

comprehensive_income_loss_attributable_to_parent  \

0 1.109204e+09
1 8.722670e+08
2 5.973850e+08
3 4.617874e+09
4 1.467686e+09
5 1.320818e+09
6 5.066080e+08
7 4.429575e+09
8 1.407019e+09
9 8.316790e+08
10 1.000427e+09
11 2.463733e+09
12 6.665930e+08
13 5.166160e+08
14 3.984060e+08

other_comprehensive_income_loss  basic_earnings_per_share  \

0 208000.0 3.87
1 573000.0 3.01
2 851000.0 2.06
3 3749000.0 15.74
4 342000.0 5.04
5 62000.0 4.50
6 3027000.0 1.70
7 -536000.0 14.28
8 131000.0 4.52
9 316000.0 2.66
10 -942000.0 3.20
11 -1303000.0 7.88
12 175000.0 2.13
13 -790000.0 1.66
14 -46000.0 1.27

operating_expenses      revenues  cost_of_revenue  gross_profit  symbol  

0 7.258891e+09 8.729603e+09 NaN NaN LEN
1 6.852312e+09 8.045151e+09 NaN NaN LEN
2 5.674155e+09 6.490429e+09 NaN NaN LEN
3 NaN NaN NaN NaN LEN
4 NaN NaN NaN NaN LEN
5 NaN NaN NaN NaN LEN
6 5.400582e+09 6.203516e+09 NaN NaN LEN
7 2.205224e+10 2.713068e+10 NaN NaN LEN
8 5.016908e+09 6.941403e+09 NaN NaN LEN
9 5.228150e+09 6.430245e+09 NaN NaN LEN
10 3.875609e+09 5.325468e+09 NaN NaN LEN
11 1.900665e+10 2.248885e+10 1.774076e+10 4.748090e+09 LEN
12 4.035370e+08 5.870254e+09 4.607704e+09 1.262550e+09 LEN
13 3.857330e+08 5.287373e+09 4.225063e+09 1.062310e+09 LEN
14 4.254360e+08 4.505337e+09 3.656349e+09 8.489880e+08 LEN

[15 rows x 29 columns]
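
For reference, the gaps in the output above can be listed programmatically. This is a rough sketch using plain Python records rather than the DataFrame itself; `missing_periods` is a hypothetical helper, and the sample values are the `revenues` column for rows 0 to 6 of the table above.

```python
import math


def missing_periods(records, field):
    """Return (fiscal_year, fiscal_period) for every record whose field is None or NaN."""
    missing = []
    for rec in records:
        value = rec.get(field)
        if value is None or (isinstance(value, float) and math.isnan(value)):
            missing.append((rec["fiscal_year"], rec["fiscal_period"]))
    return missing


nan = float("nan")
# Revenues for rows 0 to 6 of the table above
rows = [
    {"fiscal_year": 2023, "fiscal_period": "Q3", "revenues": 8.729603e+09},
    {"fiscal_year": 2023, "fiscal_period": "Q2", "revenues": 8.045151e+09},
    {"fiscal_year": 2023, "fiscal_period": "Q1", "revenues": 6.490429e+09},
    {"fiscal_year": 2022, "fiscal_period": "FY", "revenues": nan},
    {"fiscal_year": 2022, "fiscal_period": "Q3", "revenues": nan},
    {"fiscal_year": 2022, "fiscal_period": "Q2", "revenues": nan},
    {"fiscal_year": 2022, "fiscal_period": "Q1", "revenues": 6.203516e+09},
]

print(missing_periods(rows, "revenues"))
# [(2022, 'FY'), (2022, 'Q3'), (2022, 'Q2')]
```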

JeremyWhittaker, Oct 13 '23 17:10

Thanks for the heads-up, @JeremyWhittaker, and thanks for the very detailed write-up, as it helps us track things down quickly. After taking a look, this is more of a data issue/gap than a client library issue, so I pinged the backend data team and they will check it out. I'll keep you posted.

justinpolygon, Oct 13 '23 17:10

Appreciate it. I just signed up for your service and I'm trying to find reliable data to analyze. When I run across stuff like this, it makes me start to question all of my output.

JeremyWhittaker, Oct 13 '23 18:10

Same symbol, huge chunks of data missing from this metric as well:

image

image

INFO:root:
fiscal_period fiscal_year end_date filing_date cost_of_revenue
0 Q3 2023 2023-08-31 2023-09-29 NaN
1 Q2 2023 2023-05-31 2023-06-30 NaN
2 Q1 2023 2023-02-28 2023-04-04 NaN
3 FY 2022 2022-11-30 2023-01-26 NaN
4 Q3 2022 2022-08-31 2022-10-04 NaN
5 Q2 2022 2022-05-31 2022-07-01 NaN
6 Q1 2022 2022-02-28 2022-04-01 NaN
7 FY 2021 2021-11-30 2022-01-28 NaN
8 Q3 2021 2021-08-31 2021-10-01 NaN
9 Q2 2021 2021-05-31 2021-07-02 NaN
10 Q1 2021 2021-02-28 2021-04-01 NaN
11 FY 2020 2020-11-30 2021-01-22 1.774076e+10
12 Q3 2020 2020-08-31 2020-10-01 4.607704e+09
13 Q2 2020 2020-05-31 2020-07-06 4.225063e+09
14 Q1 2020 2020-02-29 2020-04-07 3.656349e+09
15 FY 2019 2019-11-30 2020-01-27 1.802340e+10
16 Q3 2019 2019-08-31 2019-10-08 4.771425e+09
17 Q2 2019 2019-05-31 2019-07-03 4.524303e+09
18 Q1 2019 2019-02-28 2019-04-08 3.142003e+09
19 FY 2018 2018-11-30 2019-01-28 1.688282e+10
20 Q3 2018 2018-08-31 2018-10-09 4.614666e+09
21 Q2 2018 2018-05-31 2018-07-06 4.619019e+09
22 Q1 2018 2018-02-28 2018-04-09 2.464163e+09
23 FY 2017 2017-11-30 2018-01-25 1.021241e+10
24 Q3 2017 2017-08-31 2017-10-10 2.611065e+09
25 Q2 2017 2017-05-31 2017-06-30 2.645017e+09
26 Q1 2017 2017-02-28 2017-04-10 1.918263e+09
27 FY 2016 2016-11-30 2017-01-20 8.754335e+09
28 Q3 2016 2016-08-31 2016-10-04 2.282218e+09
29 Q2 2016 2016-05-31 2016-07-01 2.184292e+09
30 Q1 2016 2016-02-29 2016-04-06 1.594718e+09
31 FY 2015 2015-11-30 2016-01-22 NaN
32 Q3 2015 2015-08-31 2015-10-09 NaN
33 Q2 2015 2015-05-31 2015-07-02 NaN
34 Q1 2015 2015-02-28 2015-04-03 NaN
35 FY 2014 2014-11-30 2015-01-23 NaN
36 Q3 2014 2014-08-31 2014-10-03 NaN
37 Q2 2014 2014-05-31 2014-07-03 NaN
38 Q1 2014 2014-02-28 2014-04-09 NaN
39 FY 2013 2013-11-30 2014-01-28 NaN
40 Q3 2013 2013-08-31 2013-10-10 NaN
41 Q2 2013 2013-05-31 2013-07-10 NaN
42 Q1 2013 2013-02-28 2013-04-09 NaN
43 FY 2012 2012-11-30 2013-01-29 NaN
44 Q3 2012 2012-08-31 2012-10-10 NaN
45 Q2 2012 2012-05-31 2012-07-10 NaN
46 Q1 2012 2012-02-29 2012-04-09 NaN
47 FY 2011 2011-11-30 2012-01-30 NaN
48 Q3 2011 2011-08-31 2011-10-11 NaN
49 Q2 2011 2011-05-31 2011-07-11 NaN
50 Q1 2011 2011-02-28 2011-04-11 NaN
51 Q3 2010 2010-08-31 2010-10-08 NaN
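
To make the extent of each gap easier to report, consecutive missing values in the log above can be collapsed into runs. The sketch below is one way to do that with the standard library. `missing_runs` and `is_missing` are hypothetical helpers, and the sample data is `cost_of_revenue` for rows 9 to 32 of the log above, in the order returned.

```python
import math
from itertools import groupby


def is_missing(value):
    """True for None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def missing_runs(rows):
    """Collapse consecutive missing values into (first_label, last_label, count) runs.

    rows is a list of (label, value) pairs in the order they were returned.
    """
    runs = []
    for missing, group in groupby(rows, key=lambda row: is_missing(row[1])):
        if missing:
            labels = [label for label, _ in group]
            runs.append((labels[0], labels[-1], len(labels)))
    return runs


nan = float("nan")
# cost_of_revenue for rows 9 to 32 of the log above
rows = [
    ("Q2 2021", nan), ("Q1 2021", nan), ("FY 2020", 1.774076e+10),
    ("Q3 2020", 4.607704e+09), ("Q2 2020", 4.225063e+09), ("Q1 2020", 3.656349e+09),
    ("FY 2019", 1.802340e+10), ("Q3 2019", 4.771425e+09), ("Q2 2019", 4.524303e+09),
    ("Q1 2019", 3.142003e+09), ("FY 2018", 1.688282e+10), ("Q3 2018", 4.614666e+09),
    ("Q2 2018", 4.619019e+09), ("Q1 2018", 2.464163e+09), ("FY 2017", 1.021241e+10),
    ("Q3 2017", 2.611065e+09), ("Q2 2017", 2.645017e+09), ("Q1 2017", 1.918263e+09),
    ("FY 2016", 8.754335e+09), ("Q3 2016", 2.282218e+09), ("Q2 2016", 2.184292e+09),
    ("Q1 2016", 1.594718e+09), ("FY 2015", nan), ("Q3 2015", nan),
]

print(missing_runs(rows))
# [('Q2 2021', 'Q1 2021', 2), ('FY 2015', 'Q3 2015', 2)]
```

Run over the full 52 rows, the same function would report one run at the recent end of the history and one at the old end, which is a compact way to describe the gap in a bug report.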

JeremyWhittaker, Oct 13 '23 18:10