client-python
Missing Data
I'm pulling financial data for symbol LEN. My data frame is below, along with a chart of revenues. Chunks of data are missing. For small companies this may be normal, since they may not have released updated quarterly financials. But this company did release the data; I verified it on Yahoo Finance (screenshot also below). Are there going to be a lot of companies where your data is incomplete like this? It makes analyzing aggregate data incredibly difficult.
The Python function I used to fetch this data is here:
import logging
import time

from polygon import RESTClient

# NOTE: PolygonAPIError is used below, but its import is not shown in this post.


def fetch_fundamental_data(api_key, stock_ticker, filing_date_gte, license_type="free"):
    """
    Fetch financial statement data for a given stock ticker using Polygon API.

    Parameters:
        api_key (str): Your Polygon API key.
        stock_ticker (str): Stock ticker symbol.
        filing_date_gte (str): Start date for the filing date filter.
        license_type (str): Type of license ("free" or "paid")

    Returns:
        list: A list containing the financial statement data,
        or None if the API returned nothing for the ticker.
    """
    # Create a REST client and authenticate with the API key
    client = RESTClient(api_key)
    request_count = 0
    while True:
        # Start each attempt with an empty list so a retry does not duplicate rows
        data = []
        try:
            # Fetch the financial statement data and add it to the list
            for t in client.vx.list_stock_financials(
                ticker=stock_ticker, filing_date_gte=filing_date_gte, limit=100
            ):
                logging.info(f'{stock_ticker}: Got data from API {t}')
                data.append(t)
            # Check if the fetched data is empty
            if not data:
                logging.warning(f"No data found for stock ticker {stock_ticker}. It may be an invalid symbol.")
                return None
            request_count += 1
            # Log the filing date of the last record yielded by the API
            latest_filing_date = data[-1].filing_date  # Replace this with the actual attribute name for the date
            logging.info(f"Latest filing date fetched: {latest_filing_date}")
            if license_type == "free" and request_count >= 5:  # Check if the rate limit is reached
                time.sleep(60)  # Pause for 60 seconds
                request_count = 0  # Reset the request count
            break  # Exit the while loop if successful
        except PolygonAPIError as e:
            if "maximum requests per minute" in str(e):
                time.sleep(60)  # Pause for 60 seconds, then retry
            else:
                raise  # Re-raise the exception if it's not a rate-limit error
    return data
cik company_name end_date filing_date fiscal_period \
0 0000920760 LENNAR CORP /NEW/ 2023-08-31 2023-09-29 Q3
1 0000920760 LENNAR CORP /NEW/ 2023-05-31 2023-06-30 Q2
2 0000920760 LENNAR CORP /NEW/ 2023-02-28 2023-04-04 Q1
3 0000920760 LENNAR CORP /NEW/ 2022-11-30 2023-01-26 FY
4 0000920760 LENNAR CORP /NEW/ 2022-08-31 2022-10-04 Q3
5 0000920760 LENNAR CORP /NEW/ 2022-05-31 2022-07-01 Q2
6 0000920760 LENNAR CORP /NEW/ 2022-02-28 2022-04-01 Q1
7 0000920760 LENNAR CORP /NEW/ 2021-11-30 2022-01-28 FY
8 0000920760 LENNAR CORP /NEW/ 2021-08-31 2021-10-01 Q3
9 0000920760 LENNAR CORP /NEW/ 2021-05-31 2021-07-02 Q2
10 0000920760 LENNAR CORP /NEW/ 2021-02-28 2021-04-01 Q1
11 0000920760 LENNAR CORP /NEW/ 2020-11-30 2021-01-22 FY
12 0000920760 LENNAR CORP /NEW/ 2020-08-31 2020-10-01 Q3
13 0000920760 LENNAR CORP /NEW/ 2020-05-31 2020-07-06 Q2
14 0000920760 LENNAR CORP /NEW/ 2020-02-29 2020-04-07 Q1
fiscal_year current_liabilities equity_attributable_to_parent
0 2023 1.164958e+10 2.565662e+10
1 2023 -2.516112e+10 2.501514e+10
2 2023 -2.455529e+10 2.441826e+10
3 2022 1.374393e+10 2.410050e+10
4 2022 1.221234e+10 2.297728e+10
5 2022 -2.178977e+10 2.159826e+10
6 2022 -2.084743e+10 2.067906e+10
7 2021 1.221150e+10 2.081642e+10
8 2021 1.196465e+10 2.065019e+10
9 2021 -1.970210e+10 1.957611e+10
10 2021 -1.901745e+10 1.889625e+10
11 2020 1.183578e+10 1.799486e+10
12 2020 1.203494e+10 1.717210e+10
13 2020 -1.663262e+10 1.654270e+10
14 2020 -1.619338e+10 1.604460e+10
noncurrent_assets noncurrent_liabilities ... \
0 0 0 ...
1 0 0 ...
2 0 0 ...
3 0 0 ...
4 0 0 ...
5 0 0 ...
6 0 0 ...
7 0 0 ...
8 0 0 ...
9 0 0 ...
10 0 0 ...
11 0 0 ...
12 0 0 ...
13 0 0 ...
14 0 0 ...
net_cash_flow_from_financing_activities comprehensive_income_loss \
0 -1.109284e+09 1.117160e+09
1 -5.745670e+08 8.783150e+08
2 -1.483463e+09 6.001590e+08
3 -1.277279e+09 4.652250e+09
4 -4.342220e+08 1.473036e+09
5 -1.322300e+08 1.322620e+09
6 -1.257886e+09 5.123420e+08
7 -2.404735e+09 4.456013e+09
8 -4.840860e+08 1.409349e+09
9 -1.881450e+08 8.370880e+08
10 -6.296880e+08 1.015967e+09
11 -2.446575e+09 2.466250e+09
12 -9.174200e+08 6.664180e+08
13 -1.587660e+08 5.174060e+08
14 -7.898040e+08 3.984520e+08
comprehensive_income_loss_attributable_to_parent \
0 1.109204e+09
1 8.722670e+08
2 5.973850e+08
3 4.617874e+09
4 1.467686e+09
5 1.320818e+09
6 5.066080e+08
7 4.429575e+09
8 1.407019e+09
9 8.316790e+08
10 1.000427e+09
11 2.463733e+09
12 6.665930e+08
13 5.166160e+08
14 3.984060e+08
other_comprehensive_income_loss basic_earnings_per_share \
0 208000.0 3.87
1 573000.0 3.01
2 851000.0 2.06
3 3749000.0 15.74
4 342000.0 5.04
5 62000.0 4.50
6 3027000.0 1.70
7 -536000.0 14.28
8 131000.0 4.52
9 316000.0 2.66
10 -942000.0 3.20
11 -1303000.0 7.88
12 175000.0 2.13
13 -790000.0 1.66
14 -46000.0 1.27
operating_expenses revenues cost_of_revenue gross_profit symbol
0 7.258891e+09 8.729603e+09 NaN NaN LEN
1 6.852312e+09 8.045151e+09 NaN NaN LEN
2 5.674155e+09 6.490429e+09 NaN NaN LEN
3 NaN NaN NaN NaN LEN
4 NaN NaN NaN NaN LEN
5 NaN NaN NaN NaN LEN
6 5.400582e+09 6.203516e+09 NaN NaN LEN
7 2.205224e+10 2.713068e+10 NaN NaN LEN
8 5.016908e+09 6.941403e+09 NaN NaN LEN
9 5.228150e+09 6.430245e+09 NaN NaN LEN
10 3.875609e+09 5.325468e+09 NaN NaN LEN
11 1.900665e+10 2.248885e+10 1.774076e+10 4.748090e+09 LEN
12 4.035370e+08 5.870254e+09 4.607704e+09 1.262550e+09 LEN
13 3.857330e+08 5.287373e+09 4.225063e+09 1.062310e+09 LEN
14 4.254360e+08 4.505337e+09 3.656349e+09 8.489880e+08 LEN
[15 rows x 29 columns]
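For anyone triaging gaps like these, a per-column count of missing values is quicker to read than the wrapped frame. The following is a minimal sketch, assuming the results are already in a pandas DataFrame; `summarize_missing` is a hypothetical helper name, and the sample holds only four rows copied from the output above (rows 0, 3, 4 and 11).

```python
import pandas as pd


def summarize_missing(df, key_cols=("fiscal_year", "fiscal_period")):
    """Return one row per metric column that has gaps: missing count and share."""
    # Drop identifier columns so only metric columns are reported
    metrics = df.drop(columns=[c for c in key_cols if c in df.columns])
    summary = pd.DataFrame({
        "missing": metrics.isna().sum(),
        "share": metrics.isna().mean(),
    })
    # Keep only columns that actually have gaps, worst first
    return summary[summary["missing"] > 0].sort_values("missing", ascending=False)


# Four rows copied from the LEN output above (indices 0, 3, 4 and 11)
sample = pd.DataFrame({
    "fiscal_year": [2023, 2022, 2022, 2020],
    "fiscal_period": ["Q3", "FY", "Q3", "FY"],
    "revenues": [8.729603e+09, None, None, 2.248885e+10],
    "cost_of_revenue": [None, None, None, 1.774076e+10],
})
report = summarize_missing(sample)
print(report)
```

Running the same helper over the full 29-column frame would show at a glance which metrics are affected and how badly.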
Thanks for the heads up @JeremyWhittaker, and thanks for the very detailed write-up; it helps us track things down quickly. After taking a look, this is more of a data issue/gap than a client library issue, so I pinged the backend data team and they will check it out. I'll keep you posted.
Appreciate it. I just signed up for your service and I'm trying to find reliable data to analyze. When I run across stuff like this, it makes me start to question all of my output.
Same symbol, huge chunks of data missing from this metric as well:
INFO:root:
    fiscal_period  fiscal_year    end_date  filing_date  cost_of_revenue
0              Q3         2023  2023-08-31   2023-09-29              NaN
1              Q2         2023  2023-05-31   2023-06-30              NaN
2              Q1         2023  2023-02-28   2023-04-04              NaN
3              FY         2022  2022-11-30   2023-01-26              NaN
4              Q3         2022  2022-08-31   2022-10-04              NaN
5              Q2         2022  2022-05-31   2022-07-01              NaN
6              Q1         2022  2022-02-28   2022-04-01              NaN
7              FY         2021  2021-11-30   2022-01-28              NaN
8              Q3         2021  2021-08-31   2021-10-01              NaN
9              Q2         2021  2021-05-31   2021-07-02              NaN
10             Q1         2021  2021-02-28   2021-04-01              NaN
11             FY         2020  2020-11-30   2021-01-22     1.774076e+10
12             Q3         2020  2020-08-31   2020-10-01     4.607704e+09
13             Q2         2020  2020-05-31   2020-07-06     4.225063e+09
14             Q1         2020  2020-02-29   2020-04-07     3.656349e+09
15             FY         2019  2019-11-30   2020-01-27     1.802340e+10
16             Q3         2019  2019-08-31   2019-10-08     4.771425e+09
17             Q2         2019  2019-05-31   2019-07-03     4.524303e+09
18             Q1         2019  2019-02-28   2019-04-08     3.142003e+09
19             FY         2018  2018-11-30   2019-01-28     1.688282e+10
20             Q3         2018  2018-08-31   2018-10-09     4.614666e+09
21             Q2         2018  2018-05-31   2018-07-06     4.619019e+09
22             Q1         2018  2018-02-28   2018-04-09     2.464163e+09
23             FY         2017  2017-11-30   2018-01-25     1.021241e+10
24             Q3         2017  2017-08-31   2017-10-10     2.611065e+09
25             Q2         2017  2017-05-31   2017-06-30     2.645017e+09
26             Q1         2017  2017-02-28   2017-04-10     1.918263e+09
27             FY         2016  2016-11-30   2017-01-20     8.754335e+09
28             Q3         2016  2016-08-31   2016-10-04     2.282218e+09
29             Q2         2016  2016-05-31   2016-07-01     2.184292e+09
30             Q1         2016  2016-02-29   2016-04-06     1.594718e+09
31             FY         2015  2015-11-30   2016-01-22              NaN
32             Q3         2015  2015-08-31   2015-10-09              NaN
33             Q2         2015  2015-05-31   2015-07-02              NaN
34             Q1         2015  2015-02-28   2015-04-03              NaN
35             FY         2014  2014-11-30   2015-01-23              NaN
36             Q3         2014  2014-08-31   2014-10-03              NaN
37             Q2         2014  2014-05-31   2014-07-03              NaN
38             Q1         2014  2014-02-28   2014-04-09              NaN
39             FY         2013  2013-11-30   2014-01-28              NaN
40             Q3         2013  2013-08-31   2013-10-10              NaN
41             Q2         2013  2013-05-31   2013-07-10              NaN
42             Q1         2013  2013-02-28   2013-04-09              NaN
43             FY         2012  2012-11-30   2013-01-29              NaN
44             Q3         2012  2012-08-31   2012-10-10              NaN
45             Q2         2012  2012-05-31   2012-07-10              NaN
46             Q1         2012  2012-02-29   2012-04-09              NaN
47             FY         2011  2011-11-30   2012-01-30              NaN
48             Q3         2011  2011-08-31   2011-10-11              NaN
49             Q2         2011  2011-05-31   2011-07-11              NaN
50             Q1         2011  2011-02-28   2011-04-11              NaN
51             Q3         2010  2010-08-31   2010-10-08              NaN
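To report gaps like these as date ranges rather than row by row, contiguous runs of NaN can be grouped. This is a rough sketch, assuming a pandas DataFrame with ISO-formatted `end_date` strings; `missing_runs` is a hypothetical helper, and the sample is an abbreviated six-row excerpt of the log above, so the period counts reflect the excerpt only.

```python
import pandas as pd


def missing_runs(df, column, date_col="end_date"):
    """Return (first_date, last_date, n_rows) for each contiguous run of NaN in `column`."""
    # ISO date strings sort chronologically, so a plain sort is enough here
    ordered = df.sort_values(date_col).reset_index(drop=True)
    is_missing = ordered[column].isna()
    # A new run id starts whenever the missing/present flag flips
    run_id = (is_missing != is_missing.shift()).cumsum()
    runs = []
    for _, block in ordered[is_missing].groupby(run_id[is_missing]):
        runs.append((block[date_col].iloc[0], block[date_col].iloc[-1], len(block)))
    return runs


# Abbreviated excerpt of the LEN cost_of_revenue log above
sample = pd.DataFrame({
    "end_date": ["2015-11-30", "2016-02-29", "2016-05-31",
                 "2020-11-30", "2021-02-28", "2021-05-31"],
    "cost_of_revenue": [None, 1.594718e+09, 2.184292e+09,
                        1.774076e+10, None, None],
})
runs = missing_runs(sample, "cost_of_revenue")
print(runs)
```

Applied to the full log, this would produce one tuple for each stretch of periods where the metric is absent, which is easier to attach to a bug report than the raw frame.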