py-sec-edgar

Some questions about the program

Open robinzixuan opened this issue 2 years ago • 6 comments

  1. Can it handle the 13F data?
  2. If I only have the CIK number, can I search by CIK rather than by ticker?

robinzixuan avatar Apr 03 '22 21:04 robinzixuan

Good questions! lol I need to look.

  1. It should be able to get the 13F forms, and I think I was able to get the data out at some point, but I'd have to look.

  2. I need to check filtering by CIK...

ryansmccoy avatar Apr 05 '22 16:04 ryansmccoy

Thanks. I think that to get the 13F forms we might need to use the CIK, so it may depend heavily on CIK filtering.

robinzixuan avatar Apr 05 '22 18:04 robinzixuan

Gotcha, that makes sense because a lot of the funds don't have tickers...

From what I see there isn't a way to filter by CIK, but it wouldn't be that tough to add... Do you want to give it a shot?

You could follow the same pattern I used for tickers and make one for CIK...

https://github.com/ryansmccoy/py-sec-edgar/blob/127166b8a27dbd80f52fb8b73a19a9aa942bbb62/py_sec_edgar/main.py#L35

you could add something like:


import click
import pandas as pd

# CONFIG is the project's settings object; CIK_LIST_FILEPATH would be a new setting.
@click.command()
@click.option('--ticker-list', default=CONFIG.TICKER_LIST_FILEPATH)
@click.option('--cik-list', default=CONFIG.CIK_LIST_FILEPATH)
@click.option('--form-list', default=True)
def main(ticker_list, form_list, cik_list):
    ...

    if cik_list:
        # Read the CIKs to keep from the first column of the headerless CSV
        cik_filter = pd.read_csv(cik_list, header=None).iloc[:, 0].tolist()
        # Keep only the rows whose CIK is in that list
        df_cik_tickers = df_cik_tickers[df_cik_tickers['CIK'].isin(cik_filter)]

ryansmccoy avatar Apr 05 '22 18:04 ryansmccoy
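The CIK filter suggested above can be tried in isolation before wiring it into `main`. The following is only a minimal sketch: `filter_by_cik` is a hypothetical helper, not part of py-sec-edgar, and the table and CIK file are toy stand-ins. It assumes the CIK file is a headerless CSV with one CIK per row, and it normalizes both sides to integers so that zero-padded CIKs still match.

```python
import io

import pandas as pd


def filter_by_cik(df_cik_tickers, cik_csv):
    """Keep rows whose CIK appears in the first column of a headerless CSV.

    Hypothetical helper for illustration; not part of py-sec-edgar.
    """
    wanted = pd.read_csv(cik_csv, header=None, dtype=str).iloc[:, 0]
    # Normalize to integers so "0000861439" and 861439 compare equal.
    wanted = pd.to_numeric(wanted, errors="coerce").dropna().astype(int)
    ciks = pd.to_numeric(df_cik_tickers["CIK"], errors="coerce")
    return df_cik_tickers[ciks.isin(wanted)]


# Toy table standing in for the project's CIK/ticker data (illustrative values).
df = pd.DataFrame(
    {"CIK": [861439, 320193, 789019], "TICKER": [None, "AAPL", "MSFT"]}
)
# In-memory stand-in for the CIK list file, one CIK per line.
cik_file = io.StringIO("0000861439\n789019\n")

filtered = filter_by_cik(df, cik_file)
print(filtered)
```

Filtering on the CIK column rather than the ticker column keeps filers that have no ticker, which matters for many 13F filers.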

Thanks, I fixed it. One more problem: I found that the SEC form structure changed after 2011. Does that mean forms from before 2012 cannot be extracted?

robinzixuan avatar Apr 05 '22 21:04 robinzixuan

If you submit a pull request, I'll add the code to the project and you can be a contributor (if you want).

Regarding the 2011 version, can you share an example so I can see what you mean?

ryansmccoy avatar Apr 06 '22 00:04 ryansmccoy

`2022-04-06 04:48:00,691 INFO py_sec_edgar.extract: extracting documents to /sec_gov/Archives/edgar/data/861439/000091205794003991
/root/py-sec-edgar/py_sec_edgar/parse/header.py:50: FutureWarning: The pandas.np module is deprecated and will be removed from pandas in a future version. Import numpy directly instead.
  header_dict = header_dict.replace('', pd.np.nan)
Traceback (most recent call last):
  File "/root/anaconda3/envs/py-sec-edgar/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3621, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 136, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 163, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 1

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/root/anaconda3/envs/py-sec-edgar/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/root/anaconda3/envs/py-sec-edgar/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/root/py-sec-edgar/py_sec_edgar/main.py", line 87, in <module>
    main()
  File "/root/anaconda3/envs/py-sec-edgar/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/root/anaconda3/envs/py-sec-edgar/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/root/anaconda3/envs/py-sec-edgar/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/root/anaconda3/envs/py-sec-edgar/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/root/py-sec-edgar/py_sec_edgar/main.py", line 81, in main
    filing_broker.process(sec_filing)
  File "/root/py-sec-edgar/py_sec_edgar/process.py", line 51, in process
    filing_content = self.extract(filing_filepaths)
  File "/root/py-sec-edgar/py_sec_edgar/extract.py", line 28, in extract
    filing_contents = extract_complete_submission_filing(filing_json['filing_filepath'], output_directory=filing_json['extracted_filing_directory'])
  File "/root/py-sec-edgar/py_sec_edgar/extract.py", line 74, in extract_complete_submission_filing
    filing_header = header_parser(raw_text)
  File "/root/py-sec-edgar/py_sec_edgar/parse/header.py", line 52, in header_parser
    header_dict[1] = header_dict[1].ffill().bfill().tolist()
  File "/root/anaconda3/envs/py-sec-edgar/lib/python3.8/site-packages/pandas/core/frame.py", line 3505, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/root/anaconda3/envs/py-sec-edgar/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3623, in get_loc
    raise KeyError(key) from err
KeyError: 1`

This happens when I run it with CIK 861439, which belongs to AMERICAN MEDICAL HOLDINGS INC.

robinzixuan avatar Apr 06 '22 04:04 robinzixuan
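The traceback ends at `header_dict[1]` in `header_parser`, and the `KeyError: 1` says the parsed header table has no column labeled `1`. Why that column is missing for this filing is not established in the thread. Below is a rough sketch of a defensive guard, assuming the header is held in a pandas DataFrame with integer column labels; `fill_header_values` is a hypothetical name, not the project's function. It also uses `np.nan` in place of `pd.np.nan`, as the FutureWarning recommends.

```python
import numpy as np
import pandas as pd


def fill_header_values(header_df):
    """Forward/back-fill column 1 of a parsed header table if it exists.

    Hypothetical helper for illustration; not the project's header_parser.
    """
    # np.nan replaces the deprecated pd.np.nan flagged by the FutureWarning.
    header_df = header_df.replace("", np.nan)
    if 1 not in header_df.columns:
        # Assumption: some filings yield a single-column table.
        # Return it unchanged instead of raising KeyError: 1.
        return header_df
    header_df[1] = header_df[1].ffill().bfill().tolist()
    return header_df


# Two-column table: the empty value in column 1 is filled from its neighbor.
two_cols = pd.DataFrame([["COMPANY", "ACME"], ["CIK", ""]])
# Single-column table: the shape that would trigger KeyError: 1 without the guard.
one_col = pd.DataFrame([["<SEC-HEADER>"], ["raw text"]])

filled = fill_header_values(two_cols)
skipped = fill_header_values(one_col)
print(filled)
print(skipped)
```

A guard like this only stops the crash. Whether the older header layout can actually be parsed into key/value pairs would still need to be checked against the filing itself.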