IndexError: list index out of range running sample code "To get XBRL data"

Open compusaurusrex opened this issue 4 years ago • 7 comments

Hello Joey!

I want to say thanks so much for working on this project, it is exactly what I hoped to find.

I have been working with the package and sample code some and am getting an index error running your sample code from the section "To get XBRL data".

Specifically, "results" is a list with no elements after running the following line:

results = company.get_data_files_from_10K("EX-101.INS", isxml=True)

The error occurs during execution of the next line:

xbrl = XBRL(results[0])

I originally tried this in a Jupyter notebook, but also tried in an interactive interpreter session to make sure it wasn't just related to the environment.

I would appreciate it if you could provide some assistance and let me know if this problem is a known issue or something I am doing wrong, please?

I specifically would like to collect "facts" from the 10-K reports like number of common shares outstanding, for instance.

Thanks, sincerely

P.S. - I followed the BuyMeACoffee.com link with the intent of providing support, and found you don't have a support button on your page...

compusaurusrex avatar Jan 02 '21 22:01 compusaurusrex

There's no EX-101.INS in the latest Oracle filing https://www.sec.gov/Archives/edgar/data/1341439/000156459020030125/0001564590-20-030125-index.htm
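
An empty result like this can be checked before indexing. Below is a minimal sketch, assuming `results` is the list returned by `company.get_data_files_from_10K(...)` as in the first comment. The helper name `first_document` is hypothetical and not part of py-edgar.

```python
def first_document(results, doc_type):
    """Return the first matching document, or fail with a descriptive error.

    Hypothetical helper: `results` stands in for the list returned by
    company.get_data_files_from_10K(doc_type, isxml=True).
    """
    if not results:
        # The filing simply does not contain this exhibit type.
        raise LookupError(f"no {doc_type} document found in this filing")
    return results[0]


results = []  # an empty list, as reported in this issue
try:
    first_document(results, "EX-101.INS")
except LookupError as exc:
    print(exc)
```

With a guard like this, a filing without the requested exhibit raises a descriptive LookupError instead of a bare IndexError.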

Also, thanks for trying to donate, and letting me know. I fixed the link now. If it still doesn't work, let me know too :)

joeyism avatar Jan 27 '21 21:01 joeyism

@joeyism Many of the examples still give me this error:

Traceback (most recent call last):
  File "/home/abhinav/Code/BAM_scripts/test_edgar.py", line 2, in <module>
    edgar = Edgar()
  File "/home/abhinav/.local/lib/python3.9/site-packages/edgar/edgar.py", line 18, in __init__
    _name, _cik = Edgar.split_raw_string_to_cik_name(item)
  File "/home/abhinav/.local/lib/python3.9/site-packages/edgar/edgar.py", line 47, in split_raw_string_to_cik_name
    return ":".join(item_arr[:-1]), item_arr[-1]
IndexError: list index out of range
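
The last frame fails only when `item_arr` is empty. Below is a minimal sketch that reproduces this in isolation. It assumes, without confirmation from the library source, that each index line looks like `NAME:CIK:` and that the parser drops the empty field after the trailing colon. Only the `return` line is taken from the traceback; the sample lines are illustrative.

```python
def split_raw_string_to_cik_name(item):
    # Assumption: split on ':' and drop the empty field after the trailing colon.
    item_arr = item.split(":")[:-1]
    # This line is copied from the traceback above.
    return ":".join(item_arr[:-1]), item_arr[-1]


# A well-formed line parses into (name, cik).
print(split_raw_string_to_cik_name("ORACLE CORP:0001341439:"))

# A line with no ':' (for example, a line of an HTML error page) leaves
# item_arr empty, so item_arr[-1] raises IndexError.
try:
    split_raw_string_to_cik_name("<html>")
except IndexError as exc:
    print("IndexError:", exc)
```

Under that assumption, the error would point to the downloaded data not being in the expected format rather than to a particular company.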

dumrich avatar Aug 09 '21 12:08 dumrich

@joeyism I have also run into this error intermittently. Any pointers on how to debug this?

zkoppert avatar Aug 18 '21 00:08 zkoppert

@dumrich @zkoppert Can you give me the code you're using to run this? If it happens sometimes, it may be related to specific companies so I'll need as much information as possible to reproduce it

joeyism avatar Aug 26 '21 17:08 joeyism

@joeyism Hi Joey, I also hit this error when I tried to use the edgar module. It is not a company-specific issue: as the traceback from @dumrich shows, the error occurs at line 2, edgar = Edgar()

Luckily, I was able to pinpoint the cause thanks to one of my other personal projects. Due to some recent changes at sec.gov, when you scrape the http://www.sec.gov domain with Python, the website usually returns an error saying that you are using an unregistered automated tool.

This can be easily solved by adding headers to the requests.get calls in the code.

I manually added the following to all of your edgar module files, and my issue was resolved.

Step 1: Add this at the beginning of the module files (company.py, edgar.py, etc.): hdr = {'user-agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0'}

Step 2: In the module files, search for requests.get, and in every requests.get call, add headers=hdr after the URL.

My issue was resolved after this.
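
The two steps can be sketched with a single shared helper instead of editing each call by hand. This sketch uses the standard library's `urllib.request` in place of `requests` so that it runs without third-party packages. The helper name `build_request` is hypothetical, the header value is the one quoted above, and the URL is the Oracle filing index linked earlier in this thread. Whether sec.gov accepts this header is not verified here.

```python
import urllib.request

# Header value quoted from the comment above (an assumption, not a verified fix).
HDR = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:89.0) "
                  "Gecko/20100101 Firefox/89.0"
}


def build_request(url):
    """Hypothetical helper: attach the shared headers to an outgoing request."""
    return urllib.request.Request(url, headers=HDR)


req = build_request(
    "https://www.sec.gov/Archives/edgar/data/1341439/"
    "000156459020030125/0001564590-20-030125-index.htm"
)

# Nothing is sent yet; urllib.request.urlopen(req) would perform the request.
# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))
```

Keeping the header in one place means a later change to the User-Agent value only has to be made once.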

fireball147 avatar Oct 23 '21 20:10 fireball147

related to #29

@fireball147

And no, those are not the correct headers to use.

eabase avatar Feb 07 '22 14:02 eabase

You might be correct. I'm not good with programming; I don't have any computer science or coding background, just a newbie here.

I tried these headers based on some other website, and they worked for me, so I mentioned them here. I believe the SEC website has information about how the header should look when using an automated tool, but I didn't go into it in detail as this solution was working fine for me.
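
For reference, SEC's guidance on automated access to EDGAR asks clients to declare a User-Agent that identifies the requester, in a form like "Sample Company Name AdminContact@<sample company domain>.com". Below is a minimal sketch of building such a header. The function name and the example name and email are hypothetical placeholders, and the exact requirements should be checked against SEC's current guidance.

```python
def declared_user_agent(name, email):
    """Build a User-Agent string of the form '<name> <contact email>'.

    Hypothetical helper; the format follows the sample in SEC's guidance.
    """
    if "@" not in email:
        raise ValueError("email must be a contact address, e.g. admin@example.com")
    return f"{name} {email}"


# Placeholder identity: replace with your own name or organization and email.
headers = {
    "User-Agent": declared_user_agent("Example Research", "admin@example.com"),
    "Accept-Encoding": "gzip, deflate",
}
print(headers["User-Agent"])
```

A dictionary like this can be passed as headers= to requests.get in the same way as hdr in the steps above.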

fireball147 avatar Feb 07 '22 16:02 fireball147