paper-qa
IndexError: list index out of range
I'm trying to add 250 documents but I'm hitting this error. It seems to show up when adding more than 100 documents. Is that a hard limit, or could it be something else, such as one of the files being faulty?
Traceback (most recent call last):
  File "/app/main.py", line 24, in <module>
    docs.add(d)
  File "/usr/local/lib/python3.10/site-packages/paperqa/docs.py", line 111, in add
    citation = self.cite_chain.run(texts[0])
IndexError: list index out of range
Actually, the problem was due to a document being too small.
Contents of the file that caused the problem:
# FAQs
Source:
https://github.com/whitead/paper-qa/blob/b94a06e0c668fbf8141cf656f4be846b24fff534/paperqa/docs.py#L109
We should do either one of these:
- Improve the error message so that when a file can't be processed, the specific file name is shown to the user, or
- ignore the file outright and issue a warning saying that the file is being skipped.
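Until something like that lands in the library, both options can be approximated in user code with a thin wrapper around `docs.add`. This is only a sketch, not a change to paper-qa itself: `add_with_context` and its `skip_on_error` parameter are hypothetical names, and the only assumption about `docs` is that it has an `add(path)` method that raises on failure, as in the traceback above.

```python
import warnings


def add_with_context(docs, path, skip_on_error=False):
    """Call docs.add(path), attaching the file name to any failure.

    Hypothetical helper: `docs` is any object with an `add(path)` method.
    Returns True if the file was added, False if it was skipped.
    """
    try:
        docs.add(path)
    except Exception as exc:
        message = f"Could not process {path!r}: {type(exc).__name__}: {exc}"
        if skip_on_error:
            # Second option: ignore the file, but tell the user about it.
            warnings.warn(message)
            return False
        # First option: re-raise with the file name in the message,
        # keeping the original exception as the cause.
        raise RuntimeError(message) from exc
    return True
```

With `skip_on_error=True` a bad file produces a warning naming the file and the loop can continue; with the default, the error still stops the run but now says which file caused it.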
Actually, I have the same issue, only my document is definitely not too small. I believe it's too large.
Is there an intuitive way to handle this error?
I'm doing this until we find a fix:
for d in my_docs:
    try:
        docs.add(d)
    except Exception as e:
        print('Error adding %s: %s' % (d, e))
@OliverOffing's solution is the preferred solution. You add them, get an exception if it fails, and you decide how to deal with the exception. You can skip and continue or try to figure out what is wrong yourself. The best we have is checks to see if it looks like a document.
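If you would rather decide what to do after the whole batch has run, one variation is to record the failures instead of printing them. This is a rough sketch under the same assumption that `docs.add(path)` raises on failure; `add_all` is a hypothetical helper name, not part of paper-qa.

```python
def add_all(docs, paths):
    """Try to add every path and report what happened.

    Hypothetical helper: returns (added, failed), where `added` is the list
    of paths that went through and `failed` maps each path that raised to
    the exception it raised.
    """
    added = []
    failed = {}
    for path in paths:
        try:
            docs.add(path)
        except Exception as exc:
            # Keep the exception so the caller can inspect it later.
            failed[path] = exc
        else:
            added.append(path)
    return added, failed
```

After the call, `failed` tells you exactly which files to skip or look at more closely, for example by printing each path next to its exception.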
The specific error above, the IndexError for very short documents, has been fixed.
This looks to be resolved.
We have also just released version 5, which basically rewrites the whole repo, so the issue has likely been eliminated, or will look totally different if not. I am going to close this out. If your issue persists, please open a new issue using paper-qa>=5.