IndexError: list index out of range

Open OliverOffing opened this issue 1 year ago • 4 comments

I'm trying to add 250 documents but I'm hitting this error. It seems to show up when trying to add more than 100 documents. Is that a hard limit I'm hitting, or could there be something else, like perhaps one of the files being faulty?

Traceback (most recent call last):
  File "/app/main.py", line 24, in <module>
    docs.add(d)
  File "/usr/local/lib/python3.10/site-packages/paperqa/docs.py", line 111, in add
    citation = self.cite_chain.run(texts[0])
IndexError: list index out of range

OliverOffing avatar Apr 29 '23 00:04 OliverOffing

Actually, the problem was due to a document being too small.

Contents of the file that caused the problem:

# FAQs

Source:

https://github.com/whitead/paper-qa/blob/b94a06e0c668fbf8141cf656f4be846b24fff534/paperqa/docs.py#L109

We should do one of these:

  • Improve the error message so that when we can't process a file, the specific file name is shown to the user, or
  • outright ignore the file and issue a warning message saying that the file is being ignored
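
As a rough illustration of the two options, here is a minimal sketch of a guard around the `texts[0]` access from the traceback. The helper name `first_chunk` and its `path` and `strict` parameters are hypothetical and not part of paper-qa; the sketch only assumes that `texts` is the list of chunks that ends up empty for a very short file.

```python
import warnings
from typing import List, Optional


def first_chunk(texts: List[str], path: str, strict: bool = True) -> Optional[str]:
    """Return the first text chunk, or report which file produced none.

    Hypothetical helper, not paper-qa's API. `texts` stands in for the
    list of chunks extracted from the file at `path`.
    """
    if texts:
        return texts[0]
    message = f"Could not extract any text from {path!r}; the file may be empty or too short."
    if strict:
        # Option 1: fail, but name the offending file in the error.
        raise ValueError(message)
    # Option 2: ignore the file and warn that it is being skipped.
    warnings.warn(message)
    return None
```

With `strict=True` the caller gets an error that names the file instead of a bare `IndexError`; with `strict=False` the file is skipped with a warning and the caller has to handle the `None` result.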

OliverOffing avatar Apr 29 '23 13:04 OliverOffing

Actually, I have the same issue, except my document is definitely not too small. I believe it's too large.

Is there an intuitive way to handle this error?

amittos avatar May 05 '23 12:05 amittos

I'm doing this until we find a fix:

    for d in my_docs:
        try:
            docs.add(d)
        except Exception as e:
            print('Error adding %s: %s' % (d, e))

OliverOffing avatar May 05 '23 13:05 OliverOffing

@OliverOffing's solution is the preferred one. You add the documents, get an exception if one fails, and decide how to deal with the exception. You can skip it and continue, or try to figure out what is wrong yourself. The best we have are checks to see whether the input looks like a document.
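
For anyone who wants to inspect the failures afterwards rather than just print them, a minimal sketch of that pattern follows. The function `add_documents` is a hypothetical helper, and `add` is assumed to be any callable that raises on a bad file, such as `docs.add` from the snippet above.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def add_documents(
    paths: Iterable[str], add: Callable[[str], object]
) -> Tuple[List[str], Dict[str, Exception]]:
    """Try to add every path; return the successes and the failures.

    Hypothetical helper: `add` is whatever callable adds one document
    and raises an exception when it cannot.
    """
    added: List[str] = []
    failed: Dict[str, Exception] = {}
    for path in paths:
        try:
            add(path)
        except Exception as exc:
            # Keep the exception so the caller can decide what to do with it.
            failed[path] = exc
        else:
            added.append(path)
    return added, failed
```

Calling `added, failed = add_documents(my_docs, docs.add)` would then leave `failed` mapping each problem file to its exception, so you can skip those files or look into them one by one.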

The specific error above, the index error for very short documents, has been fixed.

whitead avatar Jun 14 '23 06:06 whitead

This looks to be resolved.

We have also just released version 5, which basically rewrites the whole repo, so the issue has likely been eliminated, or will look totally different if not. I am going to close this out. If your issue persists, please open a new issue using paper-qa>=5.

jamesbraza avatar Sep 11 '24 18:09 jamesbraza