byaldi

index() corruption

Open declanraj opened this issue 1 year ago • 0 comments

Hi, I have been trying to index a set of 80 PDF documents (~150 pages each) by submitting batch jobs. Because the indexing took longer than expected (8 hours), my session ended abruptly, and I now get a "ValueError: Expected object or value" when I try to load the index with from_index().

I don't see any way to discard the partially indexed document and resume from the last valid state of the index, which means I would have to start over from the top for another 8+ hours. Would it be possible to add some functionality to handle this situation?
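To illustrate the kind of resume behaviour being asked for, here is a minimal sketch in plain Python. It is not byaldi's API: `index_one`, `index_resumably`, `atomic_write_json`, and the manifest file are hypothetical names introduced only for this example. It assumes each document can be indexed on its own and that re-running the one interrupted document is harmless. The manifest is written to a temporary file and then renamed, so a killed job should leave either the old manifest or the new one, not a truncated file.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Callable, Iterable, List


def atomic_write_json(path: Path, payload) -> None:
    """Write JSON to a temp file in the same directory, then rename it over `path`."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(payload, fh)
            fh.flush()
            os.fsync(fh.fileno())
        # os.replace swaps the file in a single step on the same filesystem.
        os.replace(tmp_name, path)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def load_manifest(path: Path) -> List[str]:
    """Return the list of documents already indexed, or [] if none is recorded."""
    if not path.exists():
        return []
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except ValueError:
        # Unreadable manifest: treat as empty and start over.
        return []


def index_resumably(
    docs: Iterable[str],
    index_one: Callable[[str], None],  # hypothetical per-document indexing step
    manifest_path: Path,
) -> List[str]:
    """Index documents one at a time, skipping those recorded as finished."""
    done = load_manifest(manifest_path)
    for doc in docs:
        if doc in done:
            continue
        index_one(doc)
        done.append(doc)
        # Record progress only after the document has been fully processed.
        atomic_write_json(manifest_path, done)
    return done
```

With something like this, a job that dies partway through would only need to redo the document it was working on, rather than the whole 8-hour run.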

declanraj, Nov 20 '24 03:11