private-gpt add exception handling for document load errors

Open pulinagrawal opened this issue 2 years ago • 1 comment

I encountered a PDFSyntaxError. When loading multiple documents, an error in one file should not crash the whole load; instead, the loader should continue and notify the user of the errors.

Jun 05 '23 15:06 pulinagrawal

This works as described for me. A few epub files fail with ERROR: Pandoc died with exitcode "64" during conversion: xmlns not in namespaces, and a few PDFs fail with Unexpected EOF, but the remaining documents still load:

Loading new documents:  71%|████████████▊     | 197/276 [00:04<00:01, 62.00it/s] - source_documents/TuffShit/Computer_Books/30 Assorted Computers and Technology Books Collection April 17, 2021/Springer Handbook of Power Systems by Konstantin O. Papailiou.pdf: ERROR: Unexpected EOF
Loading new documents:  79%|██████████████▎   | 219/276 [00:37<00:09,  5.79it/s]
Loaded 219 new documents from source_documents
Split into 15726 chunks of text (max. 500 tokens each)
Creating embeddings. May take some minutes...
Ingestion complete! You can now run privateGPT.py to query your documents
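
For readers who want to see the pattern being discussed, here is a minimal sketch of per-document error handling: each file is loaded inside its own try/except, failures are collected and reported, and the rest of the batch continues. This is not privateGPT's actual code. The names load_all and fake_loader are hypothetical, and fake_loader is only a stand-in for a real parser that may raise on a corrupt file.

```python
from typing import Callable, Iterable, List, Tuple


def load_all(
    paths: Iterable[str],
    load_one: Callable[[str], str],
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Load every path, collecting failures instead of aborting the batch."""
    loaded: List[str] = []
    failed: List[Tuple[str, str]] = []
    for path in paths:
        try:
            loaded.append(load_one(path))
        except Exception as exc:
            # Deliberately broad: any loader error is recorded, not fatal.
            failed.append((path, f"{type(exc).__name__}: {exc}"))
    return loaded, failed


def fake_loader(path: str) -> str:
    # Hypothetical stand-in for a real document parser.
    if path.endswith(".bad"):
        raise ValueError("Unexpected EOF")
    return f"contents of {path}"


if __name__ == "__main__":
    docs, errors = load_all(["a.pdf", "b.bad", "c.epub"], fake_loader)
    for path, message in errors:
        print(f" - {path}: ERROR: {message}")
    print(f"Loaded {len(docs)} new documents")
```

Catching the broad Exception class is a design choice for this sketch: different parsers raise different error types, and the goal is that no single bad file stops ingestion. A real implementation might narrow this or log a traceback.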

Jun 13 '23 11:06 MsJamie
