Suggestion to improve or fine-tune the model with custom documents

Open SelvakumarTS opened this issue 2 years ago • 1 comment

I have close to 3 PDF documents. Both LLAMA and MISTRAL work reasonably well. Are there any levers available in the code to improve the similarity search, use better embeddings, or fine-tune the model to improve the results? Does removing the pictures in the documents improve the accuracy?

Jan 13 '24 14:01 SelvakumarTS

You can try upgrading to a better sentence transformer for the embeddings. You can also increase the chunk size for documents and increase the number of source documents retrieved for each query.
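
To make the chunk size and source count levers concrete, here is a minimal, standalone sketch. It is not this project's code: `chunk_text`, `embed`, `cosine`, and `retrieve` are hypothetical names made up for illustration, and `embed` is a toy bag-of-words stand-in so the example runs without any downloads. In practice you would replace it with the sentence transformer model you choose.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=200, overlap=20):
    """Split text into word-based chunks of up to chunk_size words,
    with `overlap` words shared between neighbouring chunks."""
    words = text.split()
    step = max(1, chunk_size - overlap)
    chunks = []
    i = 0
    while i < len(words):
        chunks.append(" ".join(words[i:i + chunk_size]))
        if i + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
        i += step
    return chunks


def embed(text):
    """Toy embedding (assumption): a bag-of-words term count.
    A real setup would call a sentence transformer model here."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=4, embed_fn=embed):
    """Return the k chunks most similar to the query, best first."""
    q = embed_fn(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed_fn(c)), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    document = "the cat sat on the mat . dogs bark loudly . cat food and cat toys"
    pieces = chunk_text(document, chunk_size=6, overlap=1)
    for hit in retrieve("cat", pieces, k=2):
        print(hit)
```

A larger `chunk_size` gives the model more context per retrieved chunk, and a larger `k` recalls more source chunks per query. Both also make the prompt longer, so it is worth changing them one at a time and checking the answers on your own PDFs.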

Jan 16 '24 15:01 LeafmanZ