byaldi
Save only necessary embedding files
When indexing a large document corpus, embedding gets slower and slower. The reason is that at each iteration all of the embedding vectors are saved, instead of only the newly created ones. The changes in this PR save only the necessary embedding files (new files and files to be modified) instead of all the files.
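As a rough illustration of the idea (not the code in this PR), here is a minimal sketch that assumes embeddings are stored in fixed-size chunk files. The helper names `dirty_chunk_ids` and `save_dirty_chunks`, the `embed_<id>.json` file naming, and the JSON format are all hypothetical, chosen only to keep the example self-contained.

```python
import json
from pathlib import Path


def dirty_chunk_ids(old_count, new_count, chunk_size):
    """Return ids of chunk files that are new or modified after appending.

    Hypothetical helper: assumes vectors are appended in order and that
    chunk i holds vectors [i * chunk_size, (i + 1) * chunk_size).
    """
    if new_count <= old_count:
        return []
    first = old_count // chunk_size       # first chunk that receives a new vector
    last = (new_count - 1) // chunk_size  # chunk holding the newest vector
    return list(range(first, last + 1))


def save_dirty_chunks(embeddings, old_count, chunk_size, out_dir):
    """Write only the chunk files touched since the previous save."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for chunk_id in dirty_chunk_ids(old_count, len(embeddings), chunk_size):
        start = chunk_id * chunk_size
        path = out_dir / f"embed_{chunk_id}.json"  # assumed naming scheme
        path.write_text(json.dumps(embeddings[start:start + chunk_size]))
        written.append(path.name)
    return written
```

With this layout, the number of files written per iteration depends on how many vectors were added, not on the total size of the index, which is what keeps the save step from slowing down as the corpus grows.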