Very slow for moderate number of embeddings

Open Nintorac opened this issue 8 months ago • 2 comments

Here is a visual of how ingest time scales with the number of embeddings. If I plot both axes on a log scale, it looks approximately linear.

I also noticed that there only seems to be a single thread running for the entire duration of the ingest.

I am using embeddings with dimension 2560.

image

I am using Python and have installed sqlite-vss via pip, if that makes a difference.

Nintorac avatar Oct 24 '23 03:10 Nintorac

Do you happen to have the code you used to ingest embeddings into sqlite-vss? It shouldn't take 30 mins to insert 30k vectors. I suspect there are a number of fixes that could make it much faster, including:

  • Insert all vectors in one transaction (surround with BEGIN and COMMIT)
  • Avoid execute() and prefer executemany() if in Python
  • Insert vectors all in one go (depends on the source of your vectors)
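
A minimal sketch of the first two suggestions, using only Python's standard `sqlite3` module. The table `demo_vectors`, the column `embedding`, and the helpers `serialize` and `bulk_insert` are hypothetical names for illustration, and a plain table stands in for the sqlite-vss virtual table so the sketch runs without the extension loaded. Packing vectors as raw float32 bytes is also an assumption here; check which input formats your sqlite-vss version accepts.

```python
import sqlite3
import struct


def serialize(vector):
    """Pack a list of floats into raw little-endian float32 bytes (assumed format)."""
    return struct.pack(f"<{len(vector)}f", *vector)


def bulk_insert(conn, table, column, vectors):
    """Insert every vector with one executemany() call inside one transaction."""
    # Generator of (rowid, blob) pairs; rowids start at 1.
    rows = ((rowid, serialize(vec)) for rowid, vec in enumerate(vectors, start=1))
    # `with conn:` commits once on success and rolls back on an exception,
    # so the whole batch is a single BEGIN ... COMMIT.
    with conn:
        conn.executemany(
            f"INSERT INTO {table}(rowid, {column}) VALUES (?, ?)", rows
        )


# Stand-in for the real setup: a plain table instead of a vss0 virtual table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_vectors(embedding BLOB)")

# Hypothetical data: 1000 tiny 3-dimensional vectors.
vectors = [[float(i), float(i) + 0.5, float(i) + 1.0] for i in range(1000)]
bulk_insert(conn, "demo_vectors", "embedding", vectors)

count = conn.execute("SELECT count(*) FROM demo_vectors").fetchone()[0]
print(count)
```

With the real extension, the same `bulk_insert` call would target the virtual table and its vector column instead of the stand-in table.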

It also depends on whether you're using a custom factory or not, so any example code would be great!

asg017 avatar Dec 08 '23 18:12 asg017

I have lost the code, sorry. If I remember right, this was to create the index after all the data had been inserted.

Nintorac avatar Dec 16 '23 04:12 Nintorac