fastembed
ONNXRuntime taking up too much memory
ONNXRuntime takes up too much memory (it seems to accumulate, because I believe it's not freeing up unused memory) when trying to embed large collections of data. Am I missing something, or is this a problem with the runtime itself? I'm trying to embed about 10,000 documents (average size: 3,000 characters) using the JinaAI ColBERT model (late interaction model). GPU: Tesla T4, 16 GB VRAM.
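For context, this is roughly how I'm running the embedding; a minimal sketch, where the exact model name ("jinaai/jina-colbert-v2") and the `providers` option for the CUDA execution provider are assumptions, not something confirmed above:

```python
from fastembed import LateInteractionTextEmbedding

# Assumed model name and GPU option; adjust to whatever your fastembed version supports.
model = LateInteractionTextEmbedding(
    "jinaai/jina-colbert-v2",
    providers=["CUDAExecutionProvider"],  # run the ONNX model on the Tesla T4
)

docs = ["a long document of roughly 3000 characters ..."] * 8  # stand-in for the ~10k documents

# embed() returns a lazy generator; each item is a (num_tokens, 128) array.
for emb in model.embed(docs, batch_size=1):
    print(emb.shape)
```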
What's the batch size you're using? Are you keeping the embeddings in memory, or are you uploading them somewhere else / writing them to disk? ColBERT embeddings are quite large, since ColBERT produces a 128-dim embedding per token.
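For a rough sense of scale (a back-of-envelope estimate, assuming ~4 characters per token and float32 values, not measured numbers):

```python
# ~3000 chars/doc at ~4 chars/token -> ~750 tokens/doc
tokens_per_doc = 3000 // 4
bytes_per_doc = tokens_per_doc * 128 * 4     # 128 dims * 4 bytes (float32) per token
total_gb = bytes_per_doc * 10_000 / 1e9      # all 10k documents
print(f"~{bytes_per_doc / 1e3:.0f} KB per doc, ~{total_gb:.1f} GB for the full collection")
```

That's several gigabytes if everything stays in memory, which is why it matters where the embeddings end up.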
I'm dumping them into pickle files every 1,000 docs; batch size is just 1.
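Roughly like this (a sketch of the loop, with an assumed model name and placeholder documents):

```python
import pickle
from itertools import islice

from fastembed import LateInteractionTextEmbedding

model = LateInteractionTextEmbedding("jinaai/jina-colbert-v2")  # assumed model name
docs = ["a long document ..."] * 10_000                         # placeholder for the real corpus

emb_iter = model.embed(docs, batch_size=1)  # lazy generator, one ndarray per document
chunk_idx = 0
while True:
    chunk = list(islice(emb_iter, 1000))    # pull the next 1000 embeddings
    if not chunk:
        break
    with open(f"embeddings_{chunk_idx:03d}.pkl", "wb") as f:
        pickle.dump(chunk, f)               # write the chunk to disk, then drop it from memory
    chunk_idx += 1
```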
For now, I was able to reproduce the issue, and it does indeed seem to be a problem with onnxruntime not freeing up the space. However, we might need more time to investigate it. Thank you for pointing it out.
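One way to observe the growth locally (a rough sketch that polls GPU memory with pynvml; not the exact reproduction used here, and the model name / GPU option are assumptions):

```python
import pynvml
from fastembed import LateInteractionTextEmbedding

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)

def used_mb() -> float:
    """Current GPU memory usage in MB."""
    return pynvml.nvmlDeviceGetMemoryInfo(handle).used / 1e6

model = LateInteractionTextEmbedding(
    "jinaai/jina-colbert-v2",               # assumed model name
    providers=["CUDAExecutionProvider"],    # assumed GPU option
)

docs = ["some reasonably long document " * 100] * 2000  # synthetic workload

for i, _ in enumerate(model.embed(docs, batch_size=1)):
    if i % 500 == 0:
        # If onnxruntime is not releasing memory, this number keeps climbing instead of plateauing.
        print(f"doc {i}: {used_mb():.0f} MB used")
```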
Hey, any updates on this?
Yeah, we're working on a fix https://github.com/qdrant/fastembed/pull/493
I'm also facing the same issue. Is there any update on this fix?