machinelearning
Bring HNSW for fast approximate nearest neighbour search
Is your feature request related to a problem? Please describe.
LLM applications use cosine similarity to retrieve relevant passages from a corpus as memory. While exact k-nearest-neighbor (KNN) search is the most common way to get the k most similar items, it is time-consuming on high-dimensional vectors because every query has to be compared against every stored vector.
HNSW would be much more efficient at finding the approximate k nearest neighbors in a high-dimensional dataset. There's a C# implementation of HNSW that we can probably leverage.
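To make the cost concrete, here is a minimal sketch of the exact brute-force search described above. It is written in Python purely for illustration and is not taken from ML.NET or from any HNSW library; the names `cosine_similarity` and `brute_force_knn` are hypothetical. Each query scans all N vectors of dimension d, so the work grows as O(N·d) per query, which is the cost an approximate index such as HNSW is meant to avoid.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def brute_force_knn(query, vectors, k):
    """Exact k-nearest-neighbor search by cosine similarity.

    Scores the query against every stored vector, then keeps the k best.
    Returns a list of (similarity, index) pairs, most similar first.
    """
    scored = ((cosine_similarity(query, v), i) for i, v in enumerate(vectors))
    return heapq.nlargest(k, scored)


if __name__ == "__main__":
    # Tiny made-up corpus, only to show the call shape.
    corpus = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
    for similarity, index in brute_force_knn([1.0, 0.1], corpus, k=2):
        print(index, round(similarity, 3))
```

An HNSW index would replace the full scan in `brute_force_knn` with a walk over a layered proximity graph, trading a small amount of recall for far fewer distance computations per query.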
@luisquintanilla do we want to add this to our "Future" backlog?
Maybe. I think we'd have to discuss scenarios and get more feedback here.
I know Bart has been looking into this as well. @bartczernicki Any thoughts here?
https://github.com/bartczernicki/VectorMathAIOptimizations
https://github.com/bartczernicki/hnsw-sharp