hnswlib-node
Returning the same points for every query?
Hey, I don't know if this is something I'm doing wrong or an issue with the library, but I thought I'd flag it and ask for help.
I've created vector embeddings for roughly 275,000 words from an English dictionary using ada-002 and added them to an index with the code below.
Whenever I search with it, it always returns the same set of words regardless of the query embedding.
Is this a problem with the number of embeddings I'm supplying? Am I doing something else wrong?
Here is my code:
import pkg from 'hnswlib-node'
const { HierarchicalNSW } = pkg
export const createIndexCallback = async (name, dimensions, maxElements, callback) => {
  // this needs to *get the element from the callback each time*.
  const index = new HierarchicalNSW('l2', dimensions)
  index.initIndex(maxElements)
  for (let i = 0; i < maxElements; i++) {
    const embedding = await callback(i)
    index.addPoint(embedding, i)
    console.log(`Added ${i} of ${maxElements}`)
  }
  index.writeIndexSync(`${name}.dat`)
  return index
}
export const searchIndex = (name, embedding, k = 5) => {
  const index = new HierarchicalNSW('l2', embedding.length)
  index.readIndexSync(`${name}.dat`)
  const result = index.searchKnn(embedding, k)
  console.table(result)
  return result
}
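One sanity check that might help narrow this down, sketched under assumptions rather than as a confirmed diagnosis: before suspecting the index, it may be worth confirming that the vectors coming out of the callback (and the query vectors) are actually distinct and all have the expected length. `summarizeEmbeddings` is a hypothetical helper, not part of hnswlib-node, and the sample values are made up.

```javascript
// Hypothetical helper (not part of hnswlib-node): counts how many distinct
// vectors a batch contains and how many have an unexpected length.
const summarizeEmbeddings = (embeddings, dimensions) => {
  // Serialize each vector so identical contents collapse to one Set entry.
  const distinct = new Set(embeddings.map((e) => Array.from(e).join(',')))
  const wrongLength = embeddings.filter((e) => e.length !== dimensions).length
  return { total: embeddings.length, distinct: distinct.size, wrongLength }
}

// Made-up 2-dimensional sample; the first two vectors are deliberately identical.
const sample = [[0.1, 0.2], [0.1, 0.2], [0.3, 0.4]]
const summary = summarizeEmbeddings(sample, 2)
console.log(summary)
```

If `distinct` comes out much smaller than `total` for a batch of real embeddings or for a handful of different queries, the problem would likely be upstream of the index.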
~~Perhaps it's that I used `l2` rather than `cosine`. Since it's a high-dimensional space, that could be detecting all of the elements as being equally far apart.~~
That made no difference.
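In hindsight this may be expected. As far as I know, ada-002 embeddings are normalized to unit length, and for unit vectors the squared L2 distance equals 2 * (1 - cosine similarity), so both metrics should order neighbours the same way. A minimal sketch with made-up 3-dimensional vectors standing in for real embeddings (plain JavaScript, no hnswlib-node calls):

```javascript
// Sketch: for unit-length vectors, ||a - b||^2 = 2 * (1 - cos(a, b)),
// so ranking by L2 distance and ranking by cosine distance agree.
const dot = (a, b) => a.reduce((sum, x, i) => sum + x * b[i], 0)
const norm = (a) => Math.sqrt(dot(a, a))
const normalize = (a) => {
  const n = norm(a)
  return a.map((x) => x / n)
}
const squaredL2 = (a, b) => a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0)
const cosineDistance = (a, b) => 1 - dot(a, b) / (norm(a) * norm(b))

// Toy vectors (made-up values), normalized to unit length.
const query = normalize([0.2, 0.9, 0.1])
const points = [[1, 0, 0], [0.1, 1, 0.2], [0.5, 0.5, 0.5]].map(normalize)

// Returns point labels ordered from nearest to farthest under `distance`.
const rank = (distance) =>
  points
    .map((p, label) => ({ label, d: distance(query, p) }))
    .sort((x, y) => x.d - y.d)
    .map((r) => r.label)

const l2Ranking = rank(squaredL2)
const cosineRanking = rank(cosineDistance)
console.log(l2Ranking, cosineRanking)
```

So if the inputs are already unit length, switching the space name alone would not be expected to change which neighbours come back.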