xiongqiangcs
> We modified hnswlib to support incremental updates. Does the modified hnswlib support inserting and querying at the same time?
addPoint modifies linkLists_ https://github.com/vearch/gamma/blob/87e90349e385e5392089b5f5ed88861d55558781/index/impl/hnswlib/hnswalg.h#L556-L581 and searchKnn reads linkLists_ https://github.com/vearch/gamma/blob/87e90349e385e5392089b5f5ed88861d55558781/index/impl/hnswlib/hnswalg.h#L1186-L1194 These reads and writes of linkLists_ are not protected by a lock, so how is concurrent insertion and querying from multiple threads kept safe?
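Understood: it is a single-writer, multiple-reader design in which reader threads are allowed to see stale values. That suits read-heavy, write-light workloads; in write-heavy, read-light workloads it will affect recall.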
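To make the recall point concrete, here is a minimal toy sketch in Python. It is not the gamma or hnswlib implementation, and it models a "stale read" in the simplest possible way: the reader searches a snapshot that is missing the most recently inserted points. All names (`brute_force_knn`, `recall_at_k`, `mean_recall`) and all sizes are hypothetical, chosen only for illustration, and exact brute-force search stands in for the graph search.

```python
import random


def brute_force_knn(points, query, k):
    """Return the ids of the k points closest to query (squared L2 distance)."""
    def dist(point_id):
        return sum((a - b) ** 2 for a, b in zip(points[point_id], query))
    return sorted(points, key=dist)[:k]


def recall_at_k(found, truth):
    """Fraction of the true neighbours that the search returned."""
    return len(set(found) & set(truth)) / len(truth)


rng = random.Random(0)
dim, n_total, n_visible, k = 8, 200, 150, 10

# Full data set: what a reader would see once every insert has completed.
all_points = {i: [rng.random() for _ in range(dim)] for i in range(n_total)}

# Stale view (assumption of this toy model): the reader has not yet
# observed the last n_total - n_visible inserts.
stale_view = {i: v for i, v in all_points.items() if i < n_visible}

queries = [[rng.random() for _ in range(dim)] for _ in range(20)]


def mean_recall(view):
    """Average recall@k of searching `view`, judged against the full data."""
    total = 0.0
    for q in queries:
        truth = brute_force_knn(all_points, q, k)
        found = brute_force_knn(view, q, k)
        total += recall_at_k(found, truth)
    return total / len(queries)


fresh_recall = mean_recall(all_points)
stale_recall = mean_recall(stale_view)
print(f"fresh view recall@{k}: {fresh_recall:.2f}")
print(f"stale view recall@{k}: {stale_recall:.2f}")
```

The larger the share of writes that readers have not yet observed, the more true neighbours can be missing from the results, which is why a write-heavy workload is the unfavourable case for this design.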
I encountered the same problem. You might want to consider upgrading to a GPU with more memory.
> You can launch a custom javascript to login with the token. > > Launch `bulkai create-session` command and when the login window appears open developer tools (F12) and paste...
>  After comparison, I found that there are some differences in parameter passing between this project and using Discord for parameter passing, as shown in the figure. This is...
I think the faiss README is a recommendation rather than a standard; it's a tradeoff between performance and recall. Examples: http://ann-benchmarks.com/sift-128-euclidean_10_euclidean.html but autofaiss https://github.com/criteo/autofaiss/blob/d5c773fa8ab78ae0dddb22cad60832c55eadc999/autofaiss/external/optimize.py#L174-L178 I recommend randomly sampling the embeddings...
> Building an hnsw is indeed one of the slowest adding methods, especially with random vectors. This is calling faiss index.add > > If you want to optimize for speed...
Do the embedding_reader parameters max_piece_size and parallel_pieces need to be tuned? https://github.com/criteo/autofaiss/blob/d5c773fa8ab78ae0dddb22cad60832c55eadc999/autofaiss/indices/build.py#L98-L102
> What kind of local disk do you have? SSD