autofaiss: consider using distributed k-means in distributed mode for better training

Open. rom1504 opened this issue 2 years ago • 1 comment

https://github.com/facebookresearch/faiss/blob/b8fe92dfee9ea6f9c8cae27e4fc3ffeb12b5c4d2/benchs/distributed_ondisk/README.md#distributed-k-means

Mar 07 '22 01:03 rom1504
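
For readers unfamiliar with the idea, here is a minimal sketch of one way k-means can be distributed. It is not the Faiss code from the linked README and not autofaiss code; it only illustrates the general map/reduce pattern. Each shard of the training set computes per-centroid partial sums and counts, and a reducer combines them into new centroids. The function names (`assign_and_accumulate`, `reduce_partials`, `distributed_kmeans`) are hypothetical, and the shards are processed in a plain loop standing in for remote workers.

```python
import numpy as np


def assign_and_accumulate(shard, centroids):
    """Map step (hypothetical helper): run on one worker over its shard."""
    # Squared L2 distance from every vector in the shard to every centroid.
    d2 = ((shard[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    k, dim = centroids.shape
    sums = np.zeros((k, dim), dtype=np.float64)
    # Unbuffered add so repeated labels accumulate correctly.
    np.add.at(sums, labels, shard)
    counts = np.bincount(labels, minlength=k)
    return sums, counts


def reduce_partials(partials, old_centroids):
    """Reduce step (hypothetical helper): combine per-shard sums and counts."""
    sums = sum(p[0] for p in partials)
    counts = sum(p[1] for p in partials)
    new_centroids = np.asarray(old_centroids, dtype=np.float64).copy()
    nonempty = counts > 0
    # Empty clusters keep their previous centroid in this sketch.
    new_centroids[nonempty] = sums[nonempty] / counts[nonempty, None]
    return new_centroids


def distributed_kmeans(shards, k, n_iter=10, seed=0):
    """Lloyd iterations where only sums and counts cross shard boundaries."""
    rng = np.random.default_rng(seed)
    first = shards[0]
    # Assumption: initialise from k distinct vectors of the first shard.
    init_ids = rng.choice(len(first), size=k, replace=False)
    centroids = first[init_ids].astype(np.float64)
    for _ in range(n_iter):
        # In a real deployment each call would run on a different machine.
        partials = [assign_and_accumulate(s, centroids) for s in shards]
        centroids = reduce_partials(partials, centroids)
    return centroids
```

The point of the pattern is that training can see the whole dataset without any single machine holding it: per iteration, each worker only sends back a `k x dim` array of sums and a length-`k` array of counts.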

The guide at https://github.com/facebookresearch/faiss/tree/main/benchs/distributed_ondisk is very nice in general. In particular, their concept of vertical slices, or vslices (what we do with subindices in our merging strategy), vs. hslices (they split the IVF into slices of inverted lists in order to distribute the index) is really interesting for sharding the index between multiple machines (they used that for a 1T-item index POC).

Mar 09 '22 22:03 rom1504
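
To make the vslice/hslice distinction above concrete, here is a small illustrative sketch. It does not use Faiss or autofaiss APIs; `split_range` and `machines_to_query` are hypothetical helpers, and the even split into contiguous ranges is an assumption. A vslice is a range of vector ids (each sub-index holds some of the vectors but all inverted lists), while an hslice is a range of inverted-list ids (each machine holds all vectors of some lists).

```python
def split_range(n, parts):
    """Split range(n) into `parts` contiguous ranges of near-equal size."""
    base, extra = divmod(n, parts)
    out, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        out.append(range(start, start + size))
        start += size
    return out


def machines_to_query(probed_list_ids, hslices):
    """Return the hslice indices that own at least one probed inverted list."""
    owners = set()
    for list_id in probed_list_ids:
        for machine, lists in enumerate(hslices):
            if list_id in lists:
                owners.add(machine)
                break
    return owners


# vslices: partition the vectors; every sub-index still has all inverted lists.
vslices = split_range(1_000_000, 4)
# hslices: partition the inverted lists; each machine serves a range of list ids.
hslices = split_range(8, 2)
# A query probing lists 1 and 2 only needs the machine owning lists 0..3.
print(machines_to_query([1, 2], hslices))
```

With this layout, a query that probes a few inverted lists only has to contact the machines owning those lists, whereas with vslices every sub-index has to be searched.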
