Creating an array of faiss.Kmeans objects that use the GPU
Context: I have a dataset with, say, 1,000 classes, and I want to perform K-means on the GPU for each class. E.g. (sketch):
import faiss

class KMeansModule:
    def __init__(self, nb_classes, dimensionality=256, n_iter=10, k=5, max_iter=300):
        self.k = k
        self.d = dimensionality
        self.n_iter = n_iter
        # One GPU K-means object per class
        self.n_kmeans = [faiss.Kmeans(d=dimensionality, k=k, niter=1, gpu=True, verbose=True)
                         for _ in range(nb_classes)]

    def assign(self, x_i, y_i):
        # x_i: float32 array of shape (n, d) holding the vectors of class y_i
        # Train the K-means model of class y_i for one iteration to initialize centroids
        self.n_kmeans[y_i].train(x_i)
        # Assign vectors to the nearest cluster centroid
        D, I = self.n_kmeans[y_i].index.search(x_i, 1)
        return D, I
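For context, usage would look roughly like this (a hypothetical snippet; x_i is assumed to be a float32 array of vectors that all belong to class y_i):

import numpy as np

km = KMeansModule(nb_classes=1000)
x_i = np.random.rand(50, 256).astype('float32')  # e.g. 50 vectors of one class
D, I = km.assign(x_i, y_i=3)                     # distances and cluster assignments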
But then I came across this, which states: "All GPU indexes are built with a StandardGpuResources object (which is an implementation of the abstract class GpuResources). The resource object contains needed resources for each GPU in use, including an allocation of temporary scratch space (by default, about 2 GB on a 12 GB GPU), cuBLAS handles and CUDA streams."
Therefore I was worried about running into memory issues, considering that I will have one K-means object per class. Is there any way to modify those settings? Should I be worried at all?
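What I was thinking of trying instead (a rough sketch based on my reading of the docs, assuming a single GPU on device 0; the 256 MB cap and the Clustering-based loop are only illustrative): skip gpu=True in faiss.Kmeans and drive faiss.Clustering with one shared StandardGpuResources whose scratch space is capped with setTempMemory:

import faiss

# One resource object shared by all classes, instead of one per Kmeans object
res = faiss.StandardGpuResources()
res.setTempMemory(256 * 1024 * 1024)  # cap temporary scratch space at 256 MB

cfg = faiss.GpuIndexFlatConfig()
cfg.device = 0

def cluster_one_class(x, d=256, k=5):
    # x: float32 array of shape (n, d)
    # The low-level Clustering object lets us supply our own GPU index,
    # so every class reuses the same StandardGpuResources.
    clus = faiss.Clustering(d, k)
    clus.niter = 1
    index = faiss.GpuIndexFlatL2(res, d, cfg)
    clus.train(x, index)
    centroids = faiss.vector_float_to_array(clus.centroids).reshape(k, d)
    D, I = index.search(x, 1)
    return centroids, D, I

But I'm not sure whether this is the recommended way to control the temporary allocation, or whether having 1,000 faiss.Kmeans objects with gpu=True is actually a problem in the first place.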
Suggestions are appreciated. Thanks in advance.