[BUG] Pickle Approximate NearestNeighbors models

Open tfeher opened this issue 2 years ago • 3 comments

Describe the bug

Approximate nearest neighbor models ('ivfflat', 'ivfpq') store their state in a knnIndex object. Currently there is no support for pickling models that were fitted using these algorithms. The error only shows up when predicting with the loaded model.

Steps/Code to reproduce bug

import cudf
from cuml.neighbors import NearestNeighbors
from cuml.datasets import make_blobs
X, _ = make_blobs(n_samples=50, centers=5, n_features=10, random_state=42)
X_cudf = cudf.DataFrame(X)

# fit model
model = NearestNeighbors(n_neighbors=3, algorithm='ivfflat')
model.fit(X)

# pickle the model; the context manager makes sure the file is flushed and closed
import pickle
with open("ann_model.pkl", "wb") as f:
    pickle.dump(model, f)

Now start a new process (e.g. new Jupyter kernel)

import cudf
from cuml.neighbors import NearestNeighbors
from cuml.datasets import make_blobs
X, _ = make_blobs(n_samples=50, centers=5, n_features=10, random_state=42)
X_cudf = cudf.DataFrame(X)

import pickle
# load the file written by the first process ("ann_model.pkl")
with open("ann_model.pkl", "rb") as f:
    model_loaded = pickle.load(f)

distances2, indices2 = model_loaded.kneighbors(X_cudf)

This will result in the process dying. This is probably due to accessing the model state through the knnIndex pointer, which was saved and restored as a plain integer value and no longer points to a valid object once the process is restarted. (One can see this by inspecting the 'knn_index' value in the dict returned by model.__getstate__().)
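The suspected failure mode can be illustrated without a GPU. The following is a minimal pure-Python sketch, not cuML code: `_LIVE_OBJECTS` and `HandleModel` are hypothetical stand-ins, with an integer key playing the role of the raw knnIndex pointer. Clearing the registry simulates restarting the process.

```python
import pickle

# Hypothetical per-process registry standing in for native (GPU) memory.
_LIVE_OBJECTS = {}


class HandleModel:
    """Toy model that keeps only an integer handle to its index data."""

    def fit(self, data):
        handle = len(_LIVE_OBJECTS) + 1
        _LIVE_OBJECTS[handle] = list(data)
        self.knn_index = handle  # plain int, like a raw pointer value
        return self

    def query(self):
        # Dereference the handle; fails if the target no longer exists.
        return _LIVE_OBJECTS[self.knn_index]


model = HandleModel().fit([1.0, 2.0, 3.0])
blob = pickle.dumps(model)  # only the integer handle is stored

_LIVE_OBJECTS.clear()  # simulate starting a new process
restored = pickle.loads(blob)

try:
    restored.query()
    dangling = False
except KeyError:
    dangling = True  # the handle no longer refers to anything
```

In this toy version the stale handle raises a Python exception. If the real value is dereferenced as a native pointer, that would explain why the process dies instead.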

Expected behavior

Pickling and loading the model should work. To achieve this, ANN models need to serialize / deserialize their knnIndex object while pickling the model.
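One way this could look on the Python side is sketched below. It is only an illustration under stated assumptions: `NativeIndex`, its `serialize` / `deserialize` methods, the `knn_index_bytes` key and `ANNModel` are all hypothetical names, and nothing here reflects an existing cuML or FAISS API. The idea is that `__getstate__` replaces the pointer with the serialized index contents and `__setstate__` rebuilds a fresh index from them.

```python
import pickle


class NativeIndex:
    """Hypothetical stand-in for a native index that owns its own data."""

    def __init__(self, vectors):
        self.vectors = [list(v) for v in vectors]

    def serialize(self):
        # Hypothetical: dump the index contents to bytes.
        return pickle.dumps(self.vectors)

    @classmethod
    def deserialize(cls, payload):
        # Hypothetical: rebuild a new index from the stored bytes.
        return cls(pickle.loads(payload))


class ANNModel:
    def __init__(self, n_neighbors=3):
        self.n_neighbors = n_neighbors
        self.knn_index = None

    def fit(self, X):
        self.knn_index = NativeIndex(X)
        return self

    def __getstate__(self):
        # Store the index contents instead of a process-local handle.
        state = self.__dict__.copy()
        index = state.pop("knn_index")
        state["knn_index_bytes"] = None if index is None else index.serialize()
        return state

    def __setstate__(self, state):
        # Recreate the index in the loading process.
        payload = state.pop("knn_index_bytes")
        self.__dict__.update(state)
        self.knn_index = None if payload is None else NativeIndex.deserialize(payload)


model = ANNModel().fit([[0.0, 1.0], [2.0, 3.0]])
restored = pickle.loads(pickle.dumps(model))
```

With this pattern the pickled state contains only data that is valid in any process, so the loaded model does not depend on memory owned by the process that fitted it.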

Environment details (please complete the following information):

  • Tested using 22.04 conda packages.

tfeher avatar May 19 '22 09:05 tfeher

Thank you for spotting that. Unfortunately, it looks like there is no simple solution for this right now. Indeed, the knnIndex struct contains GPU resources handled by FAISS. However, if we develop our own ANN algorithms, it might become easier to serialize the necessary data.

viclafargue avatar Jun 02 '22 14:06 viclafargue

Yes, I agree. We can return to this question after https://github.com/rapidsai/raft/pull/652

tfeher avatar Jun 03 '22 07:06 tfeher

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

github-actions[bot] avatar Jul 03 '22 08:07 github-actions[bot]

This issue has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.

github-actions[bot] avatar Oct 01 '22 08:10 github-actions[bot]