KeyError using metric='precomputed'
Here's my code:
import umap
import numpy as np
import pandas as pd
cossims = pd.read_feather("wiki_rule_cosinesimilarities.feather")
distmat = 1 - np.matrix(cossims.iloc[:,0:cossims.shape[0]],'double')
my_model = umap.UMAP(metric='precomputed')
my_model.fit(X=distmat)
I'm attaching the data in a zipfile: wiki_rule_cosinesimilarities.feather.zip
Thank you!
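I think the issue here is that your distance matrix has negative values. I ran your snippet and checked
np.min(distmat), which returned -1.1920928955078125e-07. A value that small is most likely floating-point rounding: one of the cosine similarities comes out slightly above 1, so 1 - similarity dips just below zero.
When I clipped the matrix at 0, the error was no longer thrown. Here's the code I used: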
import umap
import numpy as np
import pandas as pd
cossims = pd.read_feather("../../Downloads/wiki_rule_cosinesimilarities.feather")
distmat = 1 - np.matrix(cossims.iloc[:,0:cossims.shape[0]],'double')
# Clip the tiny negative values up to 0; np.asarray also turns the np.matrix into a plain ndarray
distmat = np.clip(np.asarray(distmat), 0, None)
my_model = umap.UMAP(metric='precomputed')
my_model.fit(X=distmat)
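If you'd rather not clip blindly, here is a minimal sketch of a check to run before handing the matrix to UMAP. The helper name `sanitize_distances` and the `tol` threshold are my own assumptions, not part of umap; the idea is to clip only negatives small enough to be rounding noise and to fail loudly on anything larger, since that would point to a real problem in the similarities.

```python
import numpy as np

def sanitize_distances(distmat, tol=1e-6):
    """Return a square ndarray of distances with tiny negatives clipped to 0.

    `tol` is an assumed threshold for "rounding noise"; adjust it for your data.
    """
    d = np.asarray(distmat, dtype=np.float64)
    if d.ndim != 2 or d.shape[0] != d.shape[1]:
        raise ValueError(f"expected a square matrix, got shape {d.shape}")
    min_val = d.min()
    if min_val < -tol:
        # Too negative to be rounding error; don't hide it by clipping.
        raise ValueError(f"distance matrix has a large negative value: {min_val}")
    return np.clip(d, 0.0, None)

# Toy cosine similarities where one entry is slightly above 1,
# mimicking the rounding seen in the real data.
cossims = np.array([[1.0, 0.25],
                    [0.25, 1.0 + 1.2e-7]])
distmat = sanitize_distances(1 - cossims)
print(distmat.min())
```

The result can then be passed to `umap.UMAP(metric='precomputed').fit(...)` exactly as in the snippet above.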