KeyError using metric='precomputed'

Open · groceryheist opened this issue 3 years ago · 1 comment

Here's my code, which raises a KeyError:

import umap
import numpy as np
import pandas as pd
cossims = pd.read_feather("wiki_rule_cosinesimilarities.feather")
distmat = 1 - np.matrix(cossims.iloc[:,0:cossims.shape[0]],'double')
my_model = umap.UMAP(metric='precomputed')
my_model.fit(X=distmat)

I'm attaching the data as a zip file: wiki_rule_cosinesimilarities.feather.zip

Thank you!

groceryheist, Apr 19 '22 22:04

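I think the issue here is that your distance matrix has negative values. I ran your snippet and checked np.min(distmat), which returned -1.1920928955078125e-07. That magnitude is float32 machine epsilon, so it looks like rounding error from a cosine similarity slightly above 1.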

When I clipped the matrix at 0, the error was no longer thrown. Here's the code I used:

import umap
import numpy as np
import pandas as pd
cossims = pd.read_feather("../../Downloads/wiki_rule_cosinesimilarities.feather")
distmat = 1 - np.matrix(cossims.iloc[:,0:cossims.shape[0]],'double')
# Clip the tiny negative entries to 0; None leaves the upper end unbounded
distmat = np.clip(np.asarray(distmat), 0, None)
my_model = umap.UMAP(metric='precomputed')
my_model.fit(X=distmat)
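
For anyone who wants to reproduce the rounding problem without the attached data, here is a minimal sketch. It assumes the negative entries come from float32 similarities that land one epsilon above 1. The helper name clean_distance_matrix and its tol parameter are hypothetical and not part of umap. The sketch only prepares the matrix and does not call umap itself.

```python
import numpy as np


def clean_distance_matrix(dist, tol=1e-6):
    """Clip tiny negative rounding errors to 0 (hypothetical helper, not a umap API).

    Entries below -tol are treated as real errors rather than rounding noise.
    """
    dist = np.asarray(dist, dtype=np.float64)  # also converts np.matrix to ndarray
    if dist.ndim != 2 or dist.shape[0] != dist.shape[1]:
        raise ValueError("expected a square distance matrix")
    if dist.min() < -tol:
        raise ValueError(f"negative distance {dist.min()} is too large to be rounding error")
    return np.clip(dist, 0.0, None)


# A toy similarity matrix whose first diagonal entry is one float32 epsilon above 1.
eps = np.finfo(np.float32).eps
cossims = np.array([[1.0 + eps, 0.5], [0.5, 1.0]], dtype=np.float32)

distmat = 1 - cossims.astype(np.float64)
print(distmat.min())   # slightly negative before cleaning

cleaned = clean_distance_matrix(distmat)
print(cleaned.min())   # no negative entries remain
```

The cleaned array can then be passed to umap.UMAP(metric='precomputed').fit in place of distmat, as in the snippet above. The tolerance check is a design choice: it keeps clipping from silently hiding a distance matrix that is wrong for some other reason.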

ggdna, Feb 22 '24 13:02