
precomputed knn

Open jlmelville opened this issue 1 year ago • 0 comments

Hello, is there a way to use k-nearest neighbors data created externally? My current strategy is to create a dummy class of the form:

class PrecomputedKNNIndex:
    def __init__(self, indices, distances):
        self.indices = indices
        self.distances = distances
        self.k = indices.shape[1]

    def build(self):
        return self.indices, self.distances

    def query(self, query, k):
        raise NotImplementedError("No query with a pre-computed knn")

    def check_metric(self, metric):
        # The neighbors are already computed, so accept any metric
        # (string or callable) unchanged.
        return metric

and use it like:

import openTSNE

perplexity = 30
data = get_data_from_somewhere()

n_neighbors = min(data.shape[0] - 1, int(3 * perplexity))
# assume this doesn't return the "self" neighbor as the first item in the knn
indices, dists = get_nn_from_somewhere(data, n_neighbors)
knn = PrecomputedKNNIndex(indices, dists)

affinities = openTSNE.affinity.PerplexityBasedNN(
    perplexity=perplexity,
    knn_index=knn,
)
embedder = openTSNE.TSNE(n_components=2)
embedded = embedder.fit(data, affinities=affinities)

This seems to work perfectly well; I just wondered whether I am missing a more obvious approach.
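
For concreteness, here is a minimal sketch of what the placeholder `get_nn_from_somewhere` might look like. It assumes brute-force Euclidean distances computed with NumPy, which is only an illustration: in practice the neighbors would come from whatever external library produced them. The function body is hypothetical and not part of openTSNE. It returns the neighbors sorted by increasing distance and leaves each point out of its own neighbor list, matching the assumption stated in the comment above.

```python
import numpy as np


def get_nn_from_somewhere(data, n_neighbors):
    """Illustrative brute-force Euclidean kNN that excludes the "self" neighbor."""
    data = np.asarray(data, dtype=float)
    # Pairwise squared distances via ||a||^2 + ||b||^2 - 2 a.b
    sq_norms = np.sum(data ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (data @ data.T)
    # Clip tiny negative values caused by floating-point rounding
    np.maximum(sq_dists, 0.0, out=sq_dists)
    # Push each point to the end of its own ranking so it is never selected
    np.fill_diagonal(sq_dists, np.inf)
    # Indices of the n_neighbors closest points, sorted by increasing distance
    indices = np.argsort(sq_dists, axis=1)[:, :n_neighbors]
    distances = np.sqrt(np.take_along_axis(sq_dists, indices, axis=1))
    return indices, distances
```

Both returned arrays have shape `(n_samples, n_neighbors)`, which is what the dummy class relies on when it reads `indices.shape[1]`. The full pairwise matrix makes this quadratic in memory, so it is only suitable for small datasets.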

jlmelville, Sep 17 '22 19:09