tapkee
bh-SNE with custom distance callback
Using method=tDistributedStochasticNeighborEmbedding in combination with withDistance() is not supported.
Laurens van der Maaten says that to use a custom metric, the Vantage-Point Tree needs to be changed (see here). Note that this only applies to the Barnes-Hut algorithm; the exact algorithm uses no VPTree and has its own distance computation in tsne.hpp.
Interestingly, tapkee already comes with an alternative VPTree implementation that supports the use of a distance callback. It also looks quite compatible.
Could the method be altered to use the functionality of neighbors/vptree.hpp and enable withDistance()?
We would need a search method in VantagePointTree that also returns the distances, e.g.:
std::vector<std::pair<IndexType, double>> search(const RandomAccessIterator& target, int k)
And then in the method basically only replace one line:
results.push_back({items[heap.top().index]-begin, heap.top().distance});
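To make the idea concrete, here is a minimal, self-contained sketch of a vantage-point tree whose search returns (index, distance) pairs and takes the distance as a callback. This is not tapkee's neighbors/vptree.hpp: the class name SketchVPTree, the index-based callback signature, and the helper searchNode are all hypothetical and only illustrate the shape of the change. One assumption matters for correctness: the pruning relies on the triangle inequality, so the callback has to be a true metric.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <memory>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical, simplified VP tree over the indices 0..n-1 (not tapkee's class).
// The callback must be a metric: pruning below uses the triangle inequality.
class SketchVPTree {
public:
    using Distance = std::function<double(std::size_t, std::size_t)>;

    SketchVPTree(std::size_t n, Distance distance)
        : items_(n), distance_(std::move(distance)) {
        assert(static_cast<bool>(distance_));  // a callable must be supplied
        for (std::size_t i = 0; i < n; ++i) items_[i] = i;
        root_ = build(0, n);
    }

    // Returns up to k (index, distance) pairs, nearest first.
    // The target is itself an indexed point, so it is reported at distance 0.
    std::vector<std::pair<std::size_t, double>> search(std::size_t target,
                                                       std::size_t k) const {
        std::vector<std::pair<std::size_t, double>> results;
        if (k == 0) return results;
        std::priority_queue<HeapItem> heap;  // max-heap on distance
        double tau = std::numeric_limits<double>::max();
        searchNode(root_.get(), target, k, heap, tau);
        while (!heap.empty()) {
            // The one line that changes: keep the distance next to the index.
            results.push_back({heap.top().index, heap.top().distance});
            heap.pop();
        }
        std::reverse(results.begin(), results.end());
        return results;
    }

private:
    struct Node {
        std::size_t item = 0;
        double threshold = 0.0;
        std::unique_ptr<Node> inner, outer;
    };
    struct HeapItem {
        std::size_t index;
        double distance;
        bool operator<(const HeapItem& o) const { return distance < o.distance; }
    };

    std::unique_ptr<Node> build(std::size_t lower, std::size_t upper) {
        if (lower == upper) return nullptr;
        auto node = std::make_unique<Node>();
        const std::size_t vp = items_[lower];  // first item is the vantage point
        node->item = vp;
        if (upper - lower > 1) {
            // Split the remaining items at the median distance to the vantage point.
            const std::size_t median = (lower + upper + 1) / 2;
            std::nth_element(items_.begin() + lower + 1, items_.begin() + median,
                             items_.begin() + upper,
                             [&](std::size_t a, std::size_t b) {
                                 return distance_(vp, a) < distance_(vp, b);
                             });
            node->threshold = distance_(vp, items_[median]);
            node->inner = build(lower + 1, median);
            node->outer = build(median, upper);
        }
        return node;
    }

    void searchNode(const Node* node, std::size_t target, std::size_t k,
                    std::priority_queue<HeapItem>& heap, double& tau) const {
        if (!node) return;
        const double dist = distance_(node->item, target);
        if (dist < tau) {
            if (heap.size() == k) heap.pop();
            heap.push({node->item, dist});
            if (heap.size() == k) tau = heap.top().distance;
        }
        if (dist < node->threshold) {
            searchNode(node->inner.get(), target, k, heap, tau);
            if (dist + tau >= node->threshold)
                searchNode(node->outer.get(), target, k, heap, tau);
        } else {
            searchNode(node->outer.get(), target, k, heap, tau);
            if (dist - tau <= node->threshold)
                searchNode(node->inner.get(), target, k, heap, tau);
        }
    }

    std::vector<std::size_t> items_;
    Distance distance_;
    std::unique_ptr<Node> root_;
};
```

With something like this, the caller gets the neighbor distances from the same traversal that found the neighbors, so nothing downstream has to recompute them with a hard-coded Euclidean formula. Whether the rest of the Barnes-Hut code can consume these pairs unchanged is something I have not verified.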
That looks promising, thanks for your suggestions!
I do not have a good understanding of what would happen if we used a non-Euclidean distance. I will have to check.
I think for some data with special characteristics it could be beneficial to try L1 or EMD, but I need to see for myself.