Leland McInnes
Many algorithms, such as word2vec, result in nearest neighbor computations based on cosine similarity. Unfortunately, since cosine (dis)similarity is not a metric, it can't be used with kd-trees and ball-trees....
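To illustrate why cosine dissimilarity is not a metric, here is a minimal sketch (not taken from the issue itself) that checks the triangle inequality on three hand-picked 2D vectors. The function names `cosine_distance` and `angular_distance` are hypothetical helpers written for this example. Angular distance is shown only as one well-known metric derived from cosine similarity, not necessarily the approach the issue goes on to propose.

```python
import numpy as np

def cosine_distance(u, v):
    # Cosine dissimilarity: 1 - cos(angle between u and v).
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def angular_distance(u, v):
    # Angle between u and v in radians; this does satisfy the triangle inequality.
    cos_sim = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(np.clip(cos_sim, -1.0, 1.0))

# Three vectors at 0, 45 and 90 degrees.
a = np.array([1.0, 0.0])
b = np.array([1.0, 1.0])
c = np.array([0.0, 1.0])

# Triangle inequality would require d(a, c) <= d(a, b) + d(b, c).
cosine_violates = cosine_distance(a, c) > cosine_distance(a, b) + cosine_distance(b, c)

# Small tolerance because the angular case is an exact equality up to rounding.
angular_holds = angular_distance(a, c) <= angular_distance(a, b) + angular_distance(b, c) + 1e-12

print("cosine distance violates the triangle inequality:", cosine_violates)
print("angular distance satisfies the triangle inequality:", angular_holds)
```

Tree-based indexes prune branches using the triangle inequality, which is why a dissimilarity that breaks it, as cosine does here, cannot be used with them directly.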
I've been working on benchmarking pynndescent on other metrics such as Jaccard, and have been using ``ann-benchmarks`` and the ``kosarak`` dataset for that. Some recent PRs (#235 and #238) have...
Clustering scores like silhouette work well for K-Means but make less sense for density-based clustering techniques like DBSCAN, which support arbitrary cluster shapes. It would be nice to include...
Currently PyDeck has support for custom layers. See, for example, their [documentation here](https://deckgl.readthedocs.io/en/latest/custom_layers.html). This can be quite powerful since Deck.gl itself makes it relatively easy to build custom layers via...
I have been endeavoring to get more ANN libraries working with sparse Jaccard data in [ann-benchmarks](https://github.com/erikbern/ann-benchmarks). There are surprisingly few libraries that support this. I was pleased to see that...
It seems a recent round of ann-benchmarks has seen pynndescent fail quite badly. It is not so much a performance degradation as being somewhat broken. I'm looking into it, but...
First of all, thanks for all the work on pomegranate! Possibly related to #402 there seems to be an issue (or a failure of understanding on my part -- also...
As per #81 there are some issues with how transform manages to deal with sparse/dense data. This should be made consistent so that you can (ideally) use any mix of...
The ability to overlay text layers for labelling / annotation would be very useful. Some examples can be seen in ThisNotThat (see the plot at the bottom of [this page](https://thisnotthat.readthedocs.io/en/latest/joint_vector_cluster_labels.html)) or Atlas...