Leland McInnes

487 comments by Leland McInnes

There remain some issues with this -- it's still being worked on. Apologies. In the meantime it is worth noting that if you normalize your vectors then euclidean distance is...

If you can figure out the "right" way to make it work I would appreciate it. There are a few approaches, but they each have some drawbacks. I would...

One can make custom kd-trees and ball-trees that support angular distance (I have done this, but not folded it in -- I don't want to support essentially duplicates of code...

I understand, but that is simply one of the issues with computing an all-pairs distance matrix. I'll see if I can work out something else reasonable. On Thu, Mar 9,...

The casting to doubles -- in general pairwise distance matrices are expensive and not a good way to go if you can help it. I should make that easier however....
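To give a rough sense of why an all-pairs distance matrix gets expensive, here is a back-of-the-envelope sketch. It assumes a dense square matrix stored as 8-byte doubles; the helper name `dense_distance_matrix_bytes` is hypothetical and not part of any library.

```python
def dense_distance_matrix_bytes(n_samples: int, bytes_per_value: int = 8) -> int:
    """Memory for a dense n x n distance matrix (8 bytes per value = float64)."""
    return n_samples * n_samples * bytes_per_value


# Memory grows quadratically with the number of samples.
for n in (1_000, 10_000, 100_000):
    gib = dense_distance_matrix_bytes(n) / 2**30
    print(f"{n:>7} samples -> {gib:.2f} GiB as float64")
```

Ten times as many samples means a hundred times as much memory, which is why avoiding the full matrix is usually the better route when it is possible.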

Sadly at this time, no. Current efficient methods require the use of space trees which, in turn, require metrics that admit the triangle inequality. Unfortunately this is not true of...

There are no noob questions. I honestly can't say for sure -- it depends on how the document vectors were generated. If they are the result of PCA (or similar)...

Nothing jumps out at me. The likely cause is that most of the time is spent shuffling memory around to and from the CPU. These days memory bandwidth to the...

You would need to normalize across samples to make euclidean approximately the same as cosine -- you are essentially projecting all the samples onto a unit n-sphere, and then (small)...
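A minimal NumPy sketch of the projection described above, reading "normalize" as scaling each sample vector to unit length. The data is random and purely illustrative. For unit vectors, squared euclidean distance equals twice the cosine distance, so the two metrics rank neighbours identically.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 16))  # 5 made-up sample vectors, 16 dimensions each

# Scale every sample (row) to unit length: a projection onto the unit sphere.
X_unit = X / np.linalg.norm(X, axis=1, keepdims=True)

# Cosine distance between all pairs (1 - cosine similarity).
cos_dist = 1.0 - X_unit @ X_unit.T

# Squared euclidean distance between all pairs of the normalized vectors.
diff = X_unit[:, None, :] - X_unit[None, :, :]
sq_euclid = np.sum(diff ** 2, axis=-1)

# For unit vectors: ||a - b||^2 = 2 - 2 * (a . b) = 2 * cosine distance.
assert np.allclose(sq_euclid, 2.0 * cos_dist)
```

Since the relationship is monotonic, nearest-neighbour orderings under euclidean distance on the normalized data match those under cosine distance on the original data.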

I believe this is a bug, and not the intended behaviour.