rrcf
Dealing with a data stream of constant values over a period
In some cases, a stream may keep producing constant values for a while. When that happens, xmin == xmax in every dimension, so l is all zeros and l /= l.sum() evaluates 0/0, giving l = nan and leading to an exception in the following code:
def _cut(self, X, S, parent=None, side='l'):
    # Find max and min over all d dimensions
    xmax = X[S].max(axis=0)
    xmin = X[S].min(axis=0)
    # Compute l
    l = xmax - xmin
    l /= l.sum()
Any suggestions for handling this special case gracefully?
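For reference, the nan can be reproduced outside the library with NumPy alone. This is a minimal sketch: the array X and the mask S below are made-up stand-ins for the arguments of _cut, not data from a real stream.

```python
import numpy as np

# Hypothetical input: five identical 3-dimensional points, all selected.
X = np.full((5, 3), 7.0)
S = np.ones(5, dtype=bool)

# Same arithmetic as in _cut.
xmax = X[S].max(axis=0)
xmin = X[S].min(axis=0)
l = xmax - xmin                      # every entry is 0.0

# Suppress the RuntimeWarning that 0/0 would otherwise emit.
with np.errstate(invalid='ignore'):
    l /= l.sum()                     # 0.0 / 0.0 in every dimension

all_nan = bool(np.isnan(l).all())
```

Here all_nan is True. A vector of nan values is not a valid probability distribution, which is presumably what trips the exception once l is used to pick a cut dimension.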
I do not think the algorithm is well-defined for the case where all points are exactly identical, because you cannot partition the point set.
https://klabum.github.io/rrcf/tree-construction.html
In this case, you would essentially skip the tree construction algorithm and create a root node that is also a leaf that contains all the points in the set.
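One way to sketch that guard is to check the bounding-box extent before normalizing and signal "no cut possible" to the caller. The helper name cut_proportions and its None return convention are assumptions made for illustration; they are not part of rrcf.

```python
import numpy as np

def cut_proportions(X, S):
    """Return per-dimension cut probabilities, or None if no cut exists.

    Hypothetical helper for illustration only; not part of rrcf.
    """
    xmax = X[S].max(axis=0)
    xmin = X[S].min(axis=0)
    l = xmax - xmin
    total = l.sum()
    if total == 0:
        # All selected points are identical, so the set cannot be
        # partitioned. The caller should make a single leaf instead.
        return None
    return l / total

# Constant stream: no cut is possible.
X_const = np.full((5, 3), 7.0)
p_const = cut_proportions(X_const, np.ones(5, dtype=bool))

# Two distinct points: the extents are 1 and 3.
X_var = np.array([[0.0, 0.0], [1.0, 3.0]])
p_var = cut_proportions(X_var, np.ones(2, dtype=bool))
```

With this convention, p_const is None and the tree builder would create a root node that is also a leaf holding all five points, while p_var is a normal probability vector and construction proceeds as usual.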