stumpy
Allow user to use pre-calculated `mp` in `stumpy.stumpi.py`
It seems that `stumpy.stumpi.py` needs to calculate the matrix profile (`mp`) at the beginning of the algorithm:
https://github.com/TDAmeritrade/stumpy/blob/576de379994896063be71d264e48a08ae17476cf/stumpy/stumpi.py#L118
Then, it is updated quickly for each new individual scalar value `t` (because, I think, it just needs to calculate a new distance profile considering `t` and update the matrix profile).
But what if the initial `T` is big? In that case, the user may want to use `stumped` or `gpu_stump` to get `mp` and then pass it to `stumpy.stumpi.py`. Or maybe they stored it somewhere and decide to use it later.
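To make the update step concrete, here is a minimal pure-Python sketch of the idea, not the actual `stumpi` code. The helper names (`znorm`, `dist`, `naive_profile`, `append_point`) are hypothetical, and the exclusion zone of `ceil(m / 4)` is an assumption. Appending one point only requires one new distance profile, whereas the initial profile compares every pair of subsequences.

```python
import math


def znorm(x):
    """Z-normalize a subsequence (all zeros if it is constant)."""
    mu = sum(x) / len(x)
    sd = math.sqrt(sum((v - mu) ** 2 for v in x) / len(x))
    return [0.0] * len(x) if sd == 0 else [(v - mu) / sd for v in x]


def dist(a, b):
    """Z-normalized Euclidean distance between two subsequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(znorm(a), znorm(b))))


def naive_profile(T, m):
    """Brute-force matrix profile P and indices I (the expensive initial step)."""
    excl = math.ceil(m / 4)  # assumed exclusion zone for trivial matches
    k = len(T) - m + 1
    P, I = [math.inf] * k, [-1] * k
    for i in range(k):
        for j in range(k):
            if abs(i - j) > excl:
                d = dist(T[i:i + m], T[j:j + m])
                if d < P[i]:
                    P[i], I[i] = d, j
    return P, I


def append_point(T, m, P, I, t):
    """Append scalar `t` to T and update P and I in place with one distance profile."""
    excl = math.ceil(m / 4)
    T.append(t)
    k = len(T) - m  # start index of the newly created subsequence
    P.append(math.inf)
    I.append(-1)
    new = T[k:k + m]
    for i in range(k - excl):  # only non-trivial neighbors, i.e. k - i > excl
        d = dist(T[i:i + m], new)
        if d < P[i]:  # the new subsequence may improve an old entry
            P[i], I[i] = d, k
        if d < P[k]:  # and old subsequences define the new entry
            P[k], I[k] = d, i
```

In this sketch, seeding `P, I = naive_profile(T, m)` and then calling `append_point` for each new value gives the same result as recomputing `naive_profile` on the extended series, which is why only the initial computation is costly for a long `T`.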
So, I was wondering if it would be reasonable to do this:
def __init__(..., mp=None):
    """
    mp : numpy.ndarray, default None
        The matrix profile of `T`. If the user hasn't calculated it yet, set it to None.
        Otherwise, it is the user's responsibility to make sure the input `mp` is correct.
    """
    if mp is None:
        mp = stump(self._T, self._m)  # otherwise, the user can now calculate `mp` using, e.g., `gpu_stump`
Sure, something like that is fine, but I think we should worry about it later. Most users rarely care about long time series when they use `stumpi`; instead, they only care about a small streaming window.
In this discussion, I believe that there is a legitimate use case for allowing the user to provide their own matrix profile. One thing we need to keep in mind is that we mean the full matrix profile (global and left) as well as the corresponding indices (global and left).
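As a rough sketch of what accepting a user-supplied profile could involve, the hypothetical helper below (`resolve_initial_mp` is not part of stumpy) either checks the shape of a precomputed `mp` or falls back to a caller-supplied function such as `stumpy.stump` or `stumpy.gpu_stump`. It assumes the four-column layout that `stump` returns (matrix profile, indices, left indices, right indices). As far as I can tell, that layout carries the left indices but not the left matrix profile values, so those would presumably have to be recomputed from the left indices.

```python
def resolve_initial_mp(T, m, mp=None, compute=None):
    """Hypothetical helper: return a usable initial matrix profile for `T`.

    mp      : precomputed profile with one row per subsequence and four
              columns (assumed layout: P, I, left I, right I), or None
    compute : callable taking (T, m), e.g. `stumpy.stump`, used when mp is None
    """
    n_subseq = len(T) - m + 1
    if n_subseq < 1:
        raise ValueError("window size `m` is longer than `T`")
    if mp is None:
        if compute is None:
            raise ValueError("provide either `mp` or a `compute` function")
        mp = compute(T, m)
    # Only the shape is checked; whether the values are correct for `T`
    # remains the user's responsibility.
    if len(mp) != n_subseq:
        raise ValueError(f"expected {n_subseq} rows in `mp`, got {len(mp)}")
    for row in mp:
        if len(row) != 4:
            raise ValueError("each row of `mp` should have 4 columns")
    return mp
```

The shape check is cheap and catches the most likely mistake, which is passing a profile that was computed for a different `T` or `m`.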