stumpy
Ensure 2D Array Matrix Profile Outputs
As we move toward supporting top-k matrix profiles, we need to keep our outputs consistent, which means they should be 2D instead of 1D.
This is related to #592 and #639.
@NimaSarajpoor If you comment on this then I can assign it to you
@seanlaw
@NimaSarajpoor If you comment on this then I can assign it to you
Sure. I am going to work on this after we develop top-k matrix profile feature for both normalized and non-normalized methods.
@NimaSarajpoor A part of me is starting to doubt this choice of forcing everything to be 2D. I'm just trying to think through the majority of use cases, and I think we should target the most "common" scenario. I'm guessing that 99% of the time users will only care about k=1 (and therefore they only care about 1D output). What do you think?
And then in the (less than) 1% case, some users will choose k > 1 but they would/should expect the output to be 2D in those cases.
I'm guessing that 99% of the time users will only care about k=1 (and therefore they only care about 1D output). What do you think?
I think 2D output for left/right might be a little bit too much.
But what about P and I? Personally speaking, I think we can go with 1D output for k=1 since, as you said, the majority of users care about k=1. Although the user can simply call .reshape(-1, ) for k=1 when the output is 2D, I think 1D is still better because that is what a user probably expects to see in the output.
Your vision is definitely better than mine :) so, please ignore what I said if it does not make sense to you 😄
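For illustration, the k=1 flattening mentioned above might look like the following minimal sketch. The array `P_2d` and its values are made up for the example and do not come from stumpy itself.

```python
import numpy as np

# Hypothetical 2D matrix profile for k=1: shape (n, 1), values are made up
P_2d = np.array([[0.5], [1.2], [0.9]])

# Flatten to the 1D shape that most users would expect for k=1
P_1d = P_2d.reshape(-1)

print(P_2d.shape)  # (3, 1)
print(P_1d.shape)  # (3,)
```

The values are unchanged; only the trailing length-1 axis is dropped.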
So, behind the scenes (i.e., with private functions), I think it is fine to just keep everything as 2D. However, when we are able to return P and I separately to the user (e.g., stream.P_), then maybe we should make a check to see if P.shape[1] == 1 and k == 1: and then return a 1D array. Otherwise, return 2D. Something like that?
Certainly, for stumpy.stump, where P and I are squashed into a single 2D array, it doesn't matter and we are still good. It's really only in the rare cases (where we use a class) that it is tricky.
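A rough sketch of that check, assuming a hypothetical class `ToyProfile` (not a real stumpy class) that stores the profile as 2D internally and exposes it through a `P_` property:

```python
import numpy as np


class ToyProfile:
    """Hypothetical container for illustration only, not part of stumpy."""

    def __init__(self, P, k=1):
        # Internally, always keep the profile as a 2D array of shape (n, k)
        self._P = np.asarray(P, dtype=np.float64).reshape(-1, k)
        self._k = k

    @property
    def P_(self):
        # Return 1D only in the common k == 1 case; otherwise keep 2D
        if self._P.shape[1] == 1 and self._k == 1:
            return self._P.reshape(-1)
        return self._P


print(ToyProfile([0.5, 1.2, 0.9]).P_.shape)    # (3,)
print(ToyProfile(np.arange(6), k=2).P_.shape)  # (3, 2)
```

With this shape, private code only ever deals with 2D arrays, and the 1D special case lives in a single place at the public boundary.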
then maybe we should make a check to see if P.shape[1] == 1 and k == 1: and then return a 1D array. Otherwise, return 2D. Something like that?
Yeah... that would be a good idea. Most users care about the public API, and it would be better(?) to see 1D for k=1, as this is what a user usually expects in such a case.
Certainly, for stumpy.stump, where P and I are squashed into a single 2D array, it doesn't matter and we are still good. It's really only in the rare cases (where we use a class) that it is tricky.
Correct... that is the tricky part :)
Let's continue thinking about it. This is a good exercise in planning out the design and how our decisions may ultimately affect others. My goal is to minimize the pain/problems for the majority of people.
@NimaSarajpoor Is this technically completed? Can it be closed?
@seanlaw I believe so. We have decided to go with always-2D for only private functions. So, I think it should be okay to close this :)
Awesome! Thanks for the confirmation.