stumpy
Add STIMP (Pan Matrix Profile) Tutorial
An initial tutorial has already been created here
We need to add a final example where there are two different window sizes within the same dataset. The data can be found here.
@mexxexx Please feel free to provide any feedback here
Which example in the paper? Introduction fig1 or case study fig13?
@ken-maeda The first example in the tutorial reproduces Figure 1 in the paper.
Thanks. Does the tutorial snippet have to be the same as the paper's example snippet, given such a huge original dataset? The dataset consists of 21 CSV files, and each CSV file has the following columns:
Index(['Time', 'Unix', 'Aggregate', 'Appliance1', 'Appliance2', 'Appliance3',
'Appliance4', 'Appliance5', 'Appliance6', 'Appliance7', 'Appliance8',
'Appliance9'],
dtype='object')
Appliance1 through Appliance9 are all candidates.
The shape of the first CSV file is (6960008, 12).
The paper's snippet has 250,000 points, so one CSV file holds about 27.84 times as much data as the paper's snippet.
It seems no timestamps are noted. Should I just look for a region with a similar data shape?
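For reference, here is a minimal sketch of pulling one snippet-sized slice of a single column out of such a large CSV without loading the whole file. It uses only the standard library. The helper name `read_snippet`, the file path, and the start offset are hypothetical; only the column names and sizes come from the numbers above.

```python
import csv
from itertools import islice


def read_snippet(path, column, start, length):
    """Return `length` values of `column`, starting at data row `start`.

    Rows are streamed, so the full file is never held in memory.
    """
    with open(path, newline="") as f:
        reader = csv.DictReader(f)  # uses the header row for column names
        return [float(row[column]) for row in islice(reader, start, start + length)]


rows_in_file = 6_960_008  # shape reported above for the first CSV file
snippet_len = 250_000     # snippet length used in the paper

# How many snippet-sized slices fit in one file
print(round(rows_in_file / snippet_len, 2))  # 27.84
```

A call would look like `read_snippet("some_house.csv", "Appliance1", start=0, length=snippet_len)`, where the file name and `start` are placeholders, since the location of the paper's snippet is not known here.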
・Reproduction of the paper
It depends on the location of the snippet and on the max (min) window size passed to stimp.
・Calculation time
I am not sure what an acceptable calculation time is. For example, it took about 7 minutes with:
Parameters: window size (100-2000), percentage (0.01), dataset length (250K)
CPU: i9-12900KF
This also depends on the parameters and the dataset length.
I think the following factors have to be roughly decided:
- snippet similarity
- snippet size
- window size (how different do the two window sizes need to be?)
- (percentage)
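These factors map fairly directly onto the arguments of `stumpy.stimp`. The sketch below is only an illustration under stated assumptions: the helpers `window_sizes`, `run_stimp`, and `demo` are hypothetical names, `max_m` is assumed to be inclusive, and the demo uses a synthetic random walk with far smaller sizes than the 250K / 100-2000 run above so that it finishes quickly.

```python
def window_sizes(min_m, max_m, step=1):
    """Candidate subsequence lengths in the range (max_m assumed inclusive)."""
    return list(range(min_m, max_m + 1, step))


def run_stimp(T, min_m, max_m, percentage=0.01, n_updates=None):
    """Build a pan matrix profile and process `n_updates` window sizes."""
    import stumpy  # imported lazily; requires `pip install stumpy`

    # percentage < 1.0 requests an approximate matrix profile per window size
    pan = stumpy.stimp(T, min_m=min_m, max_m=max_m, percentage=percentage)
    if n_updates is None:
        n_updates = len(window_sizes(min_m, max_m))
    for _ in range(n_updates):
        pan.update()  # each call processes one more window size
    return pan


def demo():
    """Small synthetic run; not the electrical load data from the paper."""
    import numpy as np

    rng = np.random.default_rng(0)
    T = rng.standard_normal(2_000).cumsum()  # random walk stand-in
    pan = run_stimp(T, min_m=50, max_m=100, n_updates=10)
    print(pan.M_)          # window sizes processed so far
    print(pan.PAN_.shape)  # pan matrix profile array


# The 100-2000 range from the timing above spans this many window sizes:
print(len(window_sizes(100, 2000)))  # 1901
```

Calling `demo()` runs the small example. Because each `update()` handles one window size, the number of window sizes and the series length are the main drivers of run time, which is why the range and the snippet size need to be settled before timings can be compared.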
@ken-maeda This issue is for tracking the completion of the pan matrix profile tutorial (it is still incomplete). For questions on how to use stimp, can you please post your questions to our GitHub Discussions?
I intended to ask about the dataset for this tutorial, since the data you posted is too big. I was also trying to complete this tutorial.
Which example in the paper? Introduction fig1 or case study fig13?
@ken-maeda I think there is a misunderstanding. When I said:
The first example in the tutorial reproduces Figure 1 in the paper.
I was referring to the fact that the first example in this tutorial reproduces Figure 1 in this paper.
In case you are trying to contribute a PR for this tutorial, I think most of the coding work is already completed. The only thing that remains is to add a proper narrative to the tutorial.
Oh, I see. I completely misunderstood. I thought a new case using the electrical load data was required. Sorry for the trouble.
No problem @ken-maeda! You may be interested in contributing to #85 instead