
Add Tutorial(s) that Reproduce "100 Time Series Data Mining Questions" PDF

Open seanlaw opened this issue 5 years ago • 3 comments

On the UCR Matrix Profile site, they have a growing list of time series questions that can be solved by computing the matrix profile. The PDF can be found here and the corresponding code/data is here.

It would be interesting to begin compiling STUMPY examples that reproduce the solutions to those questions below (including data sources).

Additionally, there is another paper, titled "Ten Useful Things you can do with the Matrix Profile and Ten Lines of Code", that might be worth reproducing.

seanlaw avatar Jan 02 '20 15:01 seanlaw

1. Have we ever seen a pattern that looks just like this?

The AIBO Robot Dog Data can be found here

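import io
import ssl
import urllib.request  # `import urllib` alone does not guarantee that urllib.request is loaded

import numpy as np
import pandas as pd
import stumpy

# Skip SSL certificate verification for simplicity (fine for a quick demo, not for production)
context = ssl.create_default_context()
context.check_hostname = False
context.verify_mode = ssl.CERT_NONE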

T_url = 'https://www.cs.unm.edu/~mueen/robot_dog.txt'
T_raw_bytes = urllib.request.urlopen(T_url, context=context).read()
T_data = io.BytesIO(T_raw_bytes)

Q_url = 'https://www.cs.unm.edu/~mueen/carpet_query.txt'
Q_raw_bytes = urllib.request.urlopen(Q_url, context=context).read()
Q_data = io.BytesIO(Q_raw_bytes)

# Raw string avoids the invalid-escape warning for '\s'
T_df = pd.read_csv(T_data, header=None, sep=r'\s+', names=['walking'])
Q_df = pd.read_csv(Q_data, header=None, sep=r'\s+', names=['walking'])

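# Distance between the query and every subsequence of T (z-normalized Euclidean by default)
distance_profile = stumpy.core.mass(Q_df['walking'], T_df['walking'])

# Indices of the k closest matches, ordered from smallest to largest distance
k = 16
idx = np.argpartition(distance_profile, k)[:k]
topK_idx = idx[np.argsort(distance_profile[idx])]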
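
For anyone who wants to sanity-check the result without downloading the data, here is a minimal pure-Python sketch of what the distance profile represents. It is a naive O(n*m) loop, not the FFT-based MASS algorithm that STUMPY uses, and the names `znorm`, `naive_distance_profile`, and `top_k` plus the toy data are made up for illustration. Treating a constant window as all zeros is also an assumption of this sketch and may differ from how STUMPY handles that case.

```python
import math


def znorm(x):
    """Z-normalize a sequence (population std); constant input maps to zeros (assumption)."""
    mean = sum(x) / len(x)
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))
    if std == 0:
        return [0.0] * len(x)
    return [(v - mean) / std for v in x]


def naive_distance_profile(Q, T):
    """Z-normalized Euclidean distance between Q and every length-len(Q) window of T."""
    m = len(Q)
    q = znorm(Q)
    profile = []
    for i in range(len(T) - m + 1):
        w = znorm(T[i:i + m])
        profile.append(math.sqrt(sum((a - b) ** 2 for a, b in zip(q, w))))
    return profile


def top_k(profile, k):
    """Indices of the k smallest distances, ordered from best to worst match."""
    return sorted(range(len(profile)), key=profile.__getitem__)[:k]


# Toy data: Q is a scaled and shifted copy of T[3:6], so z-normalization should match it
T = [0.0, 2.0, 1.0, 4.0, 9.0, 3.0, 5.0, 7.0, 6.0, 1.0]
Q = [9.0, 19.0, 7.0]

profile = naive_distance_profile(Q, T)
best = top_k(profile, 2)
print(best[0])
```

Because the comparison is z-normalized, the query matches the window starting at index 3 even though its values are on a different scale, which is the behaviour the robot dog query relies on.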

seanlaw avatar Jan 02 '20 16:01 seanlaw

@seanlaw we spoke about getting these up to the point of data loaded and ready to be worked on.

You mentioned Zenodo in another issue? That seems like a good way to do this?

Could preprocess the data into clean pandas-friendly CSVs, upload them, and then just pass the URL to read_csv?

Guess it would be good to establish some structure upfront?
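
As a rough idea of what the preprocessing step could look like, here is a small stdlib-only sketch that turns whitespace-separated raw values into a single-column CSV with a header row. The helper name `to_clean_csv`, the sample input, and the assumption that each raw file holds exactly one column of numbers are all hypothetical; the real files may need more handling.

```python
import csv
import io


def to_clean_csv(raw_text, column_name):
    """Convert whitespace-separated numbers into a one-column CSV with a header row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([column_name])
    for token in raw_text.split():
        # float() doubles as validation: a non-numeric token raises ValueError
        writer.writerow([float(token)])
    return out.getvalue()


# Made-up sample mimicking a raw file with stray whitespace
raw = "  0.5\n 1.25  \n-3.0\n"
clean = to_clean_csv(raw, "walking")
print(clean)
```

A file written this way has a header and a plain comma-separated layout, so once uploaded it should load with a bare `pd.read_csv(url)` call and no `sep` or `names` arguments.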

MokaPot avatar Nov 12 '20 09:11 MokaPot

For now, please let's continue the discussion around this issue here: https://github.com/seanlaw/awesome-stumpy/issues/1

NimaSarajpoor avatar Sep 16 '22 17:09 NimaSarajpoor