Feature-by-case request: Extracting cycles & trend from time series data

Open sa18 opened this issue 3 years ago • 2 comments

We have a numeric time series, for example temperature readings.

We need to find out which cycles the series contains (their amplitudes, periods and so on). One possible method is Empirical Mode Decomposition (EMD), a.k.a. the Huang transform. It has good practical feedback, although its theoretical basis is not very clear, and it has some advantages over classical methods such as Fourier decomposition.

http://perso.ens-lyon.fr/patrick.flandrin/emd.html
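To make the request concrete, here is a minimal sketch of a single EMD sifting step in plain Kotlin. It is only an illustration, not kmath API: the function names are hypothetical, the series is a bare DoubleArray, and the envelopes are piecewise-linear rather than the cubic splines EMD normally uses. The first and last samples are treated as envelope knots, which is a crude boundary choice.

```kotlin
// Hypothetical helper names; plain Kotlin stdlib only, no kmath calls.

/** Interior indices of strict local maxima (sign = +1) or minima (sign = -1). */
fun extremaIndices(x: DoubleArray, sign: Int): List<Int> =
    (1 until x.size - 1).filter {
        sign * x[it] > sign * x[it - 1] && sign * x[it] > sign * x[it + 1]
    }

/**
 * Piecewise-linear envelope through the given knots.
 * Assumption: the first and last samples are added as knots.
 * Real EMD implementations use cubic splines here.
 */
fun linearEnvelope(x: DoubleArray, knots: List<Int>): DoubleArray {
    require(x.size >= 2) { "Need at least two samples" }
    val idx = listOf(0) + knots + listOf(x.size - 1)
    val env = DoubleArray(x.size)
    for (k in 0 until idx.size - 1) {
        val i0 = idx[k]
        val i1 = idx[k + 1]
        for (i in i0..i1) {
            val t = (i - i0).toDouble() / (i1 - i0)
            env[i] = x[i0] + t * (x[i1] - x[i0])
        }
    }
    return env
}

/** One sifting step: subtract the mean of the upper and lower envelopes. */
fun siftOnce(x: DoubleArray): DoubleArray {
    val upper = linearEnvelope(x, extremaIndices(x, +1))
    val lower = linearEnvelope(x, extremaIndices(x, -1))
    return DoubleArray(x.size) { x[it] - 0.5 * (upper[it] + lower[it]) }
}
```

A full decomposition would repeat siftOnce until a stopping criterion is met, take the result as one mode, subtract it from the series, and continue on the remainder until it has no more oscillations. That loop is left out here.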

Expectations from the math library:

  1. An efficient representation for numeric series data
  2. Convenient operations on time series, such as a + b to add two series, s.diff() to compute per-element changes, and a way to locate extreme points
  3. Cubic splines
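For illustration, the operations in item 2 could look roughly like this. Series is a hypothetical wrapper written in plain Kotlin, not an existing kmath type, and the method names are assumptions chosen to match the wording above.

```kotlin
/** Hypothetical time-series wrapper over a DoubleArray; not a kmath type. */
class Series(private val values: DoubleArray) {
    val size: Int get() = values.size

    operator fun get(i: Int): Double = values[i]

    /** Element-wise sum; both series must have the same length. */
    operator fun plus(other: Series): Series {
        require(size == other.size) { "Series lengths differ: $size vs ${other.size}" }
        return Series(DoubleArray(size) { values[it] + other.values[it] })
    }

    /** Per-element changes: result[i] = this[i + 1] - this[i]. */
    fun diff(): Series =
        Series(DoubleArray(maxOf(size - 1, 0)) { values[it + 1] - values[it] })

    /** Indices of strict local maxima, excluding the endpoints. */
    fun localMaxima(): List<Int> =
        (1 until size - 1).filter { values[it] > values[it - 1] && values[it] > values[it + 1] }

    /** Indices of strict local minima, excluding the endpoints. */
    fun localMinima(): List<Int> =
        (1 until size - 1).filter { values[it] < values[it - 1] && values[it] < values[it + 1] }

    fun toList(): List<Double> = values.toList()
}
```

With a = Series(doubleArrayOf(1.0, 3.0, 2.0, 5.0, 4.0)), a.diff() holds the four differences between neighbours, and a.localMaxima() returns the indices of the two interior peaks.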

sa18, Nov 02 '21 16:11

Thanks a lot for the use case!

We already have all of these tools covered:

  1. Buffers
  2. BufferAlgebra
  3. Interpolation API (it already works on buffers).

We can add a scope for time series analysis that would include additional tools to work directly on Buffers so it would be easy to integrate everything with existing libraries.

I do have some questions, though. For example, do we need missing-value support for time series? We have an efficient buffer implementation that supports missing values here, but it will require some additional work.

altavir, Nov 02 '21 16:11

do we need missing value functionality for time series?

There are no missing values in this case.

In ML, structures like LabeledDoubleBuffer might be helpful, where Label is a generic type parameter, for example java.awt.Color as in #427. Alternatively, to avoid complicated generics, the label can always be an Int, so we can use LabeledDoubleBuffer instead of Buffer<Pair<Double, Int>>. This is not a top priority, though.
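A rough sketch of the Int-labeled variant, to show what I mean. The class and its methods are hypothetical and written in plain Kotlin; nothing here exists in kmath. Keeping values and labels in two primitive arrays avoids allocating a Pair with boxed elements for every entry.

```kotlin
/** Hypothetical buffer holding one Double value and one Int label per element. */
class LabeledDoubleBuffer(
    private val values: DoubleArray,
    private val labels: IntArray,
) {
    init {
        require(values.size == labels.size) {
            "values and labels must have the same length"
        }
    }

    val size: Int get() = values.size

    fun value(i: Int): Double = values[i]

    fun label(i: Int): Int = labels[i]

    /** Values that carry the given label, in their original order. */
    fun valuesWithLabel(label: Int): DoubleArray =
        values.filterIndexed { i, _ -> labels[i] == label }.toDoubleArray()
}
```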

sa18, Nov 02 '21 17:11