Time-Series Cross-Validation Module

This Python package implements time-series cross-validation techniques.

Given a training dataset, the package splits it into Train, Validation and Test sets using Forward Chaining, K-Fold or Group K-Fold.

The user can set the number of input samples (n_steps_input) and output samples (n_steps_forecast), as well as the number of samples to jump in the data between consecutive training samples (n_steps_jump).
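
To make the three parameters concrete, here is a minimal NumPy sketch of a sliding-window split. The helper `sliding_windows` is hypothetical and written only for illustration; it is not the tsxv implementation, and the package's own functions may differ in details such as how the end of the series is handled.

```python
import numpy as np

def sliding_windows(series, n_steps_input, n_steps_forecast, n_steps_jump):
    """Hypothetical helper (not part of tsxv): cut a 1-D series into
    (input, forecast) pairs, moving the window start by n_steps_jump."""
    series = np.asarray(series)
    window = n_steps_input + n_steps_forecast
    X, y = [], []
    # Only keep windows that fit entirely inside the series.
    for start in range(0, len(series) - window + 1, n_steps_jump):
        split = start + n_steps_input
        X.append(series[start:split])           # inputs
        y.append(series[split:start + window])  # values to forecast
    return np.array(X), np.array(y)

X, y = sliding_windows(np.arange(27), n_steps_input=4, n_steps_forecast=3, n_steps_jump=2)
print(X.shape, y.shape)  # (11, 4) (11, 3)
print(X[0], y[0])        # [0 1 2 3] [4 5 6]
print(X[1], y[1])        # [2 3 4 5] [6 7 8]
```

A larger n_steps_jump yields fewer samples with less overlap between them; a value of 1 uses every possible window.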

The recommended way to install the package is with pip install timeseries-cv. Once installed, it is imported as tsxv (import tsxv).

  1. Features
    • Split Train
    • Split Train Val
    • Split Train Val Test
  2. Citation

Features

The Jupyter notebook "example.ipynb" shows this more intuitively. Below is an example of each function, applied to the following time series:

timeSeries = array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26])
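
The line above shows the array as it prints. One way to build the same series with NumPy (an assumption about how it was created; any 1-D array of these 27 values should behave the same) is:

```python
import numpy as np

# 27 consecutive integers, 0 through 26, matching the example series.
timeSeries = np.arange(27)
print(timeSeries.shape)  # (27,)
```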

Split Train

split_train

from tsxv.splitTrain import split_train
X, y = split_train(timeSeries, n_steps_input=4, n_steps_forecast=3, n_steps_jump=2)

split_train_variableInput

from tsxv.splitTrain import split_train_variableInput
X, y = split_train_variableInput(timeSeries, minSamplesTrain=10, n_steps_forecast=3, n_steps_jump=3)
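
The function name and the minSamplesTrain parameter suggest that the input grows from one sample to the next instead of having a fixed length. The sketch below illustrates that reading with a hypothetical helper, `expanding_windows`; it is an assumption about the behaviour, not the tsxv code, so check the package's output before relying on it.

```python
import numpy as np

def expanding_windows(series, min_samples_train, n_steps_forecast, n_steps_jump):
    """Hypothetical helper (not part of tsxv): every input starts at the
    beginning of the series and grows by n_steps_jump per sample."""
    series = np.asarray(series)
    X, y = [], []
    end = min_samples_train
    while end + n_steps_forecast <= len(series):
        X.append(series[:end])                        # all data seen so far
        y.append(series[end:end + n_steps_forecast])  # next values to forecast
        end += n_steps_jump
    # Inputs have different lengths, so they are returned as lists.
    return X, y

X, y = expanding_windows(np.arange(27), min_samples_train=10, n_steps_forecast=3, n_steps_jump=3)
print([len(x) for x in X])  # [10, 13, 16, 19, 22]
print(y[0], y[-1])          # [10 11 12] [22 23 24]
```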

Split Train Val

split_train_val_forwardChaining

from tsxv.splitTrainVal import split_train_val_forwardChaining
X, y, Xcv, ycv = split_train_val_forwardChaining(timeSeries, n_steps_input=4, n_steps_forecast=3, n_steps_jump=2)
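
As a rough illustration of forward chaining in general, the sketch below builds folds in which the training set always precedes the validation set in time and grows with each fold. The helper `forward_chaining_folds` and its `n_folds` argument are hypothetical; how tsxv sizes and counts its folds is not shown here.

```python
import numpy as np

def forward_chaining_folds(n_samples, n_folds):
    """Hypothetical helper (not part of tsxv): return (train_idx, val_idx)
    pairs where each fold trains on every block before the validation block."""
    blocks = np.array_split(np.arange(n_samples), n_folds + 1)
    folds = []
    for k in range(1, n_folds + 1):
        train_idx = np.concatenate(blocks[:k])  # all earlier blocks
        val_idx = blocks[k]                     # the block that follows them
        folds.append((train_idx, val_idx))
    return folds

folds = forward_chaining_folds(n_samples=12, n_folds=3)
for train_idx, val_idx in folds:
    print(train_idx, val_idx)
# [0 1 2] [3 4 5]
# [0 1 2 3 4 5] [6 7 8]
# [0 1 2 3 4 5 6 7 8] [ 9 10 11]
```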

split_train_val_kFold

from tsxv.splitTrainVal import split_train_val_kFold
X, y, Xcv, ycv = split_train_val_kFold(timeSeries, n_steps_input=4, n_steps_forecast=3, n_steps_jump=2)
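
For comparison with Forward Chaining, a generic K-Fold scheme lets each block serve as the validation set once, while all remaining blocks, including later ones, are used for training. The sketch below uses a hypothetical helper, `blocked_kfold`, with contiguous blocks; it shows the general idea only and is not taken from tsxv.

```python
import numpy as np

def blocked_kfold(n_samples, n_folds):
    """Hypothetical helper (not part of tsxv): contiguous K-Fold,
    one (train_idx, val_idx) pair per block. Requires n_folds >= 2."""
    blocks = np.array_split(np.arange(n_samples), n_folds)
    folds = []
    for k, val_idx in enumerate(blocks):
        # Train on every block except the one held out for validation.
        train_idx = np.concatenate(blocks[:k] + blocks[k + 1:])
        folds.append((train_idx, val_idx))
    return folds

folds = blocked_kfold(n_samples=12, n_folds=3)
for train_idx, val_idx in folds:
    print(train_idx, val_idx)
# [ 4  5  6  7  8  9 10 11] [0 1 2 3]
# [ 0  1  2  3  8  9 10 11] [4 5 6 7]
# [0 1 2 3 4 5 6 7] [ 8  9 10 11]
```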

split_train_val_groupKFold

from tsxv.splitTrainVal import split_train_val_groupKFold
X, y, Xcv, ycv = split_train_val_groupKFold(timeSeries, n_steps_input=4, n_steps_forecast=3, n_steps_jump=2)

Split Train Val Test

split_train_val_test_forwardChaining

from tsxv.splitTrainValTest import split_train_val_test_forwardChaining
X, y, Xcv, ycv, Xtest, ytest = split_train_val_test_forwardChaining(timeSeries, n_steps_input=4, n_steps_forecast=3, n_steps_jump=2)

split_train_val_test_kFold

from tsxv.splitTrainValTest import split_train_val_test_kFold
X, y, Xcv, ycv, Xtest, ytest = split_train_val_test_kFold(timeSeries, n_steps_input=4, n_steps_forecast=3, n_steps_jump=2)

split_train_val_test_groupKFold

from tsxv.splitTrainValTest import split_train_val_test_groupKFold
X, y, Xcv, ycv, Xtest, ytest = split_train_val_test_groupKFold(timeSeries, n_steps_input=4, n_steps_forecast=3, n_steps_jump=2)

Citation

This module was developed in co-authorship with Filipe Roberto Ramos (https://ciencia.iscte-iul.pt/authors/filipe-roberto-de-jesus-ramos/cv) for his PhD thesis, entitled "Data Science in the Modeling and Forecasting of Financial Timeseries: from Classic Methodologies to Deep Learning", submitted in 2021 to Instituto Universitário de Lisboa - ISCTE Business School, Lisboa, Portugal.

APA

Ramos, F. (2021). Data Science na Modelação e Previsão de Séries Económico-financeiras: das Metodologias Clássicas ao Deep Learning. (PhD Thesis submitted, Instituto Universitário de Lisboa - ISCTE Business School, Lisboa, Portugal).

@phdthesis{FRRamos2021,
      AUTHOR = {Filipe R. Ramos},
      TITLE = {Data Science na Modelação e Previsão de Séries Económico-financeiras: das Metodologias Clássicas ao Deep Learning},
      SCHOOL = {Instituto Universitário de Lisboa - ISCTE Business School},
      ADDRESS = {Lisboa, Portugal},
      NOTE = {PhD thesis submitted},
      YEAR = {2021}
}