pyFTS
how to use hyperparams
I am struggling to find guidance on how to use the hyperparam module, such as grid search or evolutionary. Can anyone share?
thank you
Hi @ramdhan1989
Thanks for your interest in our tool, and forgive me for the long delay.
First of all, before hyperparameter optimization (hereafter called hyperopt), you should perform time series analysis (such as ACF/PACF plots, tests of stationarity and heteroscedasticity, etc.). Hyperopt does not remove the need to understand how your time-series data behaves.
The hyperparameter optimization of FTS is described here, and is called DEHO - Distributed Evolutionary Hyperparameter Optimization, but there are other methods than evolutionary in the library. The method returns a dictionary with the best parameters found for forecasting the dataset using the selected FTS method (given in the fts_method parameter).
Below is a list of the implemented methods:
- Grid Search (GS) is very accurate but also very computationally expensive.
import numpy as np
from pyFTS.hyperparam import GridSearch
from pyFTS.models import hofts
from pyFTS.data import TAIEX
datasetname = 'TAIEX'
dataset = TAIEX.get_data()
#The list of hyperparameters search spaces
hyperparams = {
'order': [1, 2, 3],
'partitions': np.arange(10,100,3),
'partitioner': [1, 2], #1 = Grid partitioner, 2 = Entropy partitioner, ...
'mf': [1, 2, 3], #1 = Triangular, 2 = Trapezoidal, 3 = Gaussian
'lags': np.arange(2, 7, 1), # The lag indexes
'alpha': np.arange(.0, .5, .05) #Alpha Cut
}
GridSearch.execute(
hyperparams, #A dictionary containing the search spaces for each hyperparameter
datasetname, #Just the name of your dataset
dataset, #Your time series data (list or np.ndarray 1d)
fts_method=hofts.WeightedHighOrderFTS, # the FTS method you want to optimize [only univariate methods]
window_size=10000, #The length of the data window for the Sliding Window Cross Validation method
train_rate=.9, #The proportion of the data window that will be used for training, the remaining will be used for test
increment_rate=.3, #The sliding increment of the Sliding Window Cross Validation method
database_file='hyperopt.db' #A sqlite database that will contain the log of the hyperopt process
)
There is no GridSearch implementation yet for multivariate methods.
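The window_size, train_rate and increment_rate parameters above control the Sliding Window Cross Validation scheme. A rough plain-Python sketch (illustrative only, not the library's internals) of how they carve the series into train/test windows:

```python
# Illustrative sketch of Sliding Window Cross Validation slicing
# (NOT pyFTS's implementation).
def sliding_windows(n, window_size, train_rate=.9, increment_rate=.3):
    increment = int(window_size * increment_rate)
    windows = []
    start = 0
    while start + window_size <= n:
        split = start + int(window_size * train_rate)
        # (train start, train/test split point, window end)
        windows.append((start, split, start + window_size))
        start += increment
    return windows
```

Under this reading, with window_size=10000, train_rate=.9 and increment_rate=.3, each window trains on the first 9000 points, tests on the next 1000, and then slides forward 3000 points.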
- Random Search (RS) is computationally cheap but may not converge correctly, so it is not the most accurate method. Currently, RS is implemented only for MVFTS.
import pandas as pd
from pyFTS.hyperparam import mvfts as deho_mv
from pyFTS.models.multivariate import mvfts, wmvfts
from pyFTS.models.seasonal.common import DateTime
from pyFTS.data import Malaysia
datasetname = 'Malaysia'
dataset = Malaysia.get_dataframe()
dataset['time'] = pd.to_datetime(dataset['time'], format='%m/%d/%y %I:%M %p')
explanatory_variables = [
{'name': 'Temperature', 'data_label': 'temperature', 'type': 'common'},
{'name': 'Daily', 'data_label': 'time', 'type': 'seasonal', 'seasonality': DateTime.minute_of_day, 'npart': 24 },
{'name': 'Weekly', 'data_label': 'time', 'type': 'seasonal', 'seasonality': DateTime.day_of_week, 'npart': 7 },
{'name': 'Monthly', 'data_label': 'time', 'type': 'seasonal', 'seasonality': DateTime.day_of_month, 'npart': 4 },
{'name': 'Yearly', 'data_label': 'time', 'type': 'seasonal', 'seasonality': DateTime.day_of_year, 'npart': 12 }
]
target_variable = {'name': 'Load', 'data_label': 'load', 'type': 'common'}
deho_mv.random_search(
datasetname, #Just the name of your dataset
dataset, #Your time series data (pd.DataFrame)
npop=200, #Size of population of the RS
mgen=70, #Number of iterations of the RS
fts_method=wmvfts.WeightedMVFTS, #The multivariate FTS method to optimize
variables=explanatory_variables, #The list of exogenous/explanatory variables
target_variable=target_variable, #The endogenous/target variable
window_size=10000, #The length of the data window for the Sliding Window Cross Validation method
train_rate=.9, #The proportion of the data window that will be used for training, the remaining will be used for test
increment_rate=.3, #The sliding increment of the Sliding Window Cross Validation method
)
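The 'seasonal' variable specs above index each timestamp by a seasonal cycle; for instance, DateTime.minute_of_day maps a timestamp to its minute within the day, which npart=24 then partitions into 24 fuzzy sets (roughly one per hour). A plain-Python illustration of that index (not pyFTS's implementation):

```python
from datetime import datetime

# Illustration of the minute_of_day seasonal index used above
# (NOT pyFTS's implementation).
def minute_of_day(dt):
    return dt.hour * 60 + dt.minute
```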
- Genetic Algorithm (GA) sits between GS and RS, in both accuracy and computational cost.
from pyFTS.hyperparam import Evolutionary
from pyFTS.models import hofts
from pyFTS.data import TAIEX
datasetname = 'TAIEX'
dataset = TAIEX.get_data()
ret = Evolutionary.execute(
datasetname, #Just the name of your dataset
dataset, #Your time series data (list or np.ndarray 1d)
fts_method=hofts.WeightedHighOrderFTS, # the FTS method you want to optimize [only univariate methods]
ngen=30, #Number of generations, the number of iterations of the GA
npop=20, #The size of population of the GA
psel=0.6, #Probability of selection of the GA
pcross=.5, #Probability of crossover of the GA
pmut=.3, #Probability of mutation of the GA
window_size=10000, #The length of the data window for the Sliding Window Cross Validation method
train_rate=.9, #The proportion of the data window that will be used for training, the remaining will be used for test
increment_rate=.3, #The sliding increment of the Sliding Window Cross Validation method
experiments=1, #Number of hyperopt experiments to perform
database_file='hyperopt.db' #A sqlite database that will contain the log of the hyperopt process
)
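The GA-specific parameters above (npop, ngen, psel, pcross, pmut) can be pictured with a toy generation step. This is an illustrative sketch only, not DEHO's implementation:

```python
import random

# Toy sketch of one GA generation (illustrative only, NOT DEHO's
# implementation), showing the roles of psel, pcross and pmut.
def ga_generation(population, fitness, psel=.6, pcross=.5, pmut=.3, rng=None):
    rng = rng or random.Random(0)
    npop = len(population)
    # Selection: keep the best psel fraction (lower fitness is better)
    survivors = sorted(population, key=fitness)[:max(2, int(npop * psel))]
    children = list(survivors)
    while len(children) < npop:
        a, b = rng.sample(survivors, 2)
        if rng.random() < pcross:
            # Crossover: mix genes from the two parents
            child = [x if rng.random() < .5 else y for x, y in zip(a, b)]
        else:
            child = list(a)
        if rng.random() < pmut:
            # Mutation: perturb one gene
            i = rng.randrange(len(child))
            child[i] += rng.choice([-1, 1])
        children.append(child)
    return children
```

Running this for ngen iterations mirrors the GA loop: the best candidates survive, recombine, and mutate until the budget is exhausted or no improvement is seen.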
Please, do not hesitate to get in touch if you have any questions.
Best regards
Thanks, all three methods work!
After executing the hyperparameter optimization, is the model automatically fitted using the best params, or do we need to take the values from the output dict and fit the model ourselves?
Would you mind elaborating more on the dict? I am confused about which values belong to which parameter. From your code using GA:
Experiment 0
Evaluating initial population 1600098526.9596627
GENERATION 0 1600098526.9596627
WITHOUT IMPROVEMENT 1
GENERATION 1 1600098526.9606583
WITHOUT IMPROVEMENT 2
GENERATION 2 1600098526.9626496
WITHOUT IMPROVEMENT 3
GENERATION 3 1600098526.963645
WITHOUT IMPROVEMENT 4
GENERATION 4 1600098526.9656367
WITHOUT IMPROVEMENT 5
GENERATION 5 1600098526.9666321
WITHOUT IMPROVEMENT 6
GENERATION 6 1600098526.9686234
WITHOUT IMPROVEMENT 7
('TAIEX', 'Evolutive', 'hofts', None, 1, 3, 2, 40, 0.5, '[2, 6, 7]', 'rmse', inf)
('TAIEX', 'Evolutive', 'hofts', None, 1, 3, 2, 40, 0.5, '[2, 6, 7]', 'size', inf)
('TAIEX', 'Evolutive', 'hofts', None, 1, 3, 2, 40, 0.5, '[2, 6, 7]', 'time', 0.010952949523925781)
Below is the returned dict:
{'alpha': 0.5, 'f1': inf, 'f2': inf, 'lags': [2, 6, 7], 'mf': 1, 'npart': 40, 'order': 3, 'partitioner': 2, 'rmse': inf, 'size': inf, 'time': 0.010952949523925781}
Hi @ramdhan1989
Using this dictionary, you can build a model with this code:
from pyFTS.hyperparam import Evolutionary
model = Evolutionary.phenotype(
dictionary, #the result of the hyperparameter method
train, #The train dataset
fts_method #the FTS method
)
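For reference, the integer codes in the result dict correspond to the options listed in the search-space comments of the GridSearch example (e.g. partitioner 2 = Entropy, mf 1 = Triangular). A small helper (illustrative, not part of pyFTS) to make the dict human-readable:

```python
# Illustrative decoder for the hyperopt result dict (NOT part of pyFTS).
# The code-to-name mappings follow the search-space comments above;
# verify them against the library's documentation.
PARTITIONERS = {1: 'Grid', 2: 'Entropy'}
MEMBERSHIP_FUNCS = {1: 'Triangular', 2: 'Trapezoidal', 3: 'Gaussian'}

def decode(result):
    readable = dict(result)
    readable['partitioner'] = PARTITIONERS.get(result['partitioner'], result['partitioner'])
    readable['mf'] = MEMBERSHIP_FUNCS.get(result['mf'], result['mf'])
    return readable

best = {'alpha': 0.5, 'lags': [2, 6, 7], 'mf': 1, 'npart': 40,
        'order': 3, 'partitioner': 2}
```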
Best regards
Well, thanks a lot @petroniocandido. Does the hyperparameter optimization search for the best data transformation as well? For example, how many lags for differencing, or which kind of transformation is best for the problem?
thank you
Hi @petroniocandido, how can I get stable predictions using GA? Every time I run it, it produces different values. Do you have any suggestions?
Hi @petroniocandido, I came back to try using this package. Just want to clarify several things:
- How do I use the Differential transformation in hyperparam optimization?
- Using evolutionary, I got RMSE "nan". Is that good?
- Is it possible to use other eval metrics, such as RMSLE (root mean squared log error)?
Appreciate your answers
thank you