
⚠️ Exception 'max_epochs is deprecated, use max_steps instead.'

Open kdgutier opened this issue 2 years ago • 0 comments

An epoch corresponds to one complete pass of the model through the training data, while a step refers to a single Gradient Descent update performed on one data batch. Including both max_steps and max_epochs parameters is redundant, since one can be derived from the other.

The following relationships convert max_epochs to max_steps:

# Window-based models (MLP, NBEATS, NHITS, TFT, Transformers, ...)
max_steps = (len(Y_train_df) // windows_batch_size) * max_epochs

# Recurrent models (RNN, GRU, LSTM, TCN, ...)
max_steps = (Y_train_df.unique_id.nunique() // batch_size) * max_epochs
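The relationships above can be sketched on a toy panel. This is a minimal, self-contained example (the 200-row two-series DataFrame and the batch sizes are illustrative choices, not values from neuralforecast itself) that just evaluates the two conversion formulas:

```python
import pandas as pd

# Toy long-format panel: two series ("unique_id") with 100 observations each,
# mirroring the Y_train_df layout used by neuralforecast.
Y_train_df = pd.DataFrame({
    "unique_id": ["A"] * 100 + ["B"] * 100,
    "y": range(200),
})

max_epochs = 10

# Window-based models: one step consumes windows_batch_size training windows.
windows_batch_size = 32
steps_window = (len(Y_train_df) // windows_batch_size) * max_epochs
print(steps_window)  # (200 // 32) * 10 = 60

# Recurrent models: one step consumes batch_size whole series.
batch_size = 2
steps_recurrent = (Y_train_df["unique_id"].nunique() // batch_size) * max_epochs
print(steps_recurrent)  # (2 // 2) * 10 = 10
```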

We are deprecating max_epochs in favor of the max_steps parameter due to the following advantages:

  1. Dataset-size independence: The max_epochs parameter is a relative measure that depends on the dataset's size, whereas max_steps is an absolute measure that remains independent of it.
    • Steps provide fine control over model evaluation and train termination.
    • Steps directly measure training progress by counting iterations, while epochs can vary based on the dataset's size, leading to inconsistent tracking.
  2. Batch Size Flexibility: Defining the training procedure with max_steps allows changing Gradient Descent's batch_size without dramatically affecting the algorithm's convergence. In contrast, using max_epochs and changing batch_size leads to an unintended change in the number of training iterations.
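The second point is simple arithmetic, and a small sketch makes it concrete (the dataset size and batch sizes below are hypothetical, chosen only for illustration):

```python
n_samples = 1_000  # hypothetical number of training samples

# Budget specified in epochs: the total number of gradient updates scales
# inversely with batch_size, so changing batch_size silently changes how
# long the model trains.
max_epochs = 10
updates_per_batch = {
    batch_size: (n_samples // batch_size) * max_epochs
    for batch_size in (32, 64)
}
print(updates_per_batch)  # {32: 310, 64: 150}

# Budget specified in steps: the total number of gradient updates is fixed
# regardless of batch_size; only the amount of data seen per update changes.
max_steps = 300
print({batch_size: max_steps for batch_size in (32, 64)})  # {32: 300, 64: 300}
```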

kdgutier — May 28 '23 12:05