
Support df_val argument in .fit() for Hyperparameter Tuning Purpose

Open jluo41 opened this issue 5 months ago • 0 comments

Description

Feature Request: Add df_val to .fit() for validation on disjoint unique_ids during hyperparameter tuning

In my use case, each unique_id represents a short time series (a 26-hour CGM sequence) where:

  • The first 24 hours (288 steps at 5-minute intervals) serve as input
  • The next 2 hours (24 steps) are the prediction target

Each unique_id corresponds to one input–output pair, meaning intra-series validation via val_size is not applicable.
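The window sizes follow directly from the 5-minute sampling interval. A minimal sketch of that arithmetic, where the constant names and the split_window helper are illustrative and not part of neuralforecast:

```python
# Illustrative constants derived from the description; names are hypothetical.
STEP_MINUTES = 5
INPUT_HOURS = 24
HORIZON_HOURS = 2

input_size = INPUT_HOURS * 60 // STEP_MINUTES   # 288 input steps
horizon = HORIZON_HOURS * 60 // STEP_MINUTES    # 24 target steps
series_length = input_size + horizon            # 312 steps per unique_id


def split_window(values):
    """Split one 26-hour series into its (input, target) parts."""
    if len(values) != series_length:
        raise ValueError(
            f"expected {series_length} steps, got {len(values)}"
        )
    return values[:input_size], values[input_size:]
```

Because every series is exactly input_size + horizon steps long, holding out steps inside a series leaves too little history for a full input window, which is why a within-series split does not fit this setup.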

Problem

When using models like AutoPatchTST, I want to tune hyperparameters via Optuna based on generalization to a separate set of unique_ids (e.g., unseen patients). Currently:

  • .fit() only supports val_size, which splits within each time series
  • I cannot pass a df_val with new unique_ids to guide early stopping or hyperparameter selection

Request

Enable:

nf.fit(df_train, df_val=df_valid)

Where df_train and df_val contain disjoint unique_ids.
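For illustration, one way to obtain disjoint ID sets is to split at the unique_id level before building the two frames. This is only a sketch: split_ids, val_fraction, and seed are hypothetical names, not part of neuralforecast.

```python
import random


def split_ids(unique_ids, val_fraction=0.2, seed=0):
    """Partition unique_ids into disjoint train and validation sets."""
    ids = sorted(set(unique_ids))        # sort first so the split is reproducible
    if len(ids) < 2:
        raise ValueError("need at least two unique_ids to split")
    rng = random.Random(seed)
    rng.shuffle(ids)
    # Keep at least one ID on each side of the split.
    n_val = min(len(ids) - 1, max(1, round(len(ids) * val_fraction)))
    val_ids = set(ids[:n_val])
    train_ids = set(ids[n_val:])
    return train_ids, val_ids
```

With a pandas long-format frame, the two sets could then be applied as df[df["unique_id"].isin(train_ids)] and df[df["unique_id"].isin(val_ids)] to produce df_train and df_valid.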

Use df_val for:

  • Early stopping

  • Evaluation for Optuna tuning

  • Model selection


Why This Matters

This is essential for:

  • Short series use cases (like CGM, wearable sensors)

  • Fair and generalizable models across individuals

  • Sample-level time series where each series is a single training example


Current Workaround

I currently fit only on df_train, then evaluate on df_valid manually after .predict(). But this breaks integration with automated tuning and early stopping logic.
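A rough sketch of what this manual workaround looks like when wired into Optuna by hand. Here fit_predict is a hypothetical callable standing in for the fit-on-df_train, predict-on-df_valid step; it is assumed to return a dict mapping each validation unique_id to its forecast values. The learning_rate search range is an arbitrary example.

```python
def mean_absolute_error(forecasts, targets):
    """MAE over all validation series; both args map unique_id -> list of floats."""
    total, count = 0.0, 0
    for uid, y_true in targets.items():
        y_pred = forecasts[uid]
        if len(y_pred) != len(y_true):
            raise ValueError(f"length mismatch for {uid}")
        total += sum(abs(p - t) for p, t in zip(y_pred, y_true))
        count += len(y_true)
    if count == 0:
        raise ValueError("no validation targets")
    return total / count


def make_objective(fit_predict, val_targets):
    """Build an Optuna-style objective that scores on held-out unique_ids."""
    def objective(trial):
        # Example search space only; the range is an assumption.
        params = {
            "learning_rate": trial.suggest_float(
                "learning_rate", 1e-4, 1e-2, log=True
            )
        }
        forecasts = fit_predict(params)  # hypothetical: train, then predict
        return mean_absolute_error(forecasts, val_targets)
    return objective
```

With Optuna installed, the objective could be passed to optuna.create_study(direction="minimize").optimize(objective, n_trials=20). Note that the validation score is only available after training finishes, so it cannot drive early stopping within a run, which is the gap this request describes.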

Would love to see native support for df_val in future releases. Happy to share reproducible examples or help design the API.

Use case

No response

jluo41, Jun 21 '25 05:06