
TSDataset's split() method cannot be used together with NHiTSModel

Open akari0216 opened this issue 1 year ago • 3 comments

I tried to use the NHiTSModel on AI Studio with the following code:

```python
from paddlets.datasets.repository import get_dataset
from paddlets.models.forecasting import NHiTSModel

dataset = get_dataset("UNI_WTH")
train_dataset, val_test_dataset = dataset.split(0.7)
val_dataset, test_dataset = val_test_dataset.split(0.5)

model = NHiTSModel(
    in_chunk_len=7 * 24,
    out_chunk_len=24,
    max_epochs=100,
)

model.fit(train_dataset, val_dataset)
```

However, it raises the following error:

```
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_4864/1383395577.py in <module>
----> 1 model.fit(train_dataset, val_dataset)

~/external-libraries/paddlets/models/forecasting/dl/paddle_base_impl.py in fit(self, train_tsdataset, valid_tsdataset)
    344             self._check_multi_tsdataset(valid_tsdataset)
    345         train_dataloader, valid_dataloaders = self._init_fit_dataloaders(train_tsdataset, valid_tsdataset)
--> 346         self._fit(train_dataloader, valid_dataloaders)
    347
    348     def _fit(

~/external-libraries/paddlets/models/forecasting/dl/paddle_base_impl.py in _fit(self, train_dataloader, valid_dataloaders)
    374             # Predict for each eval set.
    375             for eval_name, valid_dataloader in zip(valid_names, valid_dataloaders):
--> 376                 self._predict_epoch(eval_name, valid_dataloader)
    377
    378             # Call the on_epoch_end method of each callback at the end of the epoch.

~/external-libraries/paddlets/models/forecasting/dl/paddle_base_impl.py in _predict_epoch(self, name, loader)
    487             list_y_score.append(scores)
    488         y_true, scores = np.vstack(list_y_true), np.vstack(list_y_score)
--> 489         metrics_logs = self._metric_container_dict[name](y_true, scores)
    490         self._history._epoch_metrics.update(metrics_logs)
    491         self._network.train()

~/external-libraries/paddlets/metrics/metrics.py in __call__(self, y_true, y_score)
    393         logs = {}
    394         for metric in self._metrics:
--> 395             res = metric.metric_fn(y_true, y_score)
    396             logs[self._prefix + metric._NAME] = res
    397         return logs

~/external-libraries/paddlets/metrics/utils.py in wrapper(obj, y_true, y_score, **kwargs)
     40         y_true = np.reshape(y_true, (batch_nd_true, -1))
     41         y_score = np.reshape(y_score, (batch_nd_score, -1))
---> 42         return func(obj, y_true, y_score, **kwargs)
     43     return wrapper
     44

~/external-libraries/paddlets/metrics/metrics.py in metric_fn(self, y_true, y_score)
     93         """
     94
---> 95         return metrics.mean_absolute_error(y_true, y_score)
     96
     97

/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/sklearn/utils/validation.py in inner_f(*args, **kwargs)
     61             extra_args = len(args) - len(all_args)
     62             if extra_args <= 0:
---> 63                 return f(*args, **kwargs)
     64
     65             # extra_args > 0

/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/sklearn/metrics/_regression.py in mean_absolute_error(y_true, y_pred, sample_weight, multioutput)
    181     """
    182     y_type, y_true, y_pred, multioutput = _check_reg_targets(
--> 183         y_true, y_pred, multioutput)
    184     check_consistent_length(y_true, y_pred, sample_weight)
    185     output_errors = np.average(np.abs(y_pred - y_true),

/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/sklearn/metrics/_regression.py in _check_reg_targets(y_true, y_pred, multioutput, dtype)
     88     check_consistent_length(y_true, y_pred)
     89     y_true = check_array(y_true, ensure_2d=False, dtype=dtype)
---> 90     y_pred = check_array(y_pred, ensure_2d=False, dtype=dtype)
     91
     92     if y_true.ndim == 1:

/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/sklearn/utils/validation.py in inner_f(*args, **kwargs)
     61             extra_args = len(args) - len(all_args)
     62             if extra_args <= 0:
---> 63                 return f(*args, **kwargs)
     64
     65             # extra_args > 0

/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
    719         if force_all_finite:
    720             _assert_all_finite(array,
--> 721                                allow_nan=force_all_finite == 'allow-nan')
    722
    723     if ensure_min_samples > 0:

/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/sklearn/utils/validation.py in _assert_all_finite(X, allow_nan, msg_dtype)
    104                     msg_err.format
    105                     (type_err,
--> 106                      msg_dtype if msg_dtype is not None else X.dtype)
    107             )
    108         # for object dtype data, we only check for NaNs (GH-13254)

ValueError: Input contains NaN, infinity or a value too large for dtype('float32').
```

I looked into this: TSDataset's split() method masks the non-target portion with NaN, yet data split this way runs fine with other models (e.g. TCN, MLP). Could this issue be fixed? Thanks!

Environment:
- paddlepaddle-gpu 2.4.0.post112
- paddlets 1.1.0
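For context, the check that ultimately fires is sklearn's input validation, which rejects any NaN or Inf in the metric inputs before MAE is computed. A minimal NumPy sketch of that behavior (this mimics sklearn's internal `_assert_all_finite` from the traceback; it is not sklearn's actual implementation):

```python
import numpy as np

def assert_all_finite(x):
    # Mimics sklearn's internal _assert_all_finite check from the traceback:
    # a single NaN or Inf in the metric inputs raises the same ValueError.
    if not np.isfinite(x).all():
        raise ValueError(
            "Input contains NaN, infinity or a value too large for "
            "dtype('%s')." % x.dtype
        )

# A prediction array with one NaN, as produced when the network outputs NaN.
y_pred = np.array([1.1, np.nan, 2.9], dtype=np.float32)
try:
    assert_all_finite(y_pred)
except ValueError as exc:
    print("caught:", exc)
```

So any NaN reaching the validation metrics, whether from masked input columns or from NaN model outputs, produces exactly this error.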

akari0216 avatar Jun 14 '23 04:06 akari0216

Same problem here. Has it been resolved?

suntao2015005848 avatar Oct 26 '23 08:10 suntao2015005848

Hello, we will try to reproduce your issue.

Sunting78 avatar Mar 21 '24 07:03 Sunting78

Hello, this issue is unrelated to split(), because UNI_WTH is a dataset with only a target column. The cause is that NHiTSModel's parameters are all NaN at eval time. Please downgrade paddle to paddlepaddle-gpu>=2.3.0, <2.4.0.
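One quick way to confirm this diagnosis (NaN weights at eval time) is to scan the network's parameter tensors for non-finite values. A pure-NumPy sketch; `named_params` stands in for iterating the trained network's parameters (e.g. something like `model._network.named_parameters()` in paddle, which is an assumption here, not a documented paddlets API):

```python
import numpy as np

def nonfinite_params(named_params):
    # Return the names of parameter tensors containing NaN or Inf.
    return [name for name, p in named_params if not np.isfinite(p).all()]

# Simulated tensors standing in for the model's weights (illustrative only).
params = [
    ("fc.weight", np.array([[0.1, np.nan]])),
    ("fc.bias", np.array([0.0])),
]
print(nonfinite_params(params))
```

If this reports any parameter names, the NaN predictions (and hence the sklearn ValueError) come from the model itself rather than from the split datasets.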

Sunting78 avatar Mar 28 '24 03:03 Sunting78