GPU support for XGBModel in darts
Hello there,
I am currently a user of several darts tools. I am working on forecasts for a local fast-food chain with 200 restaurants, and I want to forecast 20 items in each restaurant. I use TimeSeries.from_group_dataframe() to build one series per item per restaurant, then I add some features, and then I use XGBModel to produce recursive forecasts. The model is working great, except that I could not perform the hyperparameter tuning automatically and instead had to use a for loop. This makes the optimization process slow (it is also slow because of the huge amount of data). I am exploring ways to speed up the optimization and found that using a GPU is faster. I don't have a computer with a GPU, but I plan to get one. Before that, I would like to check whether the XGBModel in darts can be used with the 'gpu_hist' parameter to run on the GPU. I appreciate your comments. Thanks in advance.
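For reference, a rough sketch of the setup described above (the file and column names are purely illustrative):

import pandas as pd
from darts import TimeSeries
from darts.models import XGBModel

# one row per (date, restaurant, item) with the quantity sold
df = pd.read_csv("sales.csv")

# one TimeSeries per (restaurant, item) group
series_list = TimeSeries.from_group_dataframe(
    df,
    group_cols=["restaurant_id", "item_id"],
    time_col="date",
    value_cols="units_sold",
)

model = XGBModel(lags=28, output_chunk_length=1)  # recursive forecasting
model.fit(series_list)
forecasts = model.predict(n=14, series=series_list)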
Hi @wgutierr,
The kwargs argument of the XGBModel constructor (and of fit()) is passed to the underlying XGBoost implementation. I haven't tested it personally, but yes, GPU training should be supported if you provide the proper device value when creating the model.
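For illustration, an untested sketch of how the GPU-related keyword arguments could be forwarded (the exact parameter names depend on the installed XGBoost version):

from darts.models import XGBModel

# XGBoost >= 2.0: select the GPU via the device parameter
model = XGBModel(lags=12, device="cuda", tree_method="hist")

# XGBoost < 2.0: the older gpu_hist tree method
# model = XGBModel(lags=12, tree_method="gpu_hist")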
Leaving this issue open; make sure to share your experience once you have access to a GPU :)
Thanks, will try it and let you know.
Confirmed this, though it didn't use the device parameter. Instead, instantiating the model as follows worked for me (single GPU only; I used an A100 80GB):
# model_kwargs, series_transformed, training_cutoff and forecast_horizon are defined elsewhere
model = XGBModel(tree_method='gpu_hist', **model_kwargs)
backtest = model.historical_forecasts(
    series_transformed,
    start=training_cutoff,
    forecast_horizon=forecast_horizon,
)
Confirmed GPU usage in nvidia-smi with driver 525, CUDA 12.3, xgboost 1.7.6, and Python 3.10 in the nvidia/pytorch:23.12-py3 NGC container.
If the XGBModel is instantiated like so:
model = XGBModel(device='cuda', tree_method='gpu_hist', **model_kwargs)
Then I get the following warning a lot:
WARNING: /opt/rapids/src/xgboost/src/learner.cc:767:
Parameters: { "device" } are not used.
@beckernick for visibility; I haven't checked for out-of-the-box support for cuML yet.
@beckernick and Darts team
Confirmed cuML models work when wrapped in a RegressionModel. Did not do comprehensive testing, but saw GPU usage in the below example:
from darts.models import RegressionModel
from cuml import LinearRegression as lr  # aliased so as not to pollute the namespace

model_wrapper = RegressionModel(
    lags=12,
    lags_future_covariates=(12, 1),
    model=lr(),
    output_chunk_length=3,
)
backtest_of_wrapped_model = model_wrapper.historical_forecasts(
    series_transformed,
    start=training_cutoff,
    forecast_horizon=forecast_horizon,
    future_covariates=time_covariates_transformed,
)
It looks like it also handled the H2D and D2H transfers, though not perfectly (see below): I can plot backtest_of_wrapped_model directly as usual, and type(backtest_of_wrapped_model.values()) is numpy.ndarray!
Some of these models had fairly high GPU utilization in my use case, but I felt they could have trained faster; I didn't profile. I did get plenty of warnings for LinearRegression:
Changing solver to 'svd' as this is the only solver that support multiple targets currently.
cuML LinearRegression, Ridge, SVR, and KNeighborsRegressor worked.
For KNeighborsRegressor I was getting "Unused keyword parameter: n_jobs during cuML estimator initialization", which seems to come from Darts.
cuML ExponentialSmoothing failed due to an API difference: it doesn't have a predict method. As an example, I had the following (series_transformed is multivariate, so I select just the target series):
from cuml import ExponentialSmoothing as expsmooth

model_wrapper = RegressionModel(
    lags=12,
    lags_future_covariates=(12, 1),
    model=expsmooth(series_transformed[target], seasonal='add'),
    output_chunk_length=3,
)
backtest_of_wrapped_model = model_wrapper.historical_forecasts(
    series_transformed[target],
    start=training_cutoff,
    forecast_horizon=forecast_horizon,
    future_covariates=time_covariates_transformed,
)
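My understanding (inferred from the error above, not verified in depth) is that RegressionModel expects a sklearn-like regressor with fit() and predict() methods, while cuML's ExponentialSmoothing is constructed around a series and exposes forecast() instead. Roughly:

# the interface RegressionModel appears to expect from its wrapped model
class SklearnLikeRegressor:
    def fit(self, X, y):
        # X: (n_samples, n_features), y: (n_samples, n_targets)
        return self

    def predict(self, X):
        # must return an array-like of predictions, one row per sample
        ...

# cuML's ExponentialSmoothing takes the series at construction time and offers
# forecast() instead of predict(), so the wrapper has nothing to call.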
cuML ARIMA also had an error:
# input
from cuml import ARIMA as arima

model_wrapper = RegressionModel(
    lags=12,
    lags_future_covariates=(12, 1),
    model=arima(series_transformed[target].values()),
    output_chunk_length=3,
)
File /usr/local/lib/python3.10/dist-packages/joblib/parallel.py:1863, in Parallel.__call__(self, iterable)
1861 output = self._get_sequential_output(iterable)
1862 next(output)
-> 1863 return output if self.return_generator else list(output)
1865 # Let's create an ID that uniquely identifies the current call. If the
1866 # call is interrupted early and that the same instance is immediately
1867 # re-used, this id will be used to prevent workers that were
1868 # concurrently finalizing a task from the previous call to run the
1869 # callback.
1870 with self._lock:
File /usr/local/lib/python3.10/dist-packages/joblib/parallel.py:1792, in Parallel._get_sequential_output(self, iterable)
1790 self.n_dispatched_batches += 1
1791 self.n_dispatched_tasks += 1
-> 1792 res = func(*args, **kwargs)
1793 self.n_completed_tasks += 1
1794 self.print_progress()
File /usr/local/lib/python3.10/dist-packages/sklearn/utils/fixes.py:117, in _FuncWrapper.__call__(self, *args, **kwargs)
115 def __call__(self, *args, **kwargs):
116 with config_context(**self.config):
--> 117 return self.function(*args, **kwargs)
File /usr/local/lib/python3.10/dist-packages/sklearn/multioutput.py:46, in _fit_estimator(estimator, X, y, sample_weight, **fit_params)
45 def _fit_estimator(estimator, X, y, sample_weight=None, **fit_params):
---> 46 estimator = clone(estimator)
47 if sample_weight is not None:
48 estimator.fit(X, y, sample_weight=sample_weight, **fit_params)
File /usr/local/lib/python3.10/dist-packages/sklearn/base.py:87, in clone(estimator, safe)
79 raise TypeError(
80 "Cannot clone object '%s' (type %s): "
81 "it does not seem to be a scikit-learn "
82 "estimator as it does not implement a "
83 "'get_params' method." % (repr(estimator), type(estimator))
84 )
86 klass = estimator.__class__
---> 87 new_object_params = estimator.get_params(deep=False)
88 for name, param in new_object_params.items():
89 new_object_params[name] = clone(param, safe=False)
File /usr/local/lib/python3.10/dist-packages/cuml/internals/api_decorators.py:190, in _make_decorator_function.<locals>.decorator_function.<locals>.decorator_closure.<locals>.wrapper(*args, **kwargs)
188 ret = func(*args, **kwargs)
189 else:
--> 190 return func(*args, **kwargs)
192 return cm.process_return(ret)
File arima.pyx:591, in cuml.tsa.arima.ARIMA.get_params()
NotImplementedError: ARIMA is unable to be cloned via `get_params` and `set_params`.
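The NotImplementedError appears to come from cuML itself rather than from Darts: the multi-output handling goes through sklearn, and sklearn.base.clone() calls get_params(), which cuml.tsa.arima.ARIMA does not implement. A minimal reproduction sketch (untested; the constructor arguments are only illustrative):

import numpy as np
from sklearn.base import clone
from cuml.tsa.arima import ARIMA

y = np.random.default_rng(0).random((100, 1), dtype=np.float32)
arima_model = ARIMA(y)
clone(arima_model)  # raises the NotImplementedError shown above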
KernelRidge failed with the following error:
File /usr/local/lib/python3.10/dist-packages/darts/models/forecasting/forecasting_model.py:388, in ForecastingModel._predict_wrapper(self, n, series, past_covariates, future_covariates, predict_likelihood_parameters, **kwargs)
386 if self.supports_likelihood_parameter_prediction:
387 add_kwargs["predict_likelihood_parameters"] = predict_likelihood_parameters
--> 388 return self.predict(n=n, **add_kwargs, **kwargs)
File /usr/local/lib/python3.10/dist-packages/darts/models/forecasting/regression_model.py:1005, in RegressionModel.predict(self, n, series, past_covariates, future_covariates, num_samples, verbose, predict_likelihood_parameters, show_warnings, **kwargs)
1002 X = np.concatenate(X_blocks, axis=0)
1004 # X has shape (n_series * n_samples, n_regression_features)
-> 1005 prediction = self._predict_and_sample(
1006 X, num_samples, predict_likelihood_parameters, **kwargs
1007 )
1008 # prediction shape (n_series * n_samples, output_chunk_length, n_components)
1009 # append prediction to final predictions
1010 predictions.append(prediction[:, last_step_shift:])
File /usr/local/lib/python3.10/dist-packages/darts/models/forecasting/regression_model.py:1044, in RegressionModel._predict_and_sample(self, x, num_samples, predict_likelihood_parameters, **kwargs)
1036 def _predict_and_sample(
1037 self,
1038 x: np.ndarray,
(...)
1041 **kwargs,
1042 ) -> np.ndarray:
1043 """By default, the regression model returns a single sample."""
-> 1044 prediction = self.model.predict(x, **kwargs)
1045 k = x.shape[0]
1046 return prediction.reshape(k, self.pred_dim, -1)
File /usr/local/lib/python3.10/dist-packages/sklearn/multioutput.py:253, in _MultiOutputEstimator.predict(self, X)
247 raise ValueError("The base estimator should implement a predict method")
249 y = Parallel(n_jobs=self.n_jobs)(
250 delayed(e.predict)(X) for e in self.estimators_
251 )
--> 253 return np.asarray(y).T
File cupy/_core/core.pyx:1475, in cupy._core.core._ndarray_base.__array__()
TypeError: Implicit conversion to a NumPy array is not allowed. Please use `.get()` to construct a NumPy array explicitly.
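A possible workaround I have not verified: asking cuML to hand back NumPy arrays globally via cuml.set_global_output_type, so that sklearn's multi-output wrapper never receives a cupy array in the first place. Sketch:

import cuml
from cuml.kernel_ridge import KernelRidge
from darts.models import RegressionModel

# ask cuML estimators to return host (NumPy) arrays instead of cupy arrays
cuml.set_global_output_type("numpy")

model_wrapper = RegressionModel(
    lags=12,
    model=KernelRidge(),
    output_chunk_length=3,
)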
Haven't checked the rest, but figured this was a good start.
Thanks @esnvidia for sharing your experience. It seems like a lot of the errors come from sklearn itself, but it's great to see that many cuML models can be used as-is 🚀.
Since the original question of this issue was about XGBoost and a solution was described, I am closing it, but feel free to open a new one to discuss the compatibility and support of cuML in Darts.