statsforecast
Slow first training with cross-validation
Describe the bug Related to #84.
We implemented the statsforecast integration in pycaret using the sktime adapter. While implementing cross-validation, we noticed that the first model training is slow for all folds in the cross-validation (see model2 here). Does the numba compilation happen in each fold during the first model build (perhaps because all folds are run in parallel)? Fixing this would be helpful, since training a single time series model with cross-validation will be a common use case.
Subsequent models train fast, as expected (model 3 in the above code example).
To Reproduce See https://nbviewer.org/gist/ngupta23/4e2e90183c7f08555df3cfebe3df9756
Expected behavior Could cross-validation on the first trained model be made faster?
Desktop (please complete the following information): System: python: 3.7.13 (default, Apr 24 2022, 01:04:09) [GCC 7.5.0] executable: /usr/bin/python3 machine: Linux-5.4.188+-x86_64-with-Ubuntu-18.04-bionic
Additional context
In the above code, if I change exp to run using 1 core only, then statsforecast runs much faster (25 seconds for 5 folds compared to 46 seconds earlier). So it seems that the first training performs numba compilation for each fold (when CV is run in parallel) and hence takes longer.
#### default auto_arima engine is pmdarima for now ----
exp.setup(data=data, fh=12, session_id=42, fold=5, n_jobs=1)
@all-contributors please add @ngupta23 for bug
@mergenthaler @FedericoGarza Do you know if there is a solution for this problem? If we can find a solution, we can advertise the statsforecast integration with pycaret (since it is already implemented).
FYI... I tried building a dummy model before doing the cross-validation with multiple cores, thinking that the multiple cores could then reuse the numba code compiled by the dummy model, but it does not seem to help.
Any inputs here would be appreciated. Thanks!