[BUG]: fit_kwargs parameters break pycaret.time_series.compare_models()
pycaret version checks
- [X] I have checked that this issue has not already been reported here.
- [X] I have confirmed this bug exists on the latest version of pycaret.
- [X] I have confirmed this bug exists on the master branch of pycaret (pip install -U git+https://github.com/pycaret/pycaret.git@master).
Issue Description
When passing the window_length and degree parameters via fit_kwargs to compare_models(), the function iterates through its list of models but produces no leaderboard grid output. Printing the return value of the call gives None. This leads me to believe that passing these window_length and degree parameters breaks the function's behavior, even though the code executes without raising any errors.
In contrast, when passing the same window_length and degree parameters via fit_kwargs to create_model(), the behavior works exactly as expected: I can print the CV results from create_model() with the declared fit_kwargs values and view the fitted model as well (see the sketch below).
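For reference, a minimal sketch of the working create_model() call; "huber_cds_dt" is only an illustrative model ID, and exp_auto is the experiment configured in the Reproducible Example below:

```python
# Working-case sketch: create_model() honors the same fit_kwargs.
# 'huber_cds_dt' is an illustrative model ID; exp_auto is the experiment
# configured in the Reproducible Example below.
model = exp_auto.create_model(
    "huber_cds_dt",
    fit_kwargs={"degree": 0, "window_length": 10},
)
print(model)  # the fitted forecaster prints and the CV grid is displayed as expected
```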
Reproducible Example
from pycaret.time_series import *
global_fig_settings = {"renderer": "notebook", "width": 1000, "height": 600}
exp_auto = TSForecastingExperiment()
exp_auto.setup(
    data=df_pycaret, target='units', fh=45, enforce_exogenous=False,
    fold=10,
    fig_kwargs=global_fig_settings,
    session_id=42
)
best = exp_auto.compare_models(turbo=True, fit_kwargs={'degree' : 0, 'window_length' : 10})
Expected Behavior
When fit_kwargs is not explicitly passed with window_length and degree, the function works as expected, generating the leaderboard grid output. Printing the result of the call then shows the best fitted model, for example: BaseCdsDtForecaster(regressor=HuberRegressor(), sp=7, window_length=7)
Ideally, compare_models() would accept all fit_kwargs declarations appropriately, returning a best model and generating the corresponding leaderboard grid output.
Actual Results
Both the leaderboard grid output and the returned **best** model print as None.
Installed Versions
PyCaret required dependencies:
pip: 22.1.2
setuptools: 61.2.0
pycaret: 3.0.0.rc3
IPython: 8.2.0
ipywidgets: 7.6.5
tqdm: 4.64.0
numpy: 1.21.5
pandas: 1.4.3
jinja2: 3.0.3
scipy: 1.7.3
joblib: 1.1.0
sklearn: 1.1.2
pyod: Installed but version unavailable
imblearn: 0.9.1
category_encoders: 2.5.0
lightgbm: 3.3.2
numba: 0.55.2
requests: 2.27.1
matplotlib: 3.5.2
scikitplot: 0.3.7
yellowbrick: 1.4
plotly: 5.9.0
kaleido: 0.2.1
statsmodels: 0.13.2
sktime: 0.11.4
tbats: Installed but version unavailable
pmdarima: 1.8.5
psutil: 5.9.1
This is what I see in the internal base module documentation
fit_kwargs: dict, default = {} (empty dict)
Dictionary of arguments passed to the fit method of the model. The parameters will be applied to all models,
therefore it is recommended to set errors parameter to 'ignore'.
https://github.com/pycaret/pycaret/blob/f93e7087a671458a20bed6dd3a8bcca891034cfc/pycaret/internal/pycaret_experiment/supervised_experiment.py#L469
Can you try setting the errors parameter to 'ignore'?
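For concreteness, a minimal sketch of that suggestion, reusing exp_auto from the reproducible example above:

```python
# Sketch of the suggestion: pass errors='ignore' so models whose fit rejects
# these kwargs are skipped, which may help reveal why the grid comes back empty.
best = exp_auto.compare_models(
    turbo=True,
    errors="ignore",
    fit_kwargs={"degree": 0, "window_length": 10},
)
print(best)
```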
@Yard1 Any idea what is going on here? Is this the right format to pass the fit_kwargs in?
Per the August 29th meeting, the current functionality applies the same fit_kwargs to all models. We will need to reevaluate this after the 3.0.0rc4 release. It may be better to have model-wise kwargs (a dictionary of dictionaries). Open question from @Yard1: how does this affect custom models?
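To make the proposal concrete, the model-wise shape being discussed might look like this (purely hypothetical; compare_models does not accept this today, and the model IDs are illustrative):

```python
# Hypothetical model-wise fit_kwargs shape - NOT an existing pycaret API.
# Keys would be model IDs, values the fit_kwargs for that specific model;
# models without an entry would receive no extra fit arguments.
fit_kwargs = {
    "huber_cds_dt": {"degree": 0, "window_length": 10},
    "arima": {},
}
```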
@Yard1 @tvdboom Are we making any changes to this for the RC5 release?
Yes, we should - I'll see if I can get this done, been quite busy :(
5th Nov 2022
This is an enhancement to existing code (not a bug). The current workaround is to loop over the desired models, calling create_model for each and passing custom fit_kwargs based on the model in that loop iteration (see the sketch below).
We will consider adding an enhancement to the code base later to handle this more seamlessly in compare_models.
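A rough sketch of that workaround, assuming the exp_auto experiment from the reproducible example; the model IDs and per-model fit_kwargs below are placeholders to adapt:

```python
# Workaround sketch: iterate over the models yourself and call create_model()
# with fit_kwargs tailored to each one. Model IDs and kwargs are placeholders.
per_model_fit_kwargs = {
    "huber_cds_dt": {"degree": 0, "window_length": 10},
    "lr_cds_dt": {"window_length": 7},
}

trained, cv_results = {}, {}
for model_id, kwargs in per_model_fit_kwargs.items():
    trained[model_id] = exp_auto.create_model(model_id, fit_kwargs=kwargs)
    cv_results[model_id] = exp_auto.pull()  # CV metrics grid for this model
```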