evalml
Support monotonic constraints
We should allow a user to specify if they want to impose monotonic constraints on any of the variables that are feeding into their model.
Not all modeling approaches support this. XGBoost does:
- https://xgboost.readthedocs.io/en/latest/tutorials/monotonic.html
API needs
- How does a user specify this?
- How does automl know which pipelines support this option?
Great. I think LightGBM supports this too.
This could be a boolean option called monotonic, specified as an automl search option. We could add a supports_monotonic boolean tag to pipelines (PipelineBase, perhaps) which defaults to false, then override that to true in specific subclasses. Then, if monotonic was specified as true in the automl configuration, the get_pipelines method could be updated to only return xgboost and other models with support for monotonicity constraints.
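A rough sketch of the tag-and-filter idea, to make the proposal concrete. Everything here is hypothetical: the classes are stand-ins rather than the real evalml pipelines, and the get_pipelines signature is assumed for illustration only.

```python
# Hypothetical stand-ins; not the actual evalml classes.
class PipelineBase:
    name = "base"
    # Proposed tag: pipelines do not support monotonic constraints by default.
    supports_monotonic = False


class XGBoostPipeline(PipelineBase):
    name = "XGBoost"
    # Overridden in subclasses whose estimator supports the constraint.
    supports_monotonic = True


class LogisticRegressionPipeline(PipelineBase):
    name = "Logistic Regression"


ALL_PIPELINES = [XGBoostPipeline, LogisticRegressionPipeline]


def get_pipelines(monotonic=False, candidates=ALL_PIPELINES):
    """Return candidate pipelines, keeping only monotonic-capable ones if requested."""
    if not monotonic:
        return list(candidates)
    return [p for p in candidates if p.supports_monotonic]


if __name__ == "__main__":
    print([p.name for p in get_pipelines(monotonic=True)])
```

With this shape, adding support for another estimator only means setting the tag on its pipeline subclass; the search code does not need to know about individual models.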
Regarding the API, is monotonic applied per variable or globally across all variables?
Looks like CatBoost also has some support for this and also provides demo datasets. Both XGBoost and CatBoost use a similar API where the user passes in one value per feature, each of -1 (decreasing), 0 (no constraint), or 1 (increasing), to specify monotonic relationships. We could follow this approach, and if the user just passes in a single int, assume that it is applied globally to all features.
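One possible way to handle both input forms is to normalize them into a per-feature list before handing them to the estimator. This is only a sketch: normalize_monotone_constraints is a hypothetical helper, not an existing evalml, XGBoost, or CatBoost function.

```python
def normalize_monotone_constraints(constraints, n_features):
    """Expand a single int or a per-feature sequence into a list of -1/0/1.

    Hypothetical helper: a single int is applied to every feature,
    while a sequence must contain exactly one value per feature.
    """
    if isinstance(constraints, bool):
        # bool is a subclass of int in Python; reject it to avoid surprises.
        raise TypeError("constraints must be an int or a sequence of ints")
    if isinstance(constraints, int):
        values = [constraints] * n_features
    else:
        values = list(constraints)
    if len(values) != n_features:
        raise ValueError(
            f"expected {n_features} constraints, got {len(values)}"
        )
    for value in values:
        if value not in (-1, 0, 1):
            raise ValueError(f"invalid constraint {value!r}; use -1, 0, or 1")
    return values


if __name__ == "__main__":
    print(normalize_monotone_constraints(1, 3))
    print(normalize_monotone_constraints([1, 0, -1], 3))
```

Validating up front means a mismatched list fails with a clear message during search configuration rather than inside an individual estimator's fit call.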
This would be great to circle back to once the automl strategy project #272 is complete. In particular, we could define some pipelines as supporting monotonicity constraints, and if an automl search is configured with this constraint enabled, we only try those pipelines.