scikit-lego
Extra blocks for scikit-learn pipelines.
@MBrouns and @koaning understand it. ``` grid = GridSearch(...) policy = RefitPolicy(filter_steps=, score_steps=, combiner=) best_est = policy.pick_best_est(grid) ``` Types of functions to pass in: ``` veto_policy = hard_filter score_added =...
With anomaly detection, if you have labeled outliers there are two types of models. The first one requires the outliers, although regularly unlabeled. Isolation forests fall in this category. One...
``` from sklearn.model_selection import cross_validate, StratifiedKFold from sklearn.metrics import precision_score, recall_score, make_scorer cross_validate( pipeline, X, y, scoring = { 'eq_op_colour': equal_opportunity_score('colour', positive_target='Yes'), 'eq_op_age': equal_opportunity_score('age', positive_target='Yes'), 'eq_op_sex': equal_opportunity_score('sex', positive_target='Yes'), 'precision': make_scorer(precision_score,...
This PR solves issue #543 and implements `get_feature_names_out` for all relevant transformers in `sklego.preprocessing` (i.e. transformers that do not contain the `TrainOnlyTransformerMixin`). Functionality is implemented through adding the `_ClassNamePrefixFeaturesOutMixin` to...
# Description **Fixes #537** KFold cross-validation on a time series data set is a bit more complex than in other use cases. Folds have to retain chronological...
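To illustrate what chronological folds look like, here is a minimal sketch using scikit-learn's built-in `TimeSeriesSplit`. This is a stand-in, not the splitter this PR adds; the toy data and fold count are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Toy data (assumption): 8 observations already sorted by time.
X = np.arange(8).reshape(-1, 1)

# Each fold trains on an expanding window of past rows
# and tests on the block of rows that follows it.
splitter = TimeSeriesSplit(n_splits=3)
folds = list(splitter.split(X))

for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

In every fold the training indices precede the test indices, so the model is never evaluated on data older than what it was trained on.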
This PR adds `get_feature_names_out` functionality for `EstimatorTransformer`. This is a desirable method for `EstimatorTransformer` because its output is used as input for a subsequent estimator. - [x] Implement `get_feature_names_out` for...
Over the years some deprecated code has accumulated. With the 0.7.0 release underway, it should finally be removed. Here are some of the examples that I found - [ ]...
`get_feature_names_out` is an important component for interpreting scikit-learn `Pipeline` objects. A `get_feature_names_out` call on a `Pipeline` only works if it is implemented for all components in the pipeline, except the...
**Please explain clearly what you'd like to see added.** - KFold cross-validation on a time series data set is a bit more complex than in other use...
When calling `get_feature_names_out` on `EstimatorTransformer` or a `Pipeline` that contains `EstimatorTransformer` you will get the following error: `AttributeError: 'EstimatorTransformer' object has no attribute 'get_feature_names_out'` Minimal reproducible example: ```python from sklego.meta...