enhancement_proposals icon indicating copy to clipboard operation
enhancement_proposals copied to clipboard

SLEP016: parameter spaces on estimators

Open jnothman opened this issue 4 years ago • 9 comments

jnothman avatar Nov 30 '21 11:11 jnothman

Ready for review!

jnothman avatar Mar 18 '22 00:03 jnothman

Thanks for the prompt review @adrinjalali. At this point I'm not sure what to update based on your comments, but I am curious to understand better what you find attractive about the Searchgrid API relative to this.

jnothman avatar Mar 19 '22 10:03 jnothman

Note to self: an alternative here might be a way to alias deep parameters at the metaestimator level...

jnothman avatar Mar 23 '22 05:03 jnothman

Thanks for the review @amueller. I hope I find clarity of mind and time soon to make edits.

Over all, do you feel like we should have separate set_grid and set_distrib, or combined set_param_space whose rvs will be rejected by a grid search?

One thought that comes to mind is how to make this something that usefully integrates with external hyperparameter optimisation tools. I mean it may be their responsibility to do so, but I wonder how we assist in their support.

jnothman avatar Mar 27 '22 13:03 jnothman

I would like to propose an alternative approach. Core idea is to use mapping between an object and space, and then iterate through given pipeline and build the parameter_grid.

    pipe = make_pipeline(DecisionTree())
    object_param_grid = {DecisionTree: {"max_depth": [1, 2]}
    create_param_grid(pipe, object_param_grid) == {'decisiontree__max_depth': [1, 2]}

This is useful when the pipeline is not complex and there are no instances that use different params.

It is possible to change this a bit and use instance_to_space mapping.

    tree = DecisionTree()
    pipe = make_pipeline()
    object_param_grid = {tree: {"max_depth": [1, 2]}
    create_param_grid(pipe, object_param_grid) == {'decisiontree__max_depth': [1, 2]}

This would add more granularity, but would requite instantiating transformers/estimators before generating the grid.

Here is a source that achieves object_to_paramgrid mapping.

def _get_steps(object_, ):
    """Retrieves steps/transformers from Pipeline, FeatureUnion or ColumnTransformer objects"""
    object_to_step_name = {
        Pipeline: "steps", 
        FeatureUnion: "transformer_list",
        ColumnTransformer: "transformers",
    }

    if step_name := object_to_step_name.get(object_.__class__):
        steps = [steps for steps in getattr(object_, step_name)]
        return steps
    
def resolve(parent_object, object_param_grid, path="", param_grid=None):
    if param_grid is None:
        param_grid = {}

    steps = _get_steps(parent_object)
    if not steps:
        return param_grid

    for child_object_path, child_object, *_ in steps:
        full_path = '__'.join([path, child_object_path]).strip("__")
        child_object_class_name = child_object.__class__

        if object_params := object_param_grid.get(child_object_class_name):
            flattened_param_grid = {f"{full_path}__{k}": v for k, v in object_params.items()}
            param_grid = flattened_param_grid | param_grid

        param_grid = resolve(child_object, object_param_grid, path=path+child_object_path, param_grid=param_grid)
    return param_grid

aidiss avatar Jul 21 '23 13:07 aidiss

I like several aspects of that proposal, @aidiss, though I'm not sure I'd worry about supporting classes in the first version. Specifically:

  • The entire parameter grid can be defined explicitly in one place in user code, or can be assembled from multiple places quite easily
  • The effect of this change could be localised to *SearchCV implementations, with such a dict being a drop-in replacement for the current param_grid or param_distributions. Therefore the change would not require a SLEP.
  • It may get rid of the need for a specialised design to handle RVs vs grids.

One thing I'm not sure about is how to manage errors. If, for instance, the user were to clone the estimator, we'd suddenly have no grid for it if it were defined in terms of instances, and that would not be obvious to the user. On the other hand, I think this is a flaw in all or several the proposed designs.

I'll otherwise need to think about whether there are aspects of searchgrid capability that could not be reproduced with this approach.

jnothman avatar Dec 27 '23 03:12 jnothman

I've recalled @aidiss that your suggestion is pretty much the same thing that I proposed in https://github.com/scikit-learn/scikit-learn/pull/21784, although I admit that the .set interface is not particularly pleasant.

jnothman avatar Dec 29 '23 00:12 jnothman

I think I prefer the Searchgrid design, but I'm taking your word on people preferring this design to that one.

I wonder if you meant the GridFactory design, @adrinjalali? Searchgrid secretly sets attributes on estimators, so it's not a great functional/OOP design.

jnothman avatar Dec 29 '23 01:12 jnothman

I'd be keen to merge this and then have the team decide if it's the right proposal, or if we should go down one of the SLEP-free paths (e.g. GridFactory, or a lighter API variant of it like above).

We have a lot of the implementation already, and mostly need to come to a decision.

@adrinjalali interested in reviewing for merge?

jnothman avatar Dec 29 '23 01:12 jnothman