evalml API to view the features provided as input to each component in a pipeline

Is this the best way to do it?

pipeline = clf.best_pipeline
pipeline.input_feature_names[pipeline.estimator.name]

We might want to think through improving this API, or at least document that this is how to do it.

Nov 20 '19 22:11 kmax12

@kmax12, could you explain what the goal is here? I don't think I have the right context yet to understand the motivation. Why does the estimator need the feature names?

And did you mean for input_feature_names to be a function here? It appears to be some sort of getter method, in which case it's just going to return a string; I thought you were looking for a setter.

Nov 22 '19 16:11 dsherry

@dsherry In our pipeline class (PipelineBase), we store the names of the input features to each step of the pipeline in a dictionary called self.input_feature_names when we call fit(), so pipeline.input_feature_names[pipeline.estimator.name] will actually return a pd.DataFrame object.

I think the idea of storing the feature names passed to the estimator is that it allows us to know how many features / which features were used in training the model.
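To make the bookkeeping concrete, here is a minimal sketch of the pattern, not the actual PipelineBase code. Every class name below (SketchPipeline, DropConstantColumns, MeanEstimator) is hypothetical, tables are plain dicts mapping column name to values, and storing a plain list of names per component is an assumption of the sketch.

```python
class DropConstantColumns:
    """Hypothetical feature-selection step: drops columns whose values never vary."""
    name = "Drop Constant Columns"

    def fit(self, X, y=None):
        # Remember which columns carry more than one distinct value.
        self.keep_ = [col for col, values in X.items() if len(set(values)) > 1]
        return self

    def transform(self, X):
        return {col: X[col] for col in self.keep_}


class MeanEstimator:
    """Hypothetical estimator: predicts the mean of the training target."""
    name = "Mean Estimator"

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self


class SketchPipeline:
    """Hypothetical pipeline that records each component's input feature names."""

    def __init__(self, transformers, estimator):
        self.transformers = transformers
        self.estimator = estimator
        self.input_feature_names = {}

    def fit(self, X, y):
        X_t = X
        for component in self.transformers:
            # Record the column names this component is about to receive.
            self.input_feature_names[component.name] = list(X_t)
            X_t = component.fit(X_t, y).transform(X_t)
        # The estimator sees whatever survived the transformers.
        self.input_feature_names[self.estimator.name] = list(X_t)
        self.estimator.fit(X_t, y)
        return self


X = {"age": [31, 45, 27], "country": ["US", "US", "US"], "income": [50, 72, 41]}
y = [0, 1, 0]
pipeline = SketchPipeline([DropConstantColumns()], MeanEstimator()).fit(X, y)
estimator_features = pipeline.input_feature_names[pipeline.estimator.name]
```

In this sketch the constant "country" column is dropped before the estimator is fit, so the names recorded for the estimator are a subset of the columns given to the pipeline.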

Nov 22 '19 16:11 angela97lin

@angela97lin thanks, that makes sense.

Notes from discussing this with Max just now:

This originally came from user feedback. The user wanted to see the list of features which were provided to the estimator's fit method. This could be a subset of the full list of features given to the pipeline if the pipeline has a feature selection component.

This ticket tracks the following decision: do we keep things as-is regarding input features, or do we update the API for accessing each component's input features to make it easier to use or clearer?

Dec 09 '19 17:12 dsherry

This issue has sat around for a while. Let's have it track designing APIs which do the following:

- Allow users to access the feature names passed as inputs to each component after a pipeline has been trained.
- Allow users to access the feature values passed as inputs during a pipeline evaluation. This could be done simply by supporting slicing component graphs and then evaluating the sliced fragment on some data and returning the output.
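A rough sketch of the slicing idea, assuming a linear chain of already-fitted components rather than a general graph. The helper inputs_to and the component classes are hypothetical names made up for illustration; none of them are evalml APIs.

```python
class Scale:
    """Hypothetical transformer: multiplies one column by a fixed factor."""
    name = "Scale"

    def __init__(self, column, factor):
        self.column = column
        self.factor = factor

    def transform(self, X):
        out = dict(X)
        out[self.column] = [value * self.factor for value in X[self.column]]
        return out


class SelectColumns:
    """Hypothetical transformer: keeps only the listed columns."""
    name = "Select Columns"

    def __init__(self, columns):
        self.columns = columns

    def transform(self, X):
        return {col: X[col] for col in self.columns}


class Estimator:
    """Placeholder for the final step; it is never evaluated in this sketch."""
    name = "Estimator"


def inputs_to(components, target_name, X):
    """Evaluate only the components upstream of target_name and return
    the feature values that would be passed to it."""
    X_t = X
    for component in components:
        if component.name == target_name:
            return X_t
        X_t = component.transform(X_t)
    raise KeyError(target_name)


components = [Scale("income", 2), SelectColumns(["income"]), Estimator()]
X = {"age": [31, 45], "income": [50, 72]}
estimator_inputs = inputs_to(components, "Estimator", X)
selector_inputs = inputs_to(components, "Select Columns", X)
```

Stopping at the named component is the "slice": everything upstream is evaluated, the component itself and everything downstream is not.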

May 08 '20 22:05 dsherry

Our component graph now supports two things which help here:

1. Compute features provided to estimator
2. View the output of each component in the graph

We don't actually have point 2 exposed in an API.

Let's let this issue now track exposing a way to access each component's output from a component graph evaluation.
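One possible shape for such an API, sketched under assumptions: the function evaluate_with_outputs and both steps are hypothetical, and the graph is simplified to an ordered list of (name, function) pairs, which is not how the component graph is actually represented.

```python
def fill_missing(X):
    """Hypothetical step: replace missing values (None) with 0."""
    return {col: [0 if v is None else v for v in values] for col, values in X.items()}


def add_total(X):
    """Hypothetical step: add a 'total' column holding the row-wise sum of a and b."""
    out = dict(X)
    out["total"] = [a + b for a, b in zip(X["a"], X["b"])]
    return out


def evaluate_with_outputs(steps, X):
    """Run each step in order and return every step's output, keyed by step name."""
    outputs = {}
    X_t = X
    for name, step in steps:
        X_t = step(X_t)
        outputs[name] = X_t
    return outputs


steps = [("Fill Missing", fill_missing), ("Add Total", add_total)]
X = {"a": [1, None], "b": [2, 3]}
outputs = evaluate_with_outputs(steps, X)
```

Returning a dict keyed by component name keeps the lookup style of pipeline.input_feature_names[...], so names and outputs could be accessed the same way.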

Jun 18 '21 19:06 dsherry
