sklearn-pandas

Bug fix: Unexpected Dropping of columns

Open namanmistry opened this issue 1 year ago • 0 comments

The bug was reported in the "Unexpected Dropping of columns" issue: passing the drop_cols argument to DataFrameMapper had no effect, and the output columns were also wrong.

I have modified the _build(self, X=None) method of the DataFrameMapper class and added code that filters the columns based on self.drop_cols.

Previous build function:

    def _build(self, X=None):
        """
        Build attributes built_features and built_default.
        """
        if isinstance(self.features, list):
            self.built_features = [
                _build_feature(*f, X=X) for f in self.features
            ]
        else:
            self.built_features = _build_feature(*self.features, X=X)
        self.built_default = _build_transformer(self.default)

The modified function:


    def _build(self, X=None):
        """
        Build attributes built_features and built_default.
        """
        if isinstance(self.features, list):
            # Remove every column listed in self.drop_cols from the feature definitions.
            filtered_list = []
            for obj in self.features:
                if isinstance(obj[0], list):
                    # Multi-column selector: keep only the columns that are not dropped.
                    new_cols = [col for col in obj[0] if col not in self.drop_cols]
                    new_tuple = tuple([new_cols] + list(obj[1:]))
                    filtered_list.append(new_tuple)
                elif obj[0] not in self.drop_cols:
                    # Single-column selector: keep the feature only if its column is not dropped.
                    filtered_list.append(obj)
            self.features = filtered_list

            self.built_features = [_build_feature(*f, X=X) for f in self.features]
        else:
            self.built_features = _build_feature(*self.features, X=X)
        self.built_default = _build_transformer(self.default)
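
To make the filtering step easier to review on its own, here is a minimal standalone sketch of the same logic. The helper name filter_features and the sample column names are hypothetical and are not part of sklearn-pandas. The sketch assumes that features is a list of tuples whose first element is either a single column name or a list of column names, and it uses None as a placeholder transformer.

```python
def filter_features(features, drop_cols):
    """Return a new feature list without the columns named in drop_cols.

    Hypothetical helper for illustration only; it mirrors the loop added
    to _build but does not touch any DataFrameMapper state.
    """
    filtered = []
    for feature in features:
        columns = feature[0]
        if isinstance(columns, list):
            # Multi-column selector: keep only the columns that are not dropped.
            kept = [col for col in columns if col not in drop_cols]
            filtered.append((kept, *feature[1:]))
        elif columns not in drop_cols:
            # Single-column selector: keep the feature only if its column survives.
            filtered.append(feature)
    return filtered


# Sample feature definitions (column names are made up for this example).
features = [
    (["age", "height"], None),
    ("name", None),
    ("city", None, {"alias": "town"}),
]

result = filter_features(features, drop_cols=["height", "name"])
print(result)
```

One thing the sketch makes visible: a list selector whose columns are all dropped is kept with an empty column list, just as in the modified _build. Whether such an entry should be removed entirely is something I would appreciate feedback on.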

With this change, _build removes every column listed in self.drop_cols from the feature definitions before the features are built. I am new to open source contribution and this is my first pull request, so please feel free to give me any suggestions.

namanmistry • May 14 '23 07:05