Remove CLI spams with Whisper FeatureExtractor

Open qmeeus opened this issue 2 years ago • 1 comments

What does this PR do?

The Whisper feature extractor's representation includes the mel filters, a list of lists that prints as roughly 16,000 lines and needlessly spams the command line. I added a __repr__ method that replaces this list with the string <array of shape (80, 201)>.
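For readers who want to see the idea in isolation, here is a minimal sketch of that kind of `__repr__`. It is not the code from this PR: `ToyFeatureExtractor`, its attributes, and the zero-filled filter bank are hypothetical stand-ins, and plain nested lists are used instead of arrays so the snippet needs only the standard library.

```python
import json


class ToyFeatureExtractor:
    """Hypothetical stand-in for a feature extractor that stores a large filter bank."""

    def __init__(self, n_mels=80, n_freqs=201):
        self.sampling_rate = 16000
        # Placeholder values; a real filter bank would hold computed mel weights.
        self.mel_filters = [[0.0] * n_freqs for _ in range(n_mels)]

    def __repr__(self):
        # Work on a shallow copy so the instance itself is left untouched.
        attrs = dict(self.__dict__)
        filters = attrs["mel_filters"]
        # Swap the nested list for a one-line shape summary.
        attrs["mel_filters"] = f"<array of shape ({len(filters)}, {len(filters[0])})>"
        return f"{self.__class__.__name__} {json.dumps(attrs, indent=2)}"


summary = repr(ToyFeatureExtractor())
print(summary)
```

With this approach the filters are still stored on the instance; only the printed form is shortened.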

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR. @ArthurZucker

qmeeus avatar Jan 23 '23 18:01 qmeeus

The documentation is not available anymore as the PR was closed or merged.

Hey, thanks for the contribution! I agree with you: we should not save the filters, as they depend only on the parameters with which they were created. That is why I would be in favor of simply adding the following:

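    # Assumes `import copy` and `from typing import Any, Dict` at module level.
    def to_dict(self) -> Dict[str, Any]:
        """
        Serializes this instance to a Python dictionary.

        Returns:
            `Dict[str, Any]`: Dictionary of the attributes that make up this feature extractor instance, except `mel_filters`.
        """
        output = copy.deepcopy(self.__dict__)
        output["feature_extractor_type"] = self.__class__.__name__
        # The mel filters can be recomputed from the other attributes, so leave them out.
        if "mel_filters" in output:
            del output["mel_filters"]
        return output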
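To make the effect of this suggestion concrete, here is a small self-contained sketch. Everything in it is illustrative: `ToyFeatureExtractor`, `n_mels`, `n_freqs`, and the zero-filled filter bank are hypothetical, and the sketch assumes the printed representation is built from `to_dict()`, which the toy `__repr__` below models explicitly.

```python
import copy
import json
from typing import Any, Dict


class ToyFeatureExtractor:
    """Hypothetical feature extractor whose filters derive from its parameters."""

    def __init__(self, n_mels: int = 80, n_freqs: int = 201):
        self.n_mels = n_mels
        self.n_freqs = n_freqs
        # Derived purely from the parameters above (placeholder values here).
        self.mel_filters = [[0.0] * n_freqs for _ in range(n_mels)]

    def to_dict(self) -> Dict[str, Any]:
        output = copy.deepcopy(self.__dict__)
        output["feature_extractor_type"] = self.__class__.__name__
        # pop() with a default behaves like the "if ... in" / "del" pair.
        output.pop("mel_filters", None)
        return output

    def __repr__(self) -> str:
        # Assumption for this sketch: the repr is built from to_dict().
        return f"{self.__class__.__name__} {json.dumps(self.to_dict(), indent=2)}"


extractor = ToyFeatureExtractor()
serialized = extractor.to_dict()

# The filters are gone from the serialized form but can be rebuilt from the parameters.
rebuilt = ToyFeatureExtractor(n_mels=serialized["n_mels"], n_freqs=serialized["n_freqs"])
print(extractor)
```

Because the filters are dropped at serialization time, both the saved output and the printed form stay short, while the in-memory instance still carries `mel_filters`.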

Also cc @sanchit-gandhi, this seems very logical to me.

ArthurZucker avatar Feb 02 '23 15:02 ArthurZucker

Yes indeed, I think this solution is better

qmeeus avatar Feb 06 '23 18:02 qmeeus

For the remaining failing test, I suggest you rebase on main 😉

ArthurZucker avatar Feb 07 '23 07:02 ArthurZucker

You can also modify the test to make the CI go green 😉

ArthurZucker avatar Feb 09 '23 07:02 ArthurZucker