Whisper: fix asr pipeline with seq2seq assistant model

Open gante opened this issue 1 month ago • 1 comment

What does this PR do?

Fixes #29869 Fixes #30407 Fixes #30611 (related PR: #30637)

The ASR pipeline was preparing the encoder outputs before calling generate (and not passing input_features). That is not needed: generate performs the exact same preparation internally, since encoder outputs are a hard requirement for generating with encoder-decoder models.

However, by not passing input_features, the pipeline was blocking the proper use of encoder-decoder assistants whose encoder output shape (= decoder input shape) differs from the main model's. Such assistants need the raw inputs to run their own encoding step.
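
To make the failure mode concrete, here is a toy sketch in plain Python. It does not use the transformers API: ToyEncoderDecoder, prepare_encoder_outputs and toy_generate are hypothetical names invented for this illustration, and the "encoder" simply repeats each input value hidden_size times. It only assumes what is described above: each model can either encode raw input_features itself or accept precomputed encoder outputs of its own width.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Features = List[float]
EncoderStates = List[List[float]]


@dataclass
class ToyEncoderDecoder:
    """Hypothetical stand-in for an encoder-decoder model (not the transformers API)."""

    name: str
    hidden_size: int

    def encode(self, input_features: Features) -> EncoderStates:
        # One hidden vector of width `hidden_size` per input frame.
        return [[x] * self.hidden_size for x in input_features]

    def prepare_encoder_outputs(
        self,
        input_features: Optional[Features] = None,
        encoder_outputs: Optional[EncoderStates] = None,
    ) -> EncoderStates:
        # Precomputed states are only usable if they match this model's width.
        if encoder_outputs is not None:
            width = len(encoder_outputs[0])
            if width != self.hidden_size:
                raise ValueError(
                    f"{self.name}: got encoder width {width}, expected {self.hidden_size}"
                )
            return encoder_outputs
        if input_features is None:
            raise ValueError(f"{self.name}: need input_features or encoder_outputs")
        return self.encode(input_features)


def toy_generate(
    model: ToyEncoderDecoder,
    assistant: ToyEncoderDecoder,
    input_features: Optional[Features] = None,
    encoder_outputs: Optional[EncoderStates] = None,
) -> Tuple[int, int]:
    """Return the encoder widths seen by the main model and by the assistant."""
    main_states = model.prepare_encoder_outputs(input_features, encoder_outputs)
    assistant_states = assistant.prepare_encoder_outputs(input_features, encoder_outputs)
    return len(main_states[0]), len(assistant_states[0])


if __name__ == "__main__":
    main_model = ToyEncoderDecoder("main", hidden_size=4)
    assistant_model = ToyEncoderDecoder("assistant", hidden_size=2)
    features = [0.1, 0.2, 0.3]

    # Passing the raw inputs: each model runs its own encoding step.
    print(toy_generate(main_model, assistant_model, input_features=features))

    # Passing only the main model's encoder outputs: the assistant cannot use them.
    try:
        toy_generate(
            main_model, assistant_model, encoder_outputs=main_model.encode(features)
        )
    except ValueError as err:
        print("assistant failed:", err)
```

In this toy setup, passing the raw features lets both models encode at their own width, while passing only the main model's encoder outputs leaves the assistant with states it cannot consume.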

Unlike #30637, this fix reduces the complexity of our codebase 👼

May 09 '24 12:05 gante

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

May 09 '24 12:05 HuggingFaceDocBuilderDev
