transformers icon indicating copy to clipboard operation
transformers copied to clipboard

Whisper: fix asr pipeline with seq2seq assistant model

Open gante opened this issue 1 month ago • 1 comments

What does this PR do?

Fixes #29869 Fixes #30407 Fixes #30611 (related PR: #30637)

The ASR pipeline was preparing the encoder outputs before generate (and not passing input_features), but that's not needed: the exact same preparation is done inside generate, as it is a hard requirement to generate with encoder-decoder models.

However, by not passing input_features, it was blocking the proper use of encoder-decoder assistants that used a different encoder output shape (= decoder input shape), and thus needed the inputs to run their own encoding step.

Contrarily to #30637, this fix relies on lowering the complexity of our codebase 👼

gante avatar May 09 '24 12:05 gante

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.