Whisper: fix asr pipeline with seq2seq assistant model
What does this PR do?
Fixes #29869 Fixes #30407 Fixes #30611 (related PR: #30637)
The ASR pipeline was preparing the encoder outputs before calling `generate` (and not passing `input_features`), but that is not needed: the exact same preparation is done inside `generate`, as it is a hard requirement for generating with encoder-decoder models.

However, by not passing `input_features`, the pipeline was blocking the proper use of encoder-decoder assistant models that have a different encoder output shape (= decoder input shape) and therefore need the raw inputs to run their own encoding step.
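To illustrate why the assistant model needs `input_features` rather than the main model's precomputed encoder outputs, here is a minimal sketch with toy stand-in encoders (the hidden sizes and shapes are illustrative, not the actual Whisper configs):

```python
import numpy as np

def toy_encoder(features, d_model):
    """Stand-in encoder: maps log-mel features (batch, mel_bins, frames)
    to hidden states (batch, seq_len, d_model)."""
    batch, _, frames = features.shape
    return np.zeros((batch, frames // 2, d_model))

# Log-mel features, as the ASR pipeline now passes them to generate().
input_features = np.zeros((1, 80, 3000))

# Main model and assistant use different hidden sizes (illustrative values),
# so their encoder outputs have different shapes.
main_encoder_outputs = toy_encoder(input_features, d_model=1280)
assistant_encoder_outputs = toy_encoder(input_features, d_model=768)

# The main model's encoder outputs cannot feed the assistant's decoder:
# the last dimension differs, so the assistant must re-encode from
# input_features itself -- which is why the pipeline must pass them along.
assert main_encoder_outputs.shape != assistant_encoder_outputs.shape
```

With `input_features` forwarded, `generate` can run each model's own encoding step internally, instead of the pipeline forcing a single precomputed `encoder_outputs` on both models.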
Contrary to #30637, this fix works by lowering the complexity of our codebase 👼