
Whisper fine-tuning - which layers are trained?

Open chungvle opened this issue 1 year ago • 0 comments

Thanks for the detailed blog post on "Fine-Tune Whisper For Multilingual ASR with 🤗 Transformers". After going through the article and also creating a fine-tuned model for my own application, I have the following questions and hope someone can help:

  1. When using 🤗 Trainer with the Seq2SeqTrainingArguments, which layer(s) are trained?
  • only the linear output layer
  • last two layers + last transformer block
  • all layers
  2. Is it possible to specify which layers to train and which to freeze? Some code samples would be appreciated.
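
For question 2, this is roughly the kind of thing I have in mind (a minimal sketch using plain PyTorch `requires_grad` flags; I am assuming the standard `WhisperForConditionalGeneration` attribute names such as `model.model.decoder.layers` and `model.proj_out`, and I have not verified this is the recommended approach):

```python
import torch
from transformers import WhisperForConditionalGeneration

# Assumption: whisper-small, but any checkpoint size should work the same way.
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")

# Freeze every parameter first.
for param in model.parameters():
    param.requires_grad = False

# Then unfreeze only the last decoder block ...
for param in model.model.decoder.layers[-1].parameters():
    param.requires_grad = True

# ... and the linear output projection (note: its weights are tied to the
# decoder token embeddings, so unfreezing it also affects the embeddings).
for param in model.proj_out.parameters():
    param.requires_grad = True

# Sanity check before handing the model to Seq2SeqTrainer: how many
# parameters will actually be updated?
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {trainable:,} / {total:,}")
```

Would something like this interact correctly with `Seq2SeqTrainer`, or is there a preferred way to control which layers are fine-tuned?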

chungvle · Jun 12 '24 21:06