Whisper fine-tuning - which layers are trained?
Thanks for the detailed blog post on "Fine-Tune Whisper For Multilingual ASR with 🤗 Transformers". After going through the article and also creating a fine-tuned model for my own application, I have the following questions and hope someone can help:
- When using the 🤗 Trainer with Seq2SeqTrainingArguments, which layer(s) are trained (see the inspection sketch after this list)? Is it:
- only the linear output layer
- last two layers + last transformer block
- all layers
- Is it possible to specify which layers to train and which to freeze? Some code samples would be appreciated; a rough sketch of the approach I have in mind is included below.
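
For context on the first question, this is how I have been checking which parameters are marked as trainable. It is only a small sketch assuming the `openai/whisper-small` checkpoint used in the blog post; as far as I understand, every parameter with `requires_grad=True` gets updated by the Trainer:

```python
from transformers import WhisperForConditionalGeneration

model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")

# Every parameter with requires_grad=True is updated during training.
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable parameters: {trainable:,} of {total:,}")

# Check whether the encoder and decoder parameters are currently trainable.
print("encoder trainable:", any(p.requires_grad for p in model.model.encoder.parameters()))
print("decoder trainable:", any(p.requires_grad for p in model.model.decoder.parameters()))
```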
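
And for the second question, this is roughly what I had in mind for freezing parts of the model before handing it to the Seq2SeqTrainer from the blog post. It is only a sketch of the plain `requires_grad` approach (module names taken from `WhisperForConditionalGeneration`); please correct me if there is a more idiomatic way, for example a built-in helper such as `freeze_encoder()`:

```python
from transformers import WhisperForConditionalGeneration

model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")

# Freeze the whole encoder so only the decoder is updated.
for param in model.model.encoder.parameters():
    param.requires_grad = False

# Freeze all but the last two decoder blocks.
for layer in model.model.decoder.layers[:-2]:
    for param in layer.parameters():
        param.requires_grad = False

# The partially frozen model would then be passed to Seq2SeqTrainer exactly
# as in the blog post; frozen parameters receive no gradient updates.
```

If there is a cleaner way to do this through the Trainer or the model API itself, that would be good to know.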