llama-recipes
Llama 3.2 Vision Models Fine-Tuning Recipe
🚀 The feature, motivation and pitch
I noticed that Section 7.5.2 of the original paper, "The Llama 3 Herd of Models", states that during vision model SFT only the vision encoder and image adapter weights are updated, while the LLM weights remain frozen.
However, the fine-tuning recipe you provided for vision models appears to tune all of the LLM weights as well. Is this an oversight, or are you working on updating the training script so that it tunes only the vision encoder and image adapter?
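To make the request concrete, here is a minimal sketch of the kind of selective freezing I have in mind. It is not the recipe's actual code. The name fragments `vision_model`, `multi_modal_projector`, and `cross_attn` are my assumptions about how the parameters might be named and should be checked against the output of `model.named_parameters()` for the real checkpoint. The helper `freeze_except` and the `ToyModel` stand-in are hypothetical and exist only so the example runs without PyTorch. The helper relies solely on `named_parameters()` and the `requires_grad` attribute, so it should also apply to a `torch.nn.Module`.

```python
from types import SimpleNamespace

# Assumed name fragments for the vision encoder and image adapter.
# These are guesses; verify them against model.named_parameters().
TRAINABLE_KEYWORDS = ("vision_model", "multi_modal_projector", "cross_attn")


def freeze_except(model, trainable_keywords=TRAINABLE_KEYWORDS):
    """Leave only parameters whose name contains a keyword trainable.

    Every other parameter gets requires_grad = False.
    Returns the list of parameter names that remain trainable.
    """
    trainable = []
    for name, param in model.named_parameters():
        param.requires_grad = any(key in name for key in trainable_keywords)
        if param.requires_grad:
            trainable.append(name)
    return trainable


class ToyModel:
    """Hypothetical stand-in for a real model (no PyTorch needed)."""

    def __init__(self, names):
        self._params = {n: SimpleNamespace(requires_grad=True) for n in names}

    def named_parameters(self):
        return self._params.items()


toy = ToyModel([
    "vision_model.patch_embedding.weight",
    "multi_modal_projector.weight",
    "language_model.layers.0.self_attn.q_proj.weight",
    "language_model.layers.3.cross_attn.q_proj.weight",
])
trainable_names = freeze_except(toy)
for name in trainable_names:
    print("trainable:", name)
```

Matching on substrings rather than top-level prefixes is a deliberate choice here: if the image adapter's cross-attention layers are interleaved inside the language model, a prefix-only rule would freeze them along with the LLM weights.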
Alternatives
No response
Additional context
No response