Process got stuck when trying to optimize different groups of parameters using different types of data

Open Yangyi-Chen opened this issue 5 months ago • 3 comments

Hi,

I'm adding a new linear projection layer (`nn.Linear`) to the original Llama-3 architecture to process a new type of data. During training, I use two types of data: language-only and multimodal. With language-only data, all Llama-3 parameters are fine-tuned. With multimodal data, all Llama-3 parameters plus the parameters of the added linear layer are fine-tuned. Each setup trains fine on its own.

However, when I combine the two types of data for multi-task learning, the process hangs without printing any further information. Does torchtitan currently not support this kind of setup? Thanks.
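A minimal sketch of the setup described above, in plain PyTorch on CPU. Everything here is an assumption for illustration: `ToyMultimodalLM`, the layer sizes, and the `feats` argument are hypothetical stand-ins, not torchtitan code or the actual model. The language-only branch shows one commonly used precaution in sharded data-parallel training in general: run the extra layer on a dummy input with a zero-weighted contribution, so every rank executes the same modules in every step. Whether that is the cause of the hang here is not confirmed.

```python
import torch
import torch.nn as nn


class ToyMultimodalLM(nn.Module):
    """Hypothetical stand-in: a tiny 'backbone' plus an added projection layer."""

    def __init__(self, vocab_size=32, dim=16, feat_dim=8):
        super().__init__()
        self.feat_dim = feat_dim
        self.embed = nn.Embedding(vocab_size, dim)
        self.backbone = nn.Linear(dim, dim)         # stands in for the Llama-3 layers
        self.extra_proj = nn.Linear(feat_dim, dim)  # the newly added nn.Linear
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, tokens, feats=None):
        h = self.embed(tokens)
        if feats is None:
            # Language-only batch: still call extra_proj on a dummy input,
            # scaled by 0.0 so it does not change the hidden states.
            dummy = h.new_zeros((*tokens.shape, self.feat_dim))
            h = h + 0.0 * self.extra_proj(dummy)
        else:
            # Multimodal batch: the projection contributes normally.
            h = h + self.extra_proj(feats)
        return self.head(torch.relu(self.backbone(h)))


torch.manual_seed(0)
model = ToyMultimodalLM()
tokens = torch.randint(0, 32, (2, 5))
feats = torch.randn(2, 5, 8)

# Alternate between a language-only batch and a multimodal batch.
for batch_feats in (None, feats):
    model.zero_grad()
    logits = model(tokens, batch_feats)
    logits.sum().backward()
    # Every parameter takes part in backward for both batch types.
    assert all(p.grad is not None for p in model.parameters())
```

With this pattern the added layer receives an all-zero gradient on language-only batches instead of no gradient at all, so both batch types touch the same set of parameters.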

Yangyi-Chen avatar Sep 18 '24 22:09 Yangyi-Chen