torchtitan Process got stuck when trying to optimize different groups of parameters using different types of data

Open Yangyi-Chen opened this issue 5 months ago • 3 comments

Hi,

I'm adding a new linear projection layer (`nn.Linear`) to the original Llama-3 architecture to process a new type of data. During training, I use two types of data: language-only and multimodal. With language-only data, all of the Llama-3 parameters are fine-tuned. With multimodal data, all of the Llama-3 parameters and the parameters of the added linear layer are fine-tuned. Each setting works well on its own.

However, when I combine the two types of data for multi-task learning, the process hangs without any error message or further output. Does torchtitan currently support this kind of training? Thanks.
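For illustration, here is a minimal single-process sketch of the setup described in this post. It is not the actual training code: `ToyModelWithProjection`, `extra_proj`, `grads_after_step`, and all sizes are hypothetical stand-ins, and a single `nn.Linear` takes the place of the Llama-3 layers. It only shows that the added layer receives gradients on multimodal batches and none on language-only batches.

```python
import torch
import torch.nn as nn


class ToyModelWithProjection(nn.Module):
    """Hypothetical stand-in for Llama-3 plus one added nn.Linear."""

    def __init__(self, vocab_size=32, dim=16, feature_dim=8):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.backbone = nn.Linear(dim, dim)  # placeholder for the Llama-3 layers
        self.extra_proj = nn.Linear(feature_dim, dim)  # the newly added projection
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, tokens, features=None):
        h = self.embed(tokens)
        if features is not None:
            # Multimodal batch: the added layer takes part in the forward pass.
            h = h + self.extra_proj(features)
        # Language-only batch: extra_proj is skipped entirely.
        return self.head(self.backbone(h))


def grads_after_step(model, tokens, features=None):
    """Run one forward/backward pass; return names of parameters with a gradient."""
    model.zero_grad(set_to_none=True)
    logits = model(tokens, features)
    loss = nn.functional.cross_entropy(logits.flatten(0, 1), tokens.flatten())
    loss.backward()
    return {name for name, p in model.named_parameters() if p.grad is not None}


torch.manual_seed(0)
model = ToyModelWithProjection()
tokens = torch.randint(0, 32, (2, 5))   # (batch, sequence)
features = torch.randn(2, 5, 8)         # (batch, sequence, feature_dim)

text_only = grads_after_step(model, tokens)
multimodal = grads_after_step(model, tokens, features)

# Parameters that are trained only on multimodal batches.
print(sorted(multimodal - text_only))
```

In a single process both branches run without problems, which matches the observation that each data type works on its own. The sketch does not reproduce the distributed hang itself.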

Sep 18 '24 22:09 Yangyi-Chen
