Dhruv Jain
@crossxxd Can you share your training code for the Mistral 7B base model? I've been able to get the Llama model training, however the training is very...
@maohaos2 Can you share your code?
@yuxiang-guo Yeah, I did. But I ran into a tensor-related error during the forward pass. I tried debugging it but couldn't work out exactly what the problem was.
Hi @BenjaminBossan, here's the config file:

```yaml
model:
  # paths
  llm_path: "google/gemma-3-4b-it"

  # LoRA
  lora: True
  lora_rank: 8
  lora_alpha: 16
  lora_dropout: 0.05
  target_modules: ["q_proj", "v_proj", "up_proj", "down_proj"]

  max_seq_len: 4096
  end_sym: ...
```
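For reference, here's a minimal sketch (my own, not the project's actual training code) of how the LoRA fields above would typically be applied with PEFT; the loading class is an assumption and may differ for the multimodal Gemma 3 checkpoint:

```python
# Sketch only: maps the config values above onto a peft.LoraConfig.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

llm_path = "google/gemma-3-4b-it"  # llm_path from the config

lora_config = LoraConfig(
    r=8,                        # lora_rank
    lora_alpha=16,              # lora_alpha
    lora_dropout=0.05,          # lora_dropout
    target_modules=["q_proj", "v_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

# Note: gemma-3-4b-it is multimodal, so the exact model class may differ;
# AutoModelForCausalLM is used here just to illustrate the LoRA wrapping.
model = AutoModelForCausalLM.from_pretrained(llm_path)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # sanity check: only LoRA params should be trainable
```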
@BenjaminBossan Thanks for this, I'll try reproducing it and see if there's any difference.