Help Needed: Fine-tuning codellama-7b-Instruct Model for the Pine Script Programming Language
I'm having a problem fine-tuning the codellama-7b-Instruct model for a programming language (Pine Script). The model seems to focus too heavily on the new dataset: it loses its general conversational behaviour, and it isn't just overfitting, because its performance on unseen Pine Script tasks isn't great either.
For example:
Base model
User: Hey There! How are you
Model: I am good. How can I help ?
Finetuned model
User: Hey There! How are you
Model: Yes, I can fix your pinescript code. Provide me your issue?
I've tried increasing the number of training epochs to make sure the model learns properly, and I prepared my dataset carefully according to CodeLlama's requirements. I used LoRA with PEFT for fine-tuning. My dataset has 60,000 chat examples, each with a 1,000-token context, and to make the model more robust I overlapped the examples by 25%.
Here are my training settings (a rough config sketch follows the list):
Epochs: 15
Batch Size: 6
Gradient Accumulation Steps: 2
Learning Rate: 4e-4
Warmup Ratio: 0.05
LoRA r: 32
LoRA alpha: 32
LoRA dropout: 0.05
Target Modules: ["q_proj", "v_proj", "k_proj", "o_proj", "gate_proj", "up_proj", "down_proj", "lm_head"] (all linear layers are trainable)
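For reference, here is roughly how those settings map onto `peft.LoraConfig` and `transformers.TrainingArguments` in my setup (a simplified sketch; model, tokenizer, and dataset loading are omitted, and `output_dir` is just a placeholder):

```python
# Simplified sketch of the fine-tuning configuration described above.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=32,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "v_proj", "k_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj", "lm_head",
    ],
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="codellama-7b-pinescript-lora",  # placeholder path
    num_train_epochs=15,
    per_device_train_batch_size=6,
    gradient_accumulation_steps=2,
    learning_rate=4e-4,
    warmup_ratio=0.05,
)
```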
Is there a way to merge only some of the adapter layers into the base model and check the resulting performance, rather than merging all of them? I don't want to fine-tune again, since a full run takes 10-15 days. Something like the sketch below is what I have in mind.
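This is untested and assumes the usual PEFT LoRA internals (per-module `lora_B` matrices kept in a dict keyed by adapter name); the adapter path and the list of modules to skip are placeholders:

```python
# Sketch: merge only a subset of the LoRA modules by zeroing the lora_B
# weights of the modules I want to leave out, then calling merge_and_unload().
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLlama-7b-Instruct-hf", torch_dtype=torch.float16
)
model = PeftModel.from_pretrained(base, "path/to/my-adapter")  # placeholder path

# Modules whose adapter contribution I want to drop before merging.
skip = ("lm_head", "gate_proj", "up_proj", "down_proj")

with torch.no_grad():
    for name, module in model.named_modules():
        # LoRA layers store their B matrices in a dict keyed by adapter name
        # ("default" unless renamed); zeroing B removes that module's low-rank
        # update without touching the base weights.
        if hasattr(module, "lora_B") and any(s in name for s in skip):
            for adapter_name in module.lora_B:
                module.lora_B[adapter_name].weight.zero_()

merged = model.merge_and_unload()  # base weights + remaining adapter updates
```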
Any help or suggestions would be highly appreciated.
@Humza1996 were you able to find a solution to this?