Help Needed: Fine-tuning codellama-7b-Instruct Model for the Pine Script Programming Language
I'm having a problem fine-tuning the codellama-7b-Instruct model for a programming language (Pine Script). The model seems to focus too heavily on the new dataset: it loses its general conversational behaviour, and it isn't just overfitting, because its performance on unseen Pine Script tasks isn't great either.
For example:
Base model
User: Hey There! How are you
Model: I am good. How can I help ?
Finetuned model
User: Hey There! How are you
Model: Yes, I can fix your pinescript code. Provide me your issue?
I've tried increasing the number of training epochs to make sure the model learns properly, and I prepared my dataset carefully according to CodeLlama's requirements. I used LoRA with PEFT for fine-tuning. My dataset has 60,000 chat examples, each with a 1,000-token context, and to make the model more robust I overlapped the examples by 25%.
Here are my training settings (a rough config sketch follows the list):
Epochs: 15
Batch Size: 6
Gradient Accumulation Steps: 2
Learning Rate: 4e-4
Warmup Ratio: 0.05
LoRA r: 32
LoRA alpha: 32
LoRA dropout: 0.05
Target Modules: ["q_proj", "v_proj", "k_proj", "o_proj", "gate_proj", "up_proj", "down_proj", "lm_head"] (all linear layers are trainable)
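For reference, here is roughly how those settings map onto `peft.LoraConfig` and `transformers.TrainingArguments` in my setup (a simplified sketch; model, tokenizer, and dataset loading are omitted, and `output_dir` is just a placeholder):

```python
# Simplified sketch of the fine-tuning configuration described above.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=32,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "v_proj", "k_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj", "lm_head",
    ],
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="codellama-7b-pinescript-lora",  # placeholder path
    num_train_epochs=15,
    per_device_train_batch_size=6,
    gradient_accumulation_steps=2,
    learning_rate=4e-4,
    warmup_ratio=0.05,
)
```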
Is there a way to merge only some of the adapter layers into the base model and check the resulting performance, rather than merging all of them? I don't want to fine-tune again, since a full run takes 10-15 days. Something like the sketch below is what I have in mind.
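This is untested and assumes the usual PEFT LoRA internals (per-module `lora_B` matrices kept in a dict keyed by adapter name); the adapter path and the list of modules to skip are placeholders:

```python
# Sketch: merge only a subset of the LoRA modules by zeroing the lora_B
# weights of the modules I want to leave out, then calling merge_and_unload().
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLlama-7b-Instruct-hf", torch_dtype=torch.float16
)
model = PeftModel.from_pretrained(base, "path/to/my-adapter")  # placeholder path

# Modules whose adapter contribution I want to drop before merging.
skip = ("lm_head", "gate_proj", "up_proj", "down_proj")

with torch.no_grad():
    for name, module in model.named_modules():
        # LoRA layers store their B matrices in a dict keyed by adapter name
        # ("default" unless renamed); zeroing B removes that module's low-rank
        # update without touching the base weights.
        if hasattr(module, "lora_B") and any(s in name for s in skip):
            for adapter_name in module.lora_B:
                module.lora_B[adapter_name].weight.zero_()

merged = model.merge_and_unload()  # base weights + remaining adapter updates
```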
Any help or suggestions would be highly appreciated.
@Humza1996 were you able to find a solution to this?