Consistency_LLM Model collapse

Open TrueNobility303 opened this issue 9 months ago • 5 comments

Hi!

Dear authors,

Thanks for your excellent work! I am training the CLLM model on GSM8k with Abel-7B-001 as the teacher model, using the cleaned_gsm8k_jacobi dataset you provided on Hugging Face, and running train_cllm.sh with n_token_seq_size=16. Training is now about 1/5 complete, but the checkpoint does not look good. The model seems to have collapsed into outputting the same tokens over and over, as shown in the following picture.
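
In case it helps to put a number on the symptom, here is a minimal sketch of how the repetition could be measured from a list of generated token ids. The function name `repetition_stats` and the example ids are made up for illustration; none of this comes from the Consistency_LLM repository.

```python
from collections import Counter
from itertools import groupby


def repetition_stats(token_ids):
    """Return (share of the most frequent token, longest run of one repeated token)."""
    if not token_ids:
        return 0.0, 0
    # Count of the single most common token id in the sequence.
    top_count = Counter(token_ids).most_common(1)[0][1]
    # groupby clusters consecutive equal ids, so the largest cluster is the longest run.
    longest_run = max(len(list(group)) for _, group in groupby(token_ids))
    return top_count / len(token_ids), longest_run


# Hypothetical token ids: one varied sequence and one that degenerates into a single token.
healthy = [12, 7, 99, 7, 3, 41, 8, 15]
collapsed = [12, 7, 5, 5, 5, 5, 5, 5]

print(repetition_stats(healthy))
print(repetition_stats(collapsed))
```

A top-token share close to 1.0, or a longest run close to the generation length, would match what the picture shows. Comparing these numbers across checkpoints might show when the collapse starts.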

Is this phenomenon normal? Did I use the training scripts correctly?

I would greatly appreciate it if you could help me.

Best regards.

May 22 '24 08:05 TrueNobility303
