Cynthia Chen
I encountered the same error when I was converting the Llama 2 model. Using ``transformers==4.38`` solved this problem.
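For anyone else hitting this, a quick sanity check (just a sketch; 4.38 is simply the release that worked here) to confirm the environment actually picked up the pinned version before re-running the conversion:

```python
# Verify the pinned transformers release is the one being imported
import transformers

print(transformers.__version__)  # expect something like "4.38.x"
assert transformers.__version__.startswith("4.38"), (
    f"unexpected transformers version: {transformers.__version__}"
)
```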
Hi, thanks a lot for your interest in our work! For running imitation learning without any pretraining / joint training, here is one example: ``` CUDA_VISIBLE_DEVICES=0 xvfb-run -a python src/il_representations/scripts/pretrain_n_adapt.py...
The code looks good to me! One thing: when I try to plot the lr curve, it seems the linear scaling part's minimum lr is `eta_min` and the cosine part's...
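For context, here is a minimal sketch of how one might plot the lr curve to inspect where the minimum lands in each phase, assuming a `LinearLR` warmup chained into `CosineAnnealingLR` via `SequentialLR` (the step counts and lr values are placeholders, not the ones from this PR):

```python
# Sketch: plot a warmup + cosine learning-rate schedule to inspect its minima
import torch
import matplotlib.pyplot as plt

model = torch.nn.Linear(4, 4)  # dummy model so the optimizer has parameters
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

warmup_steps, total_steps, eta_min = 100, 1000, 1e-5
warmup = torch.optim.lr_scheduler.LinearLR(
    optimizer, start_factor=0.01, end_factor=1.0, total_iters=warmup_steps)
cosine = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=total_steps - warmup_steps, eta_min=eta_min)
scheduler = torch.optim.lr_scheduler.SequentialLR(
    optimizer, schedulers=[warmup, cosine], milestones=[warmup_steps])

lrs = []
for _ in range(total_steps):
    optimizer.step()  # stands in for a real training step
    lrs.append(optimizer.param_groups[0]["lr"])
    scheduler.step()

plt.plot(lrs)
plt.xlabel("step")
plt.ylabel("learning rate")
plt.title("linear warmup + cosine annealing")
plt.show()
```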