LetsGoFir

Results 19 comments of LetsGoFir

> I will for sure come back to let you know if I can solve it.

Hello, how's it going?

> What is the command you use? Could you also try cleaning your hugging face cache by `rm -rf ~/.cache/huggingface`?

Still a dimension mismatch, and this is my command @merrymercy ```...

> @LetsGoFir play around with gradient_accumulation_steps. Instead of 16, try smaller steps. Also see [issue 540](https://github.com/lm-sys/FastChat/issues/540)

I am trying to get the Vicuna weights, not fine-tuning.

> Try it this way: download llama-13b-hf from https://huggingface.co/decapoda-research/llama-13b-hf

That did not work for me.

> Hi,
>
> We train our image version model on 8 A100 GPUs, with 3 epochs in one day.
>
> We use the AdamW optimizer, with 1% training...

> We observed different tokenizer behavior on transformers>=4.34.0. This may affect model training and we are checking.

Then how do you resume, since the learning rate is not saved?