LetsGoFir comments


pretrain error

> I will for sure come back to let you know if I can solve it.

Hello, how's it going?

RuntimeError: The size of tensor a (32000) must match the size of tensor b (32001) at non-singleton dimension 0

Could you check this? @merrymercy

RuntimeError: The size of tensor a (32000) must match the size of tensor b (32001) at non-singleton dimension 0

Thanks for your reply! Let me try.

RuntimeError: The size of tensor a (32000) must match the size of tensor b (32001) at non-singleton dimension 0

> What is the command you use? Could you also try cleaning your hugging face cache by `rm -rf ~/.cache/huggingface`?

Still a dimension mismatch, and this is my command @merrymercy ```...

RuntimeError: The size of tensor a (32000) must match the size of tensor b (32001) at non-singleton dimension 0

> @LetsGoFir play around with the gradient_accumulation_steps. Instead of 16, try smaller steps. Also see [issue 540](https://github.com/lm-sys/FastChat/issues/540)

I am trying to get the Vicuna weights, not fine-tuning.

RuntimeError: The size of tensor a (32000) must match the size of tensor b (32001) at non-singleton dimension 0

> Try it this way: download llama-13b-hf from https://huggingface.co/decapoda-research/llama-13b-hf

This did not work for me.

Could you introduce more about model training?

> Hi,
>
> We train our image version model on the 8 A100 GPU, with 3 epochs in one day.
>
> We use AdamW optimizer, with 1% training...

Any plan to upgrade transformers?

> We observed different tokenizer behavior on transformer>=4.34.0. This may affect model training and we are checking.

Then how do you resume training, since the learning rate is not saved?

TypeError: ne() received an invalid combination of arguments - got (NoneType)

same problem