> Hi,
>
> I apologise for the delay. When I get some spare time after some current research projects I will upload this code.

Looking forward to the code...
No. We use xformers for training, and a naive implementation for inference.
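For context, a "naive implementation" of attention just materialises the full score matrix and applies a softmax, rather than using xformers' memory-efficient kernels. A minimal pure-Python sketch (illustrative only, not Baichuan2's actual inference code; function names are hypothetical):

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def naive_attention(Q, K, V):
    """Naive scaled dot-product attention over lists of row vectors.

    Computes softmax(Q K^T / sqrt(d)) V explicitly, which is O(n^2)
    in memory -- fine for inference, costly for long training batches.
    """
    d = len(Q[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out
```

Libraries like xformers compute the same result without ever storing the full `scores` matrix, which is why they are preferred for training.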
Can you provide code snippets to reproduce?
Can you provide the corresponding input sentence?
You need to manipulate NormHead.weight, following https://huggingface.co/docs/transformers/main_classes/model#transformers.PreTrainedModel.resize_token_embeddings. Just initialise a new parameter with the desired shape, say `new_param`, copy the old weights with `new_param[:origin_vocab_size] = normhead.weight`, then replace the NormHead weight...
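The copy-and-replace step above can be sketched as follows. This is a minimal pure-Python illustration of the pattern (the function name and the zero-init for new rows are assumptions for the example; a real resize would use the model's own init scheme and torch tensors):

```python
def resize_normhead_weight(old_weight, new_vocab_size, hidden_size):
    """Grow a (vocab_size, hidden_size) weight matrix to new_vocab_size rows.

    Rows for the original vocabulary are copied over unchanged;
    rows for newly added tokens are zero-initialised here purely
    for illustration.
    """
    new_weight = [[0.0] * hidden_size for _ in range(new_vocab_size)]
    for i, row in enumerate(old_weight):
        # new_param[:origin_vocab_size] = normhead.weight
        new_weight[i] = list(row)
    return new_weight
```

After building `new_weight`, you would wrap it as a parameter and assign it back in place of the old NormHead weight.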
@shyoulala @snowlixue @Ignoramus0817 refer to https://github.com/baichuan-inc/Baichuan2/issues/155
z-loss was adopted in our training, but it is not necessary, so we turned it off in the open-source code.
> > z-loss was adopted in our training, but it is not necessary, so we turned it off in the open-source code.

hi @mmmans, do you mean it's...
> @mmmans I have added thousands of new tokens and fine-tuned all parameters. Do I need to set z_loss_weight?

It depends on your own setting, actually. If your training...
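For reference, z-loss is an auxiliary term that penalises the squared log-partition of the softmax, discouraging the logits from drifting to large magnitudes during training. A minimal sketch (the function name and default weight are illustrative, not taken from Baichuan2's code):

```python
import math

def z_loss(logits, z_loss_weight=1e-4):
    """Auxiliary z-loss for one position: weight * log(Z)^2,
    where Z = sum(exp(logits)) is the softmax normaliser.

    Computed via a numerically stable log-sum-exp.
    """
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return z_loss_weight * log_z ** 2
```

In training this term is added to the cross-entropy loss; with the weight set to 0 it vanishes, which is effectively what turning it off in the open-source code amounts to.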
How long is your text?