
z_loss_weight defaults to 0, and the provided finetune example also sets it to 0. So is z-loss actually not used?

Open uygnef opened this issue 1 year ago • 8 comments

uygnef avatar Sep 20 '23 07:09 uygnef

z-loss was adopted in our training, but it is not necessary, so we turned it off in the open-source code.

mmmans avatar Sep 20 '23 11:09 mmmans
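For readers arriving here: z-loss is the auxiliary stabilization term popularized by the PaLM and ST-MoE papers; it penalizes the squared log of the softmax partition function so the logits do not grow unboundedly. Below is a minimal sketch of how a z_loss_weight coefficient typically enters the language-modeling loss. This is illustrative PyTorch code with made-up names, not the actual Baichuan2 training implementation; note that with z_loss_weight = 0 (the default in the finetune example) the extra term vanishes, i.e. z-loss is effectively disabled.

```python
import torch
import torch.nn.functional as F

def lm_loss_with_z_loss(logits, labels, z_loss_weight=0.0):
    """Cross-entropy plus an optional z-loss term.

    logits: [batch, seq, vocab]; labels: [batch, seq], already shifted,
    with -100 marking positions to ignore.
    """
    vocab_size = logits.size(-1)
    flat_logits = logits.float().view(-1, vocab_size)
    flat_labels = labels.view(-1)

    # Standard next-token cross-entropy.
    ce = F.cross_entropy(flat_logits, flat_labels, ignore_index=-100)

    # z-loss: mean of log(Z)^2 over non-ignored positions, where
    # Z is the softmax partition function (logsumexp of the logits).
    log_z = torch.logsumexp(flat_logits, dim=-1)
    mask = (flat_labels != -100).float()
    z_loss = (log_z.pow(2) * mask).sum() / mask.sum().clamp(min=1.0)

    # With z_loss_weight == 0 the second term contributes nothing.
    return ce + z_loss_weight * z_loss
```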

z-loss was adopted in our training, but it is not necessary, so we turned it off in the open-source code.

hi @mmmans, do you mean it's unnecessary at the finetune stage?

uygnef avatar Sep 20 '23 11:09 uygnef

z-loss was adopted in our training, but it is not necessary, so we turned it off in the open-source code.

hi @mmmans, do you mean it's unnecessary at the finetune stage?

Not necessary. It depends on your setting, actually.

mmmans avatar Sep 20 '23 11:09 mmmans

oh, I see. thx a lot

uygnef avatar Sep 20 '23 11:09 uygnef

@mmmans I have added thousands of new tokens and am doing full-parameter finetuning. Do I need to set z_loss_weight?

felixfuu avatar Dec 30 '23 01:12 felixfuu

@mmmans I have added thousands of new tokens and am doing full-parameter finetuning. Do I need to set z_loss_weight?

It depends on your own setting, actually. If your training does not exhibit instability, there is no need to set z_loss.

mmmans avatar Dec 30 '23 01:12 mmmans
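For reference, the PaLM paper reports using a z-loss coefficient of about 1e-4, so if instability does appear, a small value on that order is a common starting point (a general observation, not a Baichuan2-specific recommendation).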

@mmmans thx~

felixfuu avatar Dec 30 '23 02:12 felixfuu

@mmmans The loss stays around 6.x and does not converge. Should I set z_loss_weight?

felixfuu avatar Dec 30 '23 03:12 felixfuu