kevinpro
> @Ricardokevins Oh nice that you fixed it! Can I ask for some advice since I'm still facing the issue:
>
> * What do you mean by "replaced the...
> Thanks @Ricardokevins! Btw I've fixed the issue by setting the number of threads used for intraop parallelism to 1:
>
> ```
> torch.set_num_threads(1)
> ```
>
> This...
Sounds like there's no EOS token between packed sequences.
Hi, thank you for your great work! May I ask how much VRAM is needed? I tried with 8*40G, but failed with OOM.
> > Hi, thank you for your great work! May I ask how much VRAM is needed? I tried with 8*40G, but failed with OOM.
>
> 8x80G, 8*40G only...
I'm really curious about how well it performs... someone will probably run evaluations soon.
> I've looked everywhere; where does it say 16B? I can't read the Chinese in the README.
Any update on this issue? @hiyouga
Same issue here with ZeRO-3 training. What should I do to solve it? @hijkzzz
https://github.com/facebookresearch/llama/blob/6796a91789335a31c8309003339fe44e2fd345c2/llama/model.py#L348

```
def forward(self, x):
    return self.w2(F.silu(self.w1(x)) * self.w3(x))
```
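For reference, the linked `forward` is LLaMA's SwiGLU feed-forward block: a SiLU-gated branch (`w1`) elementwise-multiplied with a linear branch (`w3`), then projected back down by `w2`. A minimal NumPy sketch of the same computation (the weight names mirror the linked code, but the matrix shapes and helper names here are my own illustrative choices, not taken from the repo):

```python
import numpy as np

def silu(x):
    # SiLU / swish activation: x * sigmoid(x), written in a numerically direct form
    return x / (1.0 + np.exp(-x))

def swiglu_ffn(x, w1, w2, w3):
    # Same structure as the linked forward: w2(silu(w1(x)) * w3(x)),
    # with the linear layers applied as bias-free matrix products.
    return (silu(x @ w1) * (x @ w3)) @ w2

rng = np.random.default_rng(0)
d_model, d_hidden = 4, 8          # toy sizes; the real model uses much larger dims
x = rng.standard_normal((2, d_model))
w1 = rng.standard_normal((d_model, d_hidden))   # gate projection
w3 = rng.standard_normal((d_model, d_hidden))   # value projection
w2 = rng.standard_normal((d_hidden, d_model))   # down projection
y = swiglu_ffn(x, w1, w2, w3)
print(y.shape)  # (2, 4)
```

The gating is why there are three weight matrices instead of the two in a classic MLP: `w1`/`w3` both expand to the hidden size, and only their product is projected back.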