Tianhao Gao

18 comments by Tianhao Gao

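> > > I get the same error when fine-tuning bloomz; alpaca-lora, alpaca-CoT, and BELLE all raise it. Could some kind soul tell me how to fix it? > > > > > > Solved: I was using a P40 before, and after switching to an A100 it works. > > Where did the A100 come from? As a kid from a poor family, I'm very envious. Haha, they all belong to the company; we set up 8 A100s for testing.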

Hi, did you hit the error "expected scalar type Half but found Float" after changing the model to bloomz?

> same error I have fixed this problem. Ref: https://github.com/microsoft/DeepSpeedExamples/issues/571

> > > @LiinXemmon Hi, this is caused by log(0), which will return `inf`. I think you should add a very small value (like 1e-7) to the difference of the two sentences' rewards, it...
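A minimal sketch of the log(0) issue described in the comment above, assuming the usual pairwise reward-model loss `-log(sigmoid(chosen - rejected))`. The comment is truncated, so where exactly the small value goes is an assumption: here it is added to the sigmoid output inside the log, which is one common reading, and a log-sigmoid form is shown as an alternative. All function names are hypothetical and plain Python floats stand in for tensors.

```python
import math

EPS = 1e-7  # small constant mentioned in the comment; its placement here is an assumption


def sigmoid(x: float) -> float:
    # Branch on the sign so math.exp never receives a large positive argument.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # underflows to 0.0 for very negative x
    return z / (1.0 + z)


def naive_pairwise_loss(chosen: float, rejected: float) -> float:
    p = sigmoid(chosen - rejected)
    # math.log(0.0) raises ValueError in plain Python; tensor libraries
    # typically return -inf instead, so the loss becomes inf. Mimic that here.
    return math.inf if p == 0.0 else -math.log(p)


def eps_pairwise_loss(chosen: float, rejected: float) -> float:
    # Keep the argument of log strictly positive by adding EPS.
    return -math.log(sigmoid(chosen - rejected) + EPS)


def stable_pairwise_loss(chosen: float, rejected: float) -> float:
    # -log(sigmoid(d)) == softplus(-d), computed without forming sigmoid(d).
    y = -(chosen - rejected)
    return max(y, 0.0) + math.log1p(math.exp(-abs(y)))


if __name__ == "__main__":
    for fn in (naive_pairwise_loss, eps_pairwise_loss, stable_pairwise_loss):
        print(fn.__name__, fn(0.0, 1000.0))
```

The epsilon variant caps the loss at about `-log(1e-7)`, so it stays finite but stops growing once the sigmoid saturates; the log-sigmoid form keeps growing with the reward gap. Which behavior is preferable for a given training setup is not something the thread settles.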

> According to readme, "We have found that it is very unstable to use different generation training batch sizes (--per_device_train_batch_size) and PPO training batch sizes (--per_device_mini_batch_size), more than one PPO...

> Hello, did you solve it? My average reward is still not increasing during training. > > > > According to readme, "We have found that it is very unstable...