Shu Wang

9 comments by Shu Wang

OK, I will change the license.

This is a separate script added to the ./chat/ folder; it won't affect the logic of the other scripts.

> How about your loss values? Do they become nan after step 1? I believe this might be related to the loss nan problem.

Same issue. Any suggestions on the fine-tuning learning rate or other settings to fix it?
iter 0 step 0: loss 1.7089, train time: 1409.17ms
iter 1 step 0: loss 2.1155,...
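When chasing a NaN loss like the one discussed here, it can help to find the exact iteration where the loss first stops being finite. The following is a minimal sketch, assuming the training log keeps the `iter N step M: loss X, ...` shape shown in the excerpt above. The function name `first_non_finite_loss` and the second sample line are hypothetical and not taken from the thread.

```python
import math
import re

# Matches lines shaped like the log excerpt above, e.g.
# "iter 0 step 0: loss 1.7089, train time: 1409.17ms"
LOG_LINE = re.compile(r"iter (\d+) step (\d+): loss ([^,\s]+)")


def first_non_finite_loss(lines):
    """Return (iter, step) of the first NaN/inf loss, or None if every loss is finite."""
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # not a loss line
        try:
            loss = float(match.group(3))  # float() also parses "nan" and "inf"
        except ValueError:
            continue  # loss field was not a number; skip it
        if not math.isfinite(loss):
            return int(match.group(1)), int(match.group(2))
    return None


# Hypothetical sample: the first line is from the comment above,
# the second is made up to illustrate a NaN loss.
sample = [
    "iter 0 step 0: loss 1.7089, train time: 1409.17ms",
    "iter 1 step 0: loss nan, train time: 1400.00ms",
]
print(first_non_finite_loss(sample))
```

Knowing the first bad step narrows down whether the loss diverges immediately (often a precision or learning-rate setting) or only after many updates.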

I still have this problem with the newest source code. I am using INCITE-3b with dolly (taking about 8+ GB) on one GPU (2080Ti) with 16-true precision.

I will try. Which hyperparameters are you using?

It might exceed the access window. You can forward this issue to [email protected] with your previous application email. Once this issue has been solved, please download the dataset as soon...