Yang Liu

Results: 1 issue by Yang Liu

@ConnorJL Thanks for the great work. Unfortunately, I found that my training on OpenWebTextCorpus is too slow, even for the 117M model. The cross-entropy loss decreases rapidly before 10k...