Yang Liu
@ConnorJL Thanks for the great work. Unfortunately, I found that my training on OpenWebTextCorpus is too slow, even for the 117M model. The cross-entropy loss decreases rapidly before 10k...