Results: 1 comment by xiao-xian
Many thanks @MarcusLoppe! I pulled your branch and used the above notebook to run the training. The training loss for the encoder is around 0.28. However, for the transformer, it never...