Results: 2 comments by JoyR
FYI, http://wortschatz.uni-leipzig.de/en/download still works. http://pcai056.informatik.uni-leipzig.de/downloads/corpora/zho_news_2007-2009_1M.tar.gz
> The Transformer uses layer normalization rather than batch normalization. Layer normalization does not need to consider batch information; see [Layer Normalization](https://arxiv.org/pdf/1607.06450.pdf), end of page 2. However, I doubt that...
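
To illustrate the quoted point about batch information, here is a minimal sketch in plain Python. It is not taken from any Transformer codebase: the helper names (`normalize`, `layer_norm`, `batch_norm`, `close`) and the toy data are assumptions for illustration, and the learnable gain and bias are left out. It checks whether the normalized output for one example changes when the other examples in the batch are removed.

```python
import math

def normalize(values, eps=1e-5):
    """Shift and scale a list of numbers to zero mean and (near) unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]

def layer_norm(batch):
    """Normalize each example over its own features (no batch statistics)."""
    return [normalize(row) for row in batch]

def batch_norm(batch):
    """Normalize each feature over the batch (training-mode statistics)."""
    columns = [normalize(list(col)) for col in zip(*batch)]
    return [list(row) for row in zip(*columns)]

def close(a, b, tol=1e-9):
    """Element-wise comparison of two equal-length lists."""
    return all(abs(x - y) <= tol for x, y in zip(a, b))

# Toy data (hypothetical): 3 examples with 4 features each.
batch = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 0.0, -1.0, 5.0],
    [0.5, 0.5, 3.0, -2.0],
]
single = [batch[0]]  # the first example on its own

# Does the first example's output depend on the rest of the batch?
ln_same = close(layer_norm(batch)[0], layer_norm(single)[0])
bn_same = close(batch_norm(batch)[0], batch_norm(single)[0])
print("layer norm unchanged:", ln_same)
print("batch norm unchanged:", bn_same)
```

In this sketch the layer-normalized row is identical with or without the other examples, while the batch-normalized row is not, because its mean and variance are computed across the batch.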