d2l-en
[suggestion of presentation] pre-norm transformers
The figure and description in transformer.ipynb follow the historical post-norm version. Since (almost?) all modern projects use pre-norm transformers, maybe the pre-norm version could be presented up front, rather than only in the ViT section, to give newbies (like me) a better first impression, with the historically significant version moved to a less prominent position.
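To make the difference concrete, here is a minimal sketch of the two residual arrangements as I understand them. It is plain Python rather than the book's code: `layer_norm`, `post_norm`, `pre_norm`, and `toy_sublayer` are hypothetical names I made up for illustration, the layer norm has no learnable scale or shift, and the sublayer is a stand-in for attention or the feed-forward network.

```python
import math

def layer_norm(x, eps=1e-5):
    # Simplified layer norm over one feature vector (no learnable gain/bias).
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def post_norm(x, sublayer):
    # Post-norm (original Transformer): LayerNorm(x + Sublayer(x))
    y = sublayer(x)
    return layer_norm([a + b for a, b in zip(x, y)])

def pre_norm(x, sublayer):
    # Pre-norm: x + Sublayer(LayerNorm(x)); the residual path is left unnormalized
    y = sublayer(layer_norm(x))
    return [a + b for a, b in zip(x, y)]

def toy_sublayer(x):
    # Hypothetical stand-in for attention or the position-wise FFN.
    return [2.0 * v for v in x]

x = [1.0, 2.0, 3.0, 4.0]
post_out = post_norm(x, toy_sublayer)
pre_out = pre_norm(x, toy_sublayer)
```

In the post-norm form the output of every block is normalized, while in the pre-norm form the input `x` passes through the skip connection untouched and only the sublayer's input is normalized.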
Thanks! Cheers for your great book!