commented-transformers
Highly commented implementations of Transformers in PyTorch
Commented Transformers
Highly commented implementations of Transformers in PyTorch for the Creating a Transformer From Scratch series:
The layers folder contains implementations for Bidirectional Attention, Causal Attention, and CausalCrossAttention.
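To illustrate the difference between the bidirectional and causal variants named above, here is a minimal sketch of scaled dot-product attention with an optional causal mask. It is not the repository's code: the `attention` function, its `causal` flag, and the tensor shapes are assumptions chosen for illustration, and cross attention is left out.

```python
import math

import torch


def attention(q, k, v, causal=False):
    """Hypothetical helper, not taken from the repo's layers folder.

    q, k, v have shape (batch, heads, seq_len, head_dim).
    """
    # Similarity of every query position with every key position
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if causal:
        seq_len = q.size(-2)
        # True above the diagonal, i.e. at positions in the future
        future = torch.triu(
            torch.ones(seq_len, seq_len, dtype=torch.bool, device=q.device),
            diagonal=1,
        )
        # Future positions get zero weight after the softmax
        scores = scores.masked_fill(future, float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    return weights @ v, weights


torch.manual_seed(0)
q = k = v = torch.randn(1, 2, 4, 8)
_, bi_weights = attention(q, k, v, causal=False)
_, causal_weights = attention(q, k, v, causal=True)
```

With `causal=False` every token attends to every other token, as in BERT-style encoders. With `causal=True` each token attends only to itself and earlier tokens, as in GPT-style decoders.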
The models folder contains single-file implementations for GPT-2 and BERT. Both models are compatible with torch.compile(..., fullgraph=True).
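As a rough sketch of what `fullgraph=True` compatibility means: with that flag, `torch.compile` raises an error if it hits a graph break instead of silently falling back to eager execution for part of the model. The `TinyBlock` module below is a hypothetical stand-in, not one of the repository's models, and `backend="eager"` is an assumption made here so the example runs without a compiler toolchain (the default backend is `inductor`).

```python
import torch
import torch.nn as nn


class TinyBlock(nn.Module):
    """Hypothetical stand-in for a model; not the repo's GPT-2 or BERT."""

    def __init__(self, dim=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, 4 * dim),
            nn.GELU(),
            nn.Linear(4 * dim, dim),
        )

    def forward(self, x):
        # Residual feed-forward block with no data-dependent control flow,
        # so it can be traced as a single graph
        return x + self.net(x)


torch.manual_seed(0)
model = TinyBlock().eval()

# fullgraph=True: error out on any graph break rather than falling back.
# backend="eager" is an assumption for portability; omit it to use the default.
compiled = torch.compile(model, fullgraph=True, backend="eager")

x = torch.randn(2, 16)
with torch.no_grad():
    eager_out = model(x)
    compiled_out = compiled(x)
```

The compiled module is called exactly like the original one, and its output should match the uncompiled output up to floating-point tolerance.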