Phil Wang
compressive-transformer-pytorch
PyTorch implementation of Compressive Transformers, from DeepMind
long-short-transformer
Implementation of Long-Short Transformer, combining local and global inductive biases for attention over long sequences, in PyTorch
linear-attention-transformer
Transformer based on a variant of attention that has linear complexity with respect to sequence length
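The linear-complexity claim rests on the kernel trick behind linear attention: replacing the softmax with a positive feature map φ lets the keys-values summary be computed before the queries touch it, so cost scales with sequence length n rather than n². Below is a minimal sketch of that idea in plain PyTorch (the function name and the elu+1 feature map are illustrative assumptions, not this repo's actual API):

```python
import torch

def linear_attention(q, k, v, eps=1e-6):
    # q, k, v: (batch, seq_len, dim)
    # Positive feature map (elu + 1), a common choice; NOT necessarily the repo's.
    phi = lambda x: torch.nn.functional.elu(x) + 1
    q, k = phi(q), phi(k)
    # Summarize keys and values into a (dim, dim) context first: O(n * d^2),
    # avoiding the O(n^2) attention matrix entirely.
    context = torch.einsum('bnd,bne->bde', k, v)
    # Per-query normalizer, replacing the softmax denominator.
    normalizer = torch.einsum('bnd,bd->bn', q, k.sum(dim=1)) + eps
    return torch.einsum('bnd,bde->bne', q, context) / normalizer.unsqueeze(-1)

q = k = v = torch.randn(1, 128, 32)
out = linear_attention(q, k, v)
print(out.shape)  # torch.Size([1, 128, 32])
```

Doubling the sequence length doubles the cost of the two einsum contractions, whereas standard attention would quadruple it.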
tab-transformer-pytorch
Implementation of TabTransformer, an attention network for tabular data, in PyTorch
routing-transformer
Fully featured implementation of the Routing Transformer
sinkhorn-transformer
Sinkhorn Transformer - Practical implementation of Sparse Sinkhorn Attention
conformer
Implementation of the convolutional module from the Conformer paper, for use in Transformers
x-unet
Implementation of a U-net complete with efficient attention as well as the latest research findings
glom-pytorch
An attempt at the implementation of Glom, Geoffrey Hinton's new idea that integrates concepts from neural fields, top-down-bottom-up processing, and attention (consensus between columns), for emergent...
Adan-pytorch
Implementation of the Adan (ADAptive Nesterov momentum) optimizer in PyTorch
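Adan maintains three exponential moving averages: of the gradient, of the gradient difference between steps (the Nesterov-style correction), and of a squared combination of the two used for the adaptive step size. A minimal single-tensor sketch of that update rule, written from the paper's description (the function name, state layout, and defaults here are assumptions; the repo provides a proper `torch.optim`-style class):

```python
import torch

def adan_step(param, grad, state, lr=1e-3, betas=(0.02, 0.08, 0.01), eps=1e-8, wd=0.0):
    # betas weight the *new* term in each EMA, as in the paper's convention.
    b1, b2, b3 = betas
    if 'm' not in state:
        state['m'] = torch.zeros_like(param)       # EMA of gradients
        state['v'] = torch.zeros_like(param)       # EMA of gradient differences
        state['n'] = torch.zeros_like(param)       # EMA of squared combined term
        state['prev_grad'] = grad.clone()
    diff = grad - state['prev_grad']
    state['m'].mul_(1 - b1).add_(grad, alpha=b1)
    state['v'].mul_(1 - b2).add_(diff, alpha=b2)
    state['n'].mul_(1 - b3).add_((grad + (1 - b2) * diff).pow(2), alpha=b3)
    # Adaptive per-coordinate step size, then Nesterov-corrected direction.
    step_size = lr / (state['n'].sqrt() + eps)
    param.sub_(step_size * (state['m'] + (1 - b2) * state['v']))
    param.div_(1 + lr * wd)  # decoupled weight decay
    state['prev_grad'] = grad.clone()

# demo: minimize 0.5 * p^2, whose gradient is p itself
p = torch.tensor([5.0])
state = {}
for _ in range(300):
    adan_step(p, p.clone(), state, lr=0.1)
print(p)
```

Note this sketch omits the paper's bias correction and restart condition; it is only meant to show the three-EMA structure that distinguishes Adan from Adam.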