Phil Wang
complex-valued-transformer
Implementation of the transformer proposed in "Building Blocks for a Complex-Valued Transformer Architecture"
autoregressive-linear-attention-cuda
CUDA implementation of autoregressive linear attention, with all the latest research findings
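As a rough illustration of what autoregressive (causal) linear attention computes, here is a minimal numpy sketch of the underlying recurrence: a running sum of key-value outer products replaces the softmax attention matrix, giving O(T) time per sequence. The feature map and epsilon values are assumptions for the sketch, not this repo's CUDA kernel.

```python
import numpy as np

def causal_linear_attention(q, k, v):
    # assumed positive feature map (a stand-in for elu+1 or similar)
    phi = lambda x: np.maximum(x, 0) + 1e-6
    q, k = phi(q), phi(k)
    T, d = q.shape
    S = np.zeros((d, v.shape[1]))   # running sum of outer(k_t, v_t)
    z = np.zeros(d)                 # running sum of k_t for normalization
    out = np.zeros_like(v)
    for t in range(T):
        # state update only ever sees positions <= t, so causality is free
        S += np.outer(k[t], v[t])
        z += k[t]
        out[t] = (q[t] @ S) / (q[t] @ z + 1e-6)
    return out
```

Because the state `(S, z)` is updated token by token, output at position t never depends on future keys or values, which is the property the CUDA kernel has to preserve.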
MaMMUT-pytorch
Implementation of MaMMUT, a simple vision-encoder text-decoder architecture for multimodal tasks from Google, in Pytorch
mixture-of-attention
Some personal experiments around routing tokens to different autoregressive attention branches, akin to mixture-of-experts
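A minimal sketch of the routing idea, under the assumption of top-1 gating and one small attention "expert" per branch (the gating scheme and per-expert projection here are illustrative, not the repo's design):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def mixture_of_attention(x, gate_w, expert_ws):
    # top-1 routing: each token goes to the expert with the highest gate score,
    # and each expert attends only among the tokens routed to it
    choice = (x @ gate_w).argmax(-1)          # (T,) expert index per token
    out = np.zeros_like(x)
    for e, w in enumerate(expert_ws):
        idx = np.where(choice == e)[0]
        if idx.size == 0:
            continue
        xe = x[idx] @ w                        # expert-specific projection
        attn = softmax(xe @ xe.T)              # attention within the routed group
        out[idx] = attn @ xe
    return out
```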
self-rewarding-lm-pytorch
Implementation of the training framework proposed in Self-Rewarding Language Models, from Meta AI
rvq-vae-gpt
My attempts at applying the SoundStream design to learned tokenization of text, then applying hierarchical attention to text generation
retrieval-augmented-ddpm
Implementation of Retrieval-Augmented Denoising Diffusion Probabilistic Models in Pytorch
evolutionary-design-molecules
Implementation of the algorithm detailed in the paper "Evolutionary design of molecules based on deep learning and a genetic algorithm"
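For context, a minimal genetic-algorithm loop over strings (tournament selection, one-point crossover, point mutation); molecule strings such as SMILES would plug in via the fitness function. All hyperparameters here are placeholder assumptions, not the paper's settings:

```python
import random

def genetic_search(fitness, alphabet, length, pop_size=40, gens=60, seed=0):
    # minimal GA sketch: 'fitness' is any score on candidate strings
    rng = random.Random(seed)
    pop = [''.join(rng.choice(alphabet) for _ in range(length))
           for _ in range(pop_size)]
    for _ in range(gens):
        def pick():
            # binary tournament selection
            a, b = rng.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b
        nxt = []
        for _ in range(pop_size):
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, length)          # one-point crossover
            child = list(p1[:cut] + p2[cut:])
            if rng.random() < 0.3:                  # point mutation
                child[rng.randrange(length)] = rng.choice(alphabet)
            nxt.append(''.join(child))
        pop = nxt
    return max(pop, key=fitness)
```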
phasic-policy-gradient
An implementation of Phasic Policy Gradient, a proposed improvement on Proximal Policy Optimization (PPO), in Pytorch
taylor-series-linear-attention
Explorations into the recently proposed Taylor Series Linear Attention
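The core trick, as I understand it, is that a second-order Taylor expansion of the softmax kernel, exp(q·k) ≈ 1 + q·k + (q·k)²/2, factorizes into an explicit feature map, which makes attention linear in sequence length. A small numpy sketch of that feature map (the exact normalization is an assumption for illustration):

```python
import numpy as np

def taylor_features(x):
    # phi(x) = [1, x, flatten(x x^T)/sqrt(2)], so that
    # phi(q) . phi(k) == 1 + q.k + (q.k)^2 / 2
    ones = np.ones((*x.shape[:-1], 1))
    second = np.einsum('...i,...j->...ij', x, x)
    second = second.reshape(*x.shape[:-1], -1) / np.sqrt(2)
    return np.concatenate([ones, x, second], axis=-1)
```

The cost is feature dimension growing as 1 + d + d², which is why this is most attractive at small head dimensions.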