learned-tokenization topic
learned-tokenization repositories
MEGABYTE-pytorch
620 stars, 52 forks
Implementation of MEGABYTE ("Predicting Million-byte Sequences with Multiscale Transformers") in PyTorch
rvq-vae-gpt
77 stars, 1 fork
My attempts at applying the SoundStream design to learned tokenization of text, and then applying hierarchical attention to text generation