learned-tokenization topic
List of learned-tokenization repositories
MEGABYTE-pytorch
Stars: 620, Forks: 52
Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in PyTorch
rvq-vae-gpt
Stars: 77, Forks: 1
My attempts at applying the SoundStream design to learned tokenization of text, and then applying hierarchical attention to text generation