x-transformers
N-grammer
Hey, thanks for another great repository.
Would it benefit from an N-grammer integration?
https://github.com/lucidrains/n-grammer-pytorch
It looks like you tried integrating N-grammer into a transformer, and the results were "slightly better"?
https://github.com/lucidrains/n-grammer-pytorch/issues/1#issuecomment-986101374
The context in which I'm looking into transformers: I'm trying to train an encoder for use with your imagen-pytorch, on Danbooru tags. That simplifies the domain a lot -- tokens can be few and long (each representing a Booru label in a short sequence), and order doesn't matter, so the tags can be trained as sorted captions.
I was thinking of training a T5 using Hugging Face's masked-language-model Flax trainer, but maybe your encoder-only transformers would be a better fit?
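For concreteness, here's roughly what I had in mind on the x-transformers side: a minimal masked-tag-prediction sketch using TransformerWrapper with an Encoder. The vocab size, sequence length, reserved token ids, and masking scheme are all placeholder assumptions for the Danbooru-tag setup, not anything taken from your repo:

```python
import torch
import torch.nn.functional as F
from x_transformers import TransformerWrapper, Encoder

# placeholder assumptions for a Danbooru tag tokenizer
NUM_TOKENS = 10000   # assumed tag vocab size
MAX_SEQ_LEN = 128    # assumed max tags per caption
PAD_ID = 0           # assumed padding id
MASK_ID = 1          # assumed [MASK] id

model = TransformerWrapper(
    num_tokens = NUM_TOKENS,
    max_seq_len = MAX_SEQ_LEN,
    attn_layers = Encoder(
        dim = 512,
        depth = 6,
        heads = 8
    )
)

def mlm_step(tokens, mask_prob = 0.15):
    # pick random non-padding positions to mask out
    candidates = tokens != PAD_ID
    mask = (torch.rand(tokens.shape) < mask_prob) & candidates

    inputs = tokens.masked_fill(mask, MASK_ID)
    logits = model(inputs)                        # (batch, seq, NUM_TOKENS)

    # compute the loss only on the masked positions
    labels = tokens.masked_fill(~mask, -100)
    return F.cross_entropy(logits.transpose(1, 2), labels, ignore_index = -100)

tokens = torch.randint(2, NUM_TOKENS, (4, MAX_SEQ_LEN))
loss = mlm_step(tokens)
loss.backward()
```

Presumably the per-token embeddings (via `model(x, return_embeddings = True)`) would then be what gets handed to imagen-pytorch as the conditioning text embeds, rather than the logits.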
@Birch-san :wave: N-grammer didn't work well for me for character-level autoregressive training
I was going to retry on BPE-tokenized text to see if the effects are more visible
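something like this for the BPE side, using the Hugging Face tokenizers library (corpus path, vocab size and special tokens are just placeholders):

```python
from tokenizers import ByteLevelBPETokenizer

# train a byte-level BPE vocab on a plain-text corpus
tokenizer = ByteLevelBPETokenizer()
tokenizer.train(
    files = ["corpus.txt"],          # placeholder corpus file
    vocab_size = 16384,              # placeholder vocab size
    min_frequency = 2,
    special_tokens = ["<pad>", "<mask>"]
)

# encode a line of text into token ids for the retry
ids = tokenizer.encode("a line of training text").ids
```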
thanks for confirming how the attempt went. 🙂
if you end up trying again on BPE, it'll certainly be interesting to see whether that fares any better.
it seemed from the paper that it performed slightly worse than the OG transformer on average but offered an inference speedup (356.94 vs 331.12 examples/s, roughly an 8% gain, so not huge). though the idea is interesting