x-transformers
N-grammer
Hey, thanks for another great repository.
Would it benefit from an N-grammer integration?
https://github.com/lucidrains/n-grammer-pytorch
It looks like you tried integrating N-grammer into a transformer, and the results were "slightly better"?
https://github.com/lucidrains/n-grammer-pytorch/issues/1#issuecomment-986101374
The context in which I'm looking into transformers: I'm trying to train an encoder for use with your imagen-pytorch, on Danbooru tags. That simplifies the domain a lot -- tokens can be few and long (each representing a Booru label in a short sequence), and order doesn't matter, so the tags can be trained as sorted captions.
I was thinking of training a T5 using Hugging Face's masked-language-model Flax trainer, but maybe your encoder-only transformers would be a better fit?
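For concreteness, here's roughly what I had in mind on the x-transformers side: a minimal masked-tag-prediction sketch using TransformerWrapper with an Encoder. The vocab size, sequence length, reserved token ids, and masking scheme are all placeholder assumptions for the Danbooru-tag setup, not anything taken from your repo:

```python
import torch
import torch.nn.functional as F
from x_transformers import TransformerWrapper, Encoder

# placeholder assumptions for a Danbooru tag tokenizer
NUM_TOKENS = 10000   # assumed tag vocab size
MAX_SEQ_LEN = 128    # assumed max tags per caption
PAD_ID = 0           # assumed padding id
MASK_ID = 1          # assumed [MASK] id

model = TransformerWrapper(
    num_tokens = NUM_TOKENS,
    max_seq_len = MAX_SEQ_LEN,
    attn_layers = Encoder(
        dim = 512,
        depth = 6,
        heads = 8
    )
)

def mlm_step(tokens, mask_prob = 0.15):
    # pick random non-padding positions to mask out
    candidates = tokens != PAD_ID
    mask = (torch.rand(tokens.shape) < mask_prob) & candidates

    inputs = tokens.masked_fill(mask, MASK_ID)
    logits = model(inputs)                        # (batch, seq, NUM_TOKENS)

    # compute the loss only on the masked positions
    labels = tokens.masked_fill(~mask, -100)
    return F.cross_entropy(logits.transpose(1, 2), labels, ignore_index = -100)

tokens = torch.randint(2, NUM_TOKENS, (4, MAX_SEQ_LEN))
loss = mlm_step(tokens)
loss.backward()
```

Presumably the per-token embeddings (via `model(x, return_embeddings = True)`) would then be what gets handed to imagen-pytorch as the conditioning text embeds, rather than the logits.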
@Birch-san :wave: N-grammer didn't work well for me for character-level autoregressive training
I was going to retry on BPE-tokenized text to see if the effects are more visible
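something like this for the BPE side, using the Hugging Face tokenizers library (corpus path, vocab size and special tokens are just placeholders):

```python
from tokenizers import ByteLevelBPETokenizer

# train a byte-level BPE vocab on a plain-text corpus
tokenizer = ByteLevelBPETokenizer()
tokenizer.train(
    files = ["corpus.txt"],          # placeholder corpus file
    vocab_size = 16384,              # placeholder vocab size
    min_frequency = 2,
    special_tokens = ["<pad>", "<mask>"]
)

# encode a line of text into token ids for the retry
ids = tokenizer.encode("a line of training text").ids
```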
thanks for confirming how the attempt went. 🙂
if you end up trying again on BPE, it'll certainly be interesting to see whether that fares any better.
it seemed from the paper that it performed slightly worse than the OG transformer on average but offered an inference speedup (356.94 vs 331.12 examples/s, roughly an 8% gain, so not huge). though the idea is interesting