x-transformers
Question about the over-smoothing problem
I wonder if you, @lucidrains, have any suggestions for the over-smoothing problem with Transformer models (both encoder and decoder).
@Hosein47 do you have an experimental setup to measure oversmoothing? part of me wonders if it is even a problem worth solving, given chatgpt has shown that scale and data matter way more than architectural tweaks
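For anyone wanting such a setup: a common proxy for oversmoothing is the mean pairwise cosine similarity of token representations per layer, which drifts toward 1.0 as tokens collapse onto each other. A minimal sketch of that metric (the function name and threshold interpretation are my own, not from this thread):

```python
import numpy as np

def token_similarity(hidden: np.ndarray) -> float:
    """Mean pairwise cosine similarity between token vectors.

    hidden: (num_tokens, dim) hidden states from one layer.
    Values drifting toward 1.0 across depth indicate oversmoothing:
    token representations collapsing toward a common direction.
    """
    normed = hidden / np.linalg.norm(hidden, axis=-1, keepdims=True)
    sims = normed @ normed.T                      # (num_tokens, num_tokens)
    n = sims.shape[0]
    off_diag = sims[~np.eye(n, dtype=bool)]       # drop self-similarity
    return float(off_diag.mean())
```

Computing this for each layer's output and plotting it against depth, with and without an architectural tweak, gives the kind of comparison plots asked for below.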
however, if you have the right setup, i'd be willing to build in a solution i've seen in the GNN literature, provided you run the experiments and share your plots here
try https://github.com/lucidrains/x-transformers#gated-residual for starters, and if you see it alleviate oversmoothing, i can add a simpler technique