
Add the gMLP Encoder Block

Open abheesht17 opened this issue 2 years ago • 2 comments

The gMLP model is from the paper "Pay Attention to MLPs". It has a decent number of citations (around 40). Each encoder block consists only of linear channel projections and a "spatial gating unit" — no self-attention. It would be a good addition to the library, considering that the research community is actively looking for alternatives to self-attention, and because, despite its simplicity, gMLP achieves performance comparable to Transformers.
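To make the block's structure concrete, here is a minimal NumPy sketch of one gMLP encoder block as described in the paper (channel projection, spatial gating unit, channel projection, residual connection). All function and variable names are illustrative, not part of any keras-nlp API:

```python
import numpy as np

def gelu(x):
    # tanh approximation of GELU
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def layer_norm(x, eps=1e-6):
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def gmlp_block(x, w1, w_spatial, b_spatial, w2):
    """One gMLP block: channel projection -> spatial gating unit ->
    channel projection, with a residual connection.
    x: (seq_len, d_model)."""
    shortcut = x
    z = gelu(layer_norm(x) @ w1)        # (seq_len, d_ffn)
    u, v = np.split(z, 2, axis=-1)      # each (seq_len, d_ffn // 2)
    # Spatial gating unit: a linear projection across the *token* axis,
    # which is the only place tokens interact (no attention).
    v = w_spatial @ layer_norm(v) + b_spatial
    z = u * v                           # elementwise gating
    return shortcut + z @ w2            # back to (seq_len, d_model)

# Toy shapes: 4 tokens, model dim 8, FFN dim 16.
seq_len, d_model, d_ffn = 4, 8, 16
rng = np.random.default_rng(0)
x = rng.normal(size=(seq_len, d_model))
w1 = rng.normal(size=(d_model, d_ffn)) * 0.02
# Per the paper, the spatial weights start near zero and the bias near
# one, so the gating unit initially behaves like an identity map.
w_spatial = np.zeros((seq_len, seq_len))
b_spatial = np.ones((seq_len, 1))
w2 = rng.normal(size=(d_ffn // 2, d_model)) * 0.02

out = gmlp_block(x, w1, w_spatial, b_spatial, w2)
print(out.shape)  # (4, 8)
```

Note that `w_spatial` is tied to the sequence length, which is why the paper's models operate on a fixed sequence length.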

abheesht17 avatar Apr 08 '22 03:04 abheesht17

@abheesht17 Thanks for opening this feature request!

The idea of the paper is definitely interesting! But at this moment I am not convinced that gMLP is a good replacement for the Transformer. It claims that fewer parameters are required, but we can also control a Transformer's parameter count by adjusting the number of encoder layers or their sizes. We will have more discussion on this next week, and you could also add it to your GSoC proposal if you want. Thanks again!

chenmoneygithub avatar Apr 09 '22 22:04 chenmoneygithub

Awesome! Thanks, @chenmoneygithub. Will add it to the doc :)

abheesht17 avatar Apr 10 '22 02:04 abheesht17