
Add support for learnable relative position encoding

Open chenmoneygithub opened this issue 2 years ago • 2 comments

Relative position encoding is useful for text of arbitrary length. Our DeBERTa model has a relative positional encoding, but it currently only returns the repeated embedding matrix: code link

I made a quick implementation based on the TF Model Garden offering (not fully tested): https://colab.research.google.com/gist/chenmoneygithub/bd44a36f9249a2715b0ccb8b18733f14/learnable-relative-postional-encoding.ipynb
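For reference, here is a minimal sketch of what such a layer could look like. The class name `RelativePositionEmbedding` and the `max_distance` / `output_dim` arguments are illustrative, not taken from the gist above: clipped relative distances index a learnable table, so the layer is independent of sequence length.

```python
import tensorflow as tf
from tensorflow import keras


class RelativePositionEmbedding(keras.layers.Layer):
    """Learnable relative position embedding (illustrative sketch).

    For query/key positions i and j, the relative distance (i - j) is
    clipped to [-max_distance, max_distance] and used to index a learned
    embedding table, so the layer works for any sequence length.
    """

    def __init__(self, max_distance, output_dim, **kwargs):
        super().__init__(**kwargs)
        self.max_distance = max_distance
        self.output_dim = output_dim

    def build(self, input_shape):
        # One embedding row per clipped relative distance in
        # [-max_distance, ..., max_distance].
        self.embeddings = self.add_weight(
            name="embeddings",
            shape=(2 * self.max_distance + 1, self.output_dim),
            initializer="glorot_uniform",
        )
        super().build(input_shape)

    def call(self, inputs):
        # `inputs` is only used to read the sequence length.
        seq_length = tf.shape(inputs)[1]
        positions = tf.range(seq_length)
        # Relative distance matrix of shape (seq_length, seq_length).
        distances = positions[:, None] - positions[None, :]
        distances = tf.clip_by_value(
            distances, -self.max_distance, self.max_distance
        )
        # Shift distances into [0, 2 * max_distance] to index the table;
        # output shape is (seq_length, seq_length, output_dim).
        return tf.gather(self.embeddings, distances + self.max_distance)
```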

Let's discuss whether we want this layer; if so, we can probably mark this issue as contribution-welcome.

chenmoneygithub avatar Jan 02 '23 22:01 chenmoneygithub

Would this be something we could use for DeBERTa, @chenmoneygithub @abheesht17, if we get the right initialization? Or are the weights/graph too different?

If we cannot use this for DeBERTa, what models do use this style of relative embedding?

mattdangerw avatar Jan 03 '23 19:01 mattdangerw

We can use it for DeBERTa if my understanding is correct; Abheesht should have more context.

It's a general approach (paper), and I am planning to use it in a custom model for text summarization, which cannot use a standard positional embedding because of the input length.
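As a hypothetical illustration of that point, reusing the sketch from earlier in this thread: the weight count depends only on the clipping distance, so the same layer handles inputs far longer than any fixed positional embedding table.

```python
import tensorflow as tf

# Hypothetical usage of the sketch above: weights depend only on
# `max_distance`, so short and very long inputs both work.
layer = RelativePositionEmbedding(max_distance=128, output_dim=64)
short = layer(tf.zeros((1, 512, 64)))   # shape (512, 512, 64)
long = layer(tf.zeros((1, 4096, 64)))   # shape (4096, 4096, 64)
```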

chenmoneygithub avatar Jan 03 '23 19:01 chenmoneygithub