FastChat
ALiBi positional encoding
Based on the BloombergGPT paper (https://arxiv.org/pdf/2303.17564v1.pdf), which uses ALiBi positional encoding. As described at https://paperswithcode.com/method/alibi, ALiBi allows inference on sequences longer than the training length (e.g., beyond 2048 tokens). Can the LLaMA model be modified to use ALiBi instead of its rotary positional embeddings?
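For context, here is a minimal sketch of how the ALiBi bias term is usually computed in PyTorch. This is not FastChat or LLaMA code; it only illustrates the mechanism (a per-head linear penalty on key distance added to attention scores before softmax), and the slope formula below assumes the number of heads is a power of two, as in the original ALiBi paper.

```python
import torch


def alibi_slopes(num_heads: int) -> torch.Tensor:
    """Per-head slopes: geometric sequence 2^(-8i/n), i = 1..n (power-of-two head counts)."""
    start = 2 ** (-8 / num_heads)
    return torch.tensor([start ** (i + 1) for i in range(num_heads)])


def alibi_bias(num_heads: int, seq_len: int) -> torch.Tensor:
    """Additive bias of shape (num_heads, seq_len, seq_len) for causal attention scores."""
    slopes = alibi_slopes(num_heads)                    # (H,)
    pos = torch.arange(seq_len)
    distance = pos[None, :] - pos[:, None]              # (S, S); j - i, negative for past keys
    # Each head penalizes distant keys linearly: bias(i, j) = -m_h * (i - j).
    return slopes[:, None, None] * distance[None, :, :] # (H, S, S)


# Usage inside an attention layer (scores has shape (batch, H, S, S)):
#   scores = q @ k.transpose(-2, -1) / math.sqrt(head_dim)
#   scores = scores + alibi_bias(num_heads, seq_len).to(scores.device)
#   probs = scores.masked_fill(causal_mask, float("-inf")).softmax(dim=-1)
```

Because the bias depends only on relative distance, it extrapolates to sequence lengths longer than those seen in training. Swapping this into LLaMA, however, would mean removing the rotary embeddings it was pretrained with, so the weights would likely need retraining or fine-tuning to work well.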