
Replacing Attention+MLP with two GAU layers seems to reduce inference speed

Open HackGiter opened this issue 11 months ago • 0 comments

In my own experiments with seq=512 and KV-Cache enabled, GAU seems to be slower.

HackGiter, Mar 07 '24 03:03