Orevantum

1 comment by Orevantum

Is it possible to implement [multi-query attention](https://twitter.com/GuillaumeLample/status/1634587466416914433?s=09) then?

> In `whisper.cpp` I tried using FA in the Decoder and it did not help (it does help a lot in the...