ds5t5

4 comments by ds5t5

@olegklimov please review and feel free to test. Inference is extremely fast thanks to the work done in llama.cpp.

/attempt https://github.com/smallcloudai/refact/issues/77

Thanks. Let me know when the model weights are ready. I will rebase my llama.cpp PR onto the latest branch of llama.cpp.

@JegernOUTT can I ask why we decided to make the weight change? It does not seem aligned with other popular models. They (Falcon, LLaMA) usually keep mlp.linear_1 and mlp.linear_3 separate....