ds5t5
@olegklimov please help review and feel free to test. Inference is extremely fast thanks to the work in llama.cpp.
/attempt https://github.com/smallcloudai/refact/issues/77
thanks. let me know when the model weights are ready. i will rebase my llama.cpp PR onto the latest branch of llama.cpp.
@JegernOUTT can I ask why we decided to make this weight change? It doesn't seem aligned with other popular models; they (Falcon, LLaMA) usually keep mlp.linear_1 and mlp.linear_3 as separate weights....
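For context, a minimal sketch of what I mean, assuming a SwiGLU-style gated MLP (as in LLaMA); the module/field names and sizes here are illustrative, not the actual refact config, and the fused variant is just my reading of the change:

```python
# Sketch: separate vs. fused gate/up projections in a gated (SwiGLU-style) MLP.
# Sizes and names are placeholders for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SeparateMLP(nn.Module):
    """Falcon/LLaMA-style layout: gate and up projections stored as two weights."""
    def __init__(self, hidden: int, inter: int):
        super().__init__()
        self.linear_1 = nn.Linear(hidden, inter, bias=False)  # gate projection
        self.linear_3 = nn.Linear(hidden, inter, bias=False)  # up projection
        self.linear_2 = nn.Linear(inter, hidden, bias=False)  # down projection

    def forward(self, x):
        return self.linear_2(F.silu(self.linear_1(x)) * self.linear_3(x))


class FusedMLP(nn.Module):
    """Fused layout: gate and up projections stored as one weight, split at runtime."""
    def __init__(self, hidden: int, inter: int):
        super().__init__()
        self.linear_gate_up = nn.Linear(hidden, 2 * inter, bias=False)
        self.linear_2 = nn.Linear(inter, hidden, bias=False)

    def forward(self, x):
        gate, up = self.linear_gate_up(x).chunk(2, dim=-1)
        return self.linear_2(F.silu(gate) * up)


# The two layouts are numerically equivalent when the fused weight is the
# concatenation of the two separate weights, so conversion is mechanical --
# but checkpoints and downstream loaders (e.g. llama.cpp) have to agree on it.
sep, fused = SeparateMLP(8, 16), FusedMLP(8, 16)
fused.linear_gate_up.weight.data = torch.cat(
    [sep.linear_1.weight.data, sep.linear_3.weight.data], dim=0)
fused.linear_2.weight.data = sep.linear_2.weight.data
x = torch.randn(2, 8)
assert torch.allclose(sep(x), fused(x), atol=1e-6)
```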