Markus / Mark
Maybe Z.AI is willing to cover the API tokens for evaluation of this one? The best I can offer personally is a GLM4.6 IQ3_XSS quant at 5 tok/s generation speed, which...
What about using Kimi K2 Thinking for this? Its being open source is an important benefit.
I strongly support this proposal. I don't have strong opinions about the specific algorithm, but I've seen that the current metrics are no longer good indicators, not only for summarization but...
Yes, distilling it to an encoder model sounds like a good and pragmatic idea. I was thinking it could be better to use Kimi K2 Thinking if it is good...
On that note, I thought it'd be appropriate either way to add a model request: #1360