Add ops needed for new hybrid models: SOFTPLUS, EXPM1, TRI, SOLVE_TRI, CUMSUM

Open pwilkin opened this issue 1 month ago • 4 comments

Adds the ops needed for the new hybrid models, including Qwen3 Next and Kimi Linear.

Prerequisite to merging https://github.com/ggml-org/llama.cpp/pull/16095
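
For readers unfamiliar with the op names in the title, here is a rough reference sketch of what each one computes, using textbook definitions in plain Python. This is not the ggml implementation or its API. The function names, the lower-triangular choice in `tri` and `solve_tri`, the `keep_diag` flag, and the 1-D/2-D list shapes are all assumptions made for illustration; the thread does not spell out the actual semantics (axis, batching, triangle variants).

```python
import math

def softplus(x):
    # log(1 + exp(x)), written so large |x| does not overflow
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def expm1(x):
    # exp(x) - 1, accurate for x close to 0
    return math.expm1(x)

def tri(a, keep_diag=True):
    # Keep the lower triangle of a matrix and zero the rest.
    # ASSUMPTION: lower-triangular; keep_diag is a hypothetical flag.
    return [
        [v if (j < i or (keep_diag and j == i)) else 0.0
         for j, v in enumerate(row)]
        for i, row in enumerate(a)
    ]

def cumsum(row):
    # Running (inclusive) sum along one row
    out, total = [], 0.0
    for v in row:
        total += v
        out.append(total)
    return out

def solve_tri(l, b):
    # Forward substitution for L x = b.
    # ASSUMPTION: L is lower triangular with a nonzero diagonal.
    x = [0.0] * len(b)
    for i in range(len(b)):
        s = sum(l[i][j] * x[j] for j in range(i))
        x[i] = (b[i] - s) / l[i][i]
    return x
```

As a quick sanity check under these assumptions, `solve_tri([[2, 0], [1, 4]], [2, 9])` solves `2*x0 = 2` and `x0 + 4*x1 = 9`.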

Nov 06 '25 21:11 pwilkin

@gabe-l-hart guess you'll be interested in this one as well :)

Nov 06 '25 21:11 pwilkin

@slaren @ggerganov Should be ready for final review.

Nov 08 '25 19:11 pwilkin

@ggerganov Aight, parallelized CUMSUM, added docs, removed TRI_KEEP, renamed TRI_KEEP to TRI, added CONST with const1234d helpers.

Nov 11 '25 00:11 pwilkin

Aight, @ggerganov @slaren @CISC it's ready to merge I think.

Nov 11 '25 21:11 pwilkin

As constructive feedback for the future, try to split the changes in ggml into even smaller parts. It would improve the review process, because there are many little details (naming, API design, code formatting) that are not obvious at first, and it takes some time to get accustomed to them.

Will do, I'm still getting used to the granularity of the review process here :)

Nov 12 '25 12:11 pwilkin