llama.cpp

Add ops needed for new hybrid models: SOFTPLUS, EXPM1, TRI, SOLVE_TRI, CUMSUM

Open pwilkin opened this issue 1 month ago • 4 comments

This adds the ops needed for the new hybrid models, including Qwen3 Next and Kimi Linear.

Prerequisite to merging https://github.com/ggml-org/llama.cpp/pull/16095
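
For readers unfamiliar with the op names in the title, here is a rough scalar reference for what each one is usually taken to compute. This is a plain-Python sketch, not the ggml implementation: the function names are hypothetical, and it assumes TRI keeps the lower triangle including the diagonal, SOLVE_TRI solves a lower-triangular system by forward substitution, and CUMSUM is an inclusive prefix sum along one row. The actual ggml ops may support other variants and layouts.

```python
import math


def softplus(x):
    # log(1 + exp(x)), rearranged so exp() never overflows for large |x|.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def expm1(x):
    # exp(x) - 1, accurate for x close to zero.
    return math.expm1(x)


def tri_lower(a):
    # Assumption: keep the lower triangle (diagonal included), zero the rest.
    return [[a[i][j] if j <= i else 0.0 for j in range(len(a[i]))]
            for i in range(len(a))]


def solve_tri_lower(l, b):
    # Assumption: solve l @ x = b for a lower-triangular l
    # with a nonzero diagonal, using forward substitution.
    n = len(l)
    x = [0.0] * n
    for i in range(n):
        s = sum(l[i][j] * x[j] for j in range(i))
        x[i] = (b[i] - s) / l[i][i]
    return x


def cumsum(row):
    # Inclusive prefix sum: out[i] = row[0] + ... + row[i].
    out = []
    total = 0.0
    for v in row:
        total += v
        out.append(total)
    return out
```

A reference like this is mainly useful for checking a backend kernel against small hand-computed inputs; it says nothing about how the ops are batched or parallelized in ggml.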

pwilkin avatar Nov 06 '25 21:11 pwilkin

@gabe-l-hart guess you'll be interested in this one as well :)

pwilkin avatar Nov 06 '25 21:11 pwilkin

@slaren @ggerganov Should be ready for final review.

pwilkin avatar Nov 08 '25 19:11 pwilkin

@ggerganov Aight, parallelized CUMSUM, added docs, removed TRI_KEEP, renamed TRI_KEEP to TRI, added CONST with const1234d helpers.

pwilkin avatar Nov 11 '25 00:11 pwilkin

Aight, @ggerganov @slaren @CISC it's ready to merge I think.

pwilkin avatar Nov 11 '25 21:11 pwilkin

As constructive feedback for the future, try to split the ggml changes into even smaller parts. It would improve the review process, because there are many small details (naming, API design, code formatting) that are not obvious at first, and it takes some time to get accustomed to them.

Will do, I'm still getting used to the granularity of the review process here :)

pwilkin avatar Nov 12 '25 12:11 pwilkin