mistral.rs
Accelerate topk, topp sampling with `argsort`
Argsort was just added to Candle (https://github.com/huggingface/candle/pull/2132). The sorting step of topk and topp sampling currently runs on the CPU and takes a lot of time, so using an argsort kernel for that step should accelerate sampling.
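
For reference, here is a minimal std-only sketch of the sorting step in question. It is not the actual mistral.rs sampler: the helper names `argsort_desc` and `top_k_top_p` are hypothetical, and the input is assumed to be an already-normalized probability vector. The point is only to show that both topk and topp reduce to a descending argsort followed by a cheap linear scan, so the `argsort_desc` call is the part a kernel-backed argsort would replace.

```rust
use std::cmp::Ordering;

/// Hypothetical helper: indices that sort `probs` in descending order.
/// This is the CPU sort that an argsort kernel would take over.
fn argsort_desc(probs: &[f32]) -> Vec<usize> {
    let mut idx: Vec<usize> = (0..probs.len()).collect();
    // NaN comparisons fall back to Equal so the sort never panics.
    idx.sort_by(|&a, &b| probs[b].partial_cmp(&probs[a]).unwrap_or(Ordering::Equal));
    idx
}

/// Hypothetical helper: keep at most `k` tokens, stop once the kept mass
/// has reached `p`, zero everything else, then renormalize.
fn top_k_top_p(probs: &[f32], k: usize, p: f32) -> Vec<f32> {
    let order = argsort_desc(probs);
    let mut out = vec![0.0f32; probs.len()];
    let mut cumulative = 0.0f32;
    for (rank, &i) in order.iter().enumerate() {
        // The token that crosses the `p` threshold is still kept,
        // because the check happens before it is added.
        if rank >= k || cumulative >= p {
            break;
        }
        out[i] = probs[i];
        cumulative += probs[i];
    }
    let total: f32 = out.iter().sum();
    if total > 0.0 {
        for x in out.iter_mut() {
            *x /= total;
        }
    }
    out
}
```

After the argsort, the filtering itself is a single pass over the sorted indices, which is why the sort dominates the cost for large vocabularies.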