candle
Minimalist ML framework for Rust
https://github.com/werruww/Suc-candle-cpu/blob/main/Succ_candle_CPU.ipynb
From A to Z
This PR implements the Qwen3 Mixture of Experts models (like [Qwen3-30B-A3B](https://huggingface.co/Qwen/Qwen3-30B-A3B)) ~~Doesn't work, depends on #2930~~ (Merged)
What is the correct way to fine-tune a YOLOv8 model to be used here? Fine-tuning a model using candle is not straightforward. candle\candle-examples\examples\yolo-v8\main.rs // model model architecture points at ultralytics :...
Are there any plans to support inference with AMD GPUs? As far as I can tell, Candle only supports NVIDIA GPUs. I am working on safe...
The current implementation of the quantized_phi3 model does not clear its kv cache between distinct prompts. This leads to errors when attempting to generate text sequentially with the same model...
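The failure mode described in this report can be illustrated with a small, self-contained sketch. Everything in it is hypothetical: `ToyModel`, `forward`, `clear_kv_cache`, and `run_two_prompts` are illustrative names and do not reflect candle's actual `quantized_phi3` API. The toy model only records which tokens it has cached and rejects a call whose start position disagrees with the cache length, which loosely mimics the kind of mismatch a stale cache can cause.

```rust
/// Hypothetical toy model (not candle's API): the "KV cache" is just the
/// list of tokens seen so far, one entry per cached position.
struct ToyModel {
    cached_tokens: Vec<u32>,
}

impl ToyModel {
    fn new() -> Self {
        Self { cached_tokens: Vec::new() }
    }

    /// Processes `tokens` as if they start at position `index_pos`.
    /// Returns the number of cached positions after the call, or an error
    /// if the cache already holds a different number of positions.
    fn forward(&mut self, tokens: &[u32], index_pos: usize) -> Result<usize, String> {
        if index_pos != self.cached_tokens.len() {
            return Err(format!(
                "cache holds {} positions but caller passed index_pos {}",
                self.cached_tokens.len(),
                index_pos
            ));
        }
        self.cached_tokens.extend_from_slice(tokens);
        Ok(self.cached_tokens.len())
    }

    /// Drops all cached positions so the next prompt starts from scratch.
    fn clear_kv_cache(&mut self) {
        self.cached_tokens.clear();
    }
}

/// Runs two unrelated prompts on the same model instance, optionally
/// clearing the cache between them.
fn run_two_prompts(clear_between: bool) -> Result<usize, String> {
    let mut model = ToyModel::new();
    model.forward(&[1, 2, 3], 0)?; // first prompt
    if clear_between {
        model.clear_kv_cache();
    }
    model.forward(&[4, 5], 0) // second prompt, again starting at position 0
}
```

In this sketch, `run_two_prompts(false)` returns an error because the second prompt meets the first prompt's cached state, while `run_two_prompts(true)` succeeds. Whether the real fix is an explicit reset method or an automatic reset at position 0 is a design choice the report above does not settle.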
Would there be any interest in adding this model? https://huggingface.co/ds4sd/SmolDocling-256M-preview I toyed around with an implementation last night but most of my experience has been with text models and am...
- [x] Model
- [x] FP8 weight dequantize
- [x] Tensor parallelism
This PR adds the automatic usage of Metal GGML quantized mat-mat kernels instead of always using the mat-vec kernels and upstreams a few related/necessary changes. Before this change, Candle's Metal...