candle
Minimalist ML framework for Rust
Would you consider supporting new models nowadays?
### Env
- GPU: NVIDIA GeForce RTX 3060 (12036MiB)
- CPU: 12th Gen Intel(R) Core(TM) i5-12400F
- OS: Ubuntu 23.04
- Models: yolov8s.pt, yolov8s.onnx, yolov8s.safetensors

#### Speed test on 1000 images
- candle: ~55ms...
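For reproducing numbers like these, a minimal timing harness might look like the sketch below; `run_inference` is a hypothetical stand-in for the actual candle YOLOv8 forward pass from the example, not code from this issue.

```rust
use std::time::Instant;

// Hypothetical stand-in for a single forward pass; replace with the
// real candle YOLOv8 inference call.
fn run_inference() {
    std::thread::sleep(std::time::Duration::from_millis(5));
}

fn main() {
    let n = 1000;
    // Warm-up pass so one-time setup cost is not counted.
    run_inference();
    let start = Instant::now();
    for _ in 0..n {
        run_inference();
    }
    let avg_ms = start.elapsed().as_secs_f64() * 1000.0 / n as f64;
    println!("average latency: {avg_ms:.2} ms over {n} images");
}
```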
Add implementation for https://huggingface.co/naver/provence-reranker-debertav3-v1. This is still a WIP, but I wanted to gauge interest before going too far.

### Notes
- Provence has a [CC Non-Commercial license](https://huggingface.co/naver/provence-reranker-debertav3-v1/blob/main/Provence_LICENSE.txt)
- ...
Candle's convolution operations on CPU are quite slow compared to PyTorch.

# Some numbers
Conv2d run configuration:
- batch_size = 2
- in_channels = 3
- width = 320
- ...
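A minimal sketch for timing a CPU conv2d with candle's public `Tensor::conv2d`; the shapes use the issue's batch size, channel count, and width where given, while the height, filter count, and kernel/padding/stride values are assumptions since the full configuration is truncated.

```rust
use candle_core::{Device, Tensor};
use std::time::Instant;

fn main() -> candle_core::Result<()> {
    let dev = Device::Cpu;
    // (batch=2, in_channels=3, width=320) come from the issue; the
    // 320 height, 16 output channels, and 3x3 kernel are assumed.
    let input = Tensor::randn(0f32, 1.0, (2, 3, 320, 320), &dev)?;
    let kernel = Tensor::randn(0f32, 1.0, (16, 3, 3, 3), &dev)?;
    let start = Instant::now();
    // conv2d(kernel, padding, stride, dilation, groups)
    let out = input.conv2d(&kernel, 1, 1, 1, 1)?;
    println!(
        "conv2d {:?} -> {:?} took {:?}",
        input.dims(),
        out.dims(),
        start.elapsed()
    );
    Ok(())
}
```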
https://huggingface.co/collections/PaddlePaddle/pp-ocrv5
I'm running https://github.com/huggingface/candle/tree/main/candle-examples/examples/llava vs. https://github.com/fpgaminer/joycaption/blob/main/scripts/batch-caption.py on a Mac M1. I'm seeing a significant performance difference; candle seems much slower. I enabled the accelerate and metal features. Would love some pointers on how to improve...
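One quick sanity check, sketched below under the assumption the build features may not have taken effect: confirm the binary really dispatches to the Metal device rather than silently staying on CPU.

```rust
use candle_core::{Device, Tensor};

fn main() -> candle_core::Result<()> {
    // Prints false if the crate was compiled without the `metal` feature,
    // in which case all tensor work silently runs on CPU.
    println!("metal available: {}", candle_core::utils::metal_is_available());
    let device = Device::new_metal(0)?;
    // Tiny matmul just to confirm kernels run on the GPU path.
    let a = Tensor::randn(0f32, 1.0, (64, 64), &device)?;
    let b = a.matmul(&a)?;
    println!("ran on {:?}, output shape {:?}", b.device(), b.dims());
    Ok(())
}
```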
Depth Anything V3 dropped: https://huggingface.co/depth-anything/DA3-LARGE. It is extremely useful for monocular camera depth estimation and enables many applications that need precise 3D points.
Hi! I am attempting to get this working with CUDA support. Any ideas? Thank you! It works great without the CUDA flag.

**Hardware:**
- RTX480 with latest drivers
- CUDA 13.0...
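Assuming the failure happens at device initialization, a small diagnostic like the sketch below (not from the issue) can separate a missing `cuda` build feature from a driver/toolkit mismatch.

```rust
use candle_core::Device;

fn main() -> candle_core::Result<()> {
    // false here means the binary was compiled without the `cuda` feature.
    println!("compiled with cuda: {}", candle_core::utils::cuda_is_available());
    // Driver/toolkit mismatches typically surface as an error here.
    let device = Device::new_cuda(0)?;
    println!("created device: {device:?}");
    Ok(())
}
```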
llama.cpp achieves superior CPU performance through thread-optimized kernels that compute directly on GGUF's native weight layouts. Candle should follow this approach to match llama.cpp's CPU efficiency and support diverse GGUF...
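For context, candle can already read GGUF files and their quantized tensors; the gap this issue points at is computing directly on those layouts instead of dequantizing. A sketch of the existing read path is below; the file path and tensor name are placeholders.

```rust
use candle_core::quantized::gguf_file;
use candle_core::Device;
use std::fs::File;

fn main() -> candle_core::Result<()> {
    // Placeholder path to some GGUF checkpoint.
    let mut file = File::open("model.gguf")?;
    let content = gguf_file::Content::read(&mut file)?;
    // Inspect the native quantized layouts stored in the file.
    for (name, info) in content.tensor_infos.iter() {
        println!("{name}: {:?} {:?}", info.shape, info.ggml_dtype);
    }
    // Loads a quantized tensor; "token_embd.weight" is a placeholder name.
    // The proposal is to run kernels on this layout without dequantizing.
    let qtensor = content.tensor(&mut file, "token_embd.weight", &Device::Cpu)?;
    println!("loaded {:?}", qtensor.shape());
    Ok(())
}
```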
## Summary
The `candle-transformers/src/models/` directory has grown to contain 70+ flat module entries, mixing full and quantized implementations of the same model families. This makes the codebase harder to navigate...
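For concreteness, one possible regrouping (names illustrative, not a committed design) nests the quantized variant under its model family and keeps the old flat path alive via a re-export:

```rust
// Hypothetical regrouped layout, written as inline modules so the sketch
// compiles standalone; in the real tree each would be its own file.
pub mod llama {
    pub mod full {}      // formerly models/llama.rs
    pub mod quantized {} // formerly models/quantized_llama.rs
}

// Re-export keeps the old flat path `models::quantized_llama`
// working during a migration window.
pub use llama::quantized as quantized_llama;

fn main() {}
```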