candle

Minimalist ML framework for Rust

Results: 407 candle issues

Would you consider supporting new models nowadays?

### Env: GPU: NVIDIA GeForce RTX 3060 (12036 MiB), CPU: 12th Gen Intel(R) Core(TM) i5-12400F, OS: Ubuntu 23.04, Model: yolov8s.pt, yolov8s.onnx, yolov8s.safetensors #### Speed test on 1000 images: - candle: ~55ms...

Add implementation for https://huggingface.co/naver/provence-reranker-debertav3-v1. This is still a WIP, but I wanted to gauge interest before going too far. ### Notes - Provence has a [CC Non Commercial license](https://huggingface.co/naver/provence-reranker-debertav3-v1/blob/main/Provence_LICENSE.txt) -...

Candle's convolution operations on CPU are quite slow compared to PyTorch. # Some numbers Conv2d run configuration: - batch_size = 2 - in_channels = 3 - width = 320 -...
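For reference, the workload being benchmarked above is a direct 2D convolution. The sketch below is a naive pure-Rust (std-only) implementation of that operation, not candle's or PyTorch's actual kernel; the shapes, stride-1/no-padding assumptions, and function name are illustrative only. Optimized kernels beat this loop nest via im2col + GEMM, SIMD, and cache blocking, which is the gap the issue is about.

```rust
// Naive direct conv2d over row-major [n, c_in, h, w] input and
// [c_out, c_in, kh, kw] weights, stride 1, no padding (illustrative sketch).
fn conv2d_naive(
    input: &[f32],
    weight: &[f32],
    n: usize, c_in: usize, h: usize, w: usize,
    c_out: usize, kh: usize, kw: usize,
) -> Vec<f32> {
    let (oh, ow) = (h - kh + 1, w - kw + 1);
    let mut out = vec![0f32; n * c_out * oh * ow];
    for b in 0..n {
        for oc in 0..c_out {
            for oy in 0..oh {
                for ox in 0..ow {
                    let mut acc = 0f32;
                    for ic in 0..c_in {
                        for ky in 0..kh {
                            for kx in 0..kw {
                                // Input pixel and weight for this tap.
                                let iv = input[((b * c_in + ic) * h + oy + ky) * w + ox + kx];
                                let wv = weight[((oc * c_in + ic) * kh + ky) * kw + kx];
                                acc += iv * wv;
                            }
                        }
                    }
                    out[((b * c_out + oc) * oh + oy) * ow + ox] = acc;
                }
            }
        }
    }
    out
}

fn main() {
    // Tiny smoke test: 1x1x3x3 input [0..8], 1x1x2x2 kernel of ones
    // -> each output is the sum of a 2x2 window.
    let input: Vec<f32> = (0..9).map(|v| v as f32).collect();
    let weight = vec![1f32; 4];
    let out = conv2d_naive(&input, &weight, 1, 1, 3, 3, 1, 2, 2);
    println!("{:?}", out); // [8.0, 12.0, 20.0, 24.0]
}
```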

https://huggingface.co/collections/PaddlePaddle/pp-ocrv5

I'm running https://github.com/huggingface/candle/tree/main/candle-examples/examples/llava vs. https://github.com/fpgaminer/joycaption/blob/main/scripts/batch-caption.py on a Mac M1. I'm seeing a significant performance difference; Candle seems much slower. I enabled the accelerate and metal features. Would love some pointers on how to improve...

https://huggingface.co/depth-anything/DA3-LARGE Depth Anything V3 dropped. It is extremely useful for monocular camera depth estimation and enables many applications that need precise 3D points.

Hi! I am attempting to get this working with CUDA support. Any ideas? Thank you! It works great without the CUDA flag. **Hardware:** - RTX480 with latest drivers - CUDA 13.0...

llama.cpp achieves superior CPU performance through thread-optimized kernels that compute directly on GGUF's native weight layouts. Candle should follow this approach to match llama.cpp's CPU efficiency and support diverse GGUF...
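To illustrate the idea of computing directly on a quantized layout, here is a minimal pure-Rust sketch of a Q8-style block dot product: weights stay as blocks of 32 `i8` values sharing one `f32` scale, and the scale is applied once per block rather than dequantizing the whole matrix to `f32` first. The block size, struct, and names are assumptions for illustration, not llama.cpp's or candle's actual GGUF kernels.

```rust
// One quantized block: 32 signed 8-bit weights plus a shared f32 scale
// (a simplified stand-in for a GGUF Q8-style layout).
const BLOCK: usize = 32;

struct Q8Block {
    scale: f32,
    qs: [i8; BLOCK],
}

// Dot product of a quantized weight row against f32 activations,
// accumulating per block and applying the scale once per block.
fn dot_q8(blocks: &[Q8Block], x: &[f32]) -> f32 {
    assert_eq!(blocks.len() * BLOCK, x.len());
    let mut acc = 0f32;
    for (bi, blk) in blocks.iter().enumerate() {
        let mut block_sum = 0f32;
        for (j, &q) in blk.qs.iter().enumerate() {
            block_sum += q as f32 * x[bi * BLOCK + j];
        }
        // Single multiply by the block scale instead of scaling each weight.
        acc += blk.scale * block_sum;
    }
    acc
}

fn main() {
    // One block: quantized value 2 with scale 0.5 => effective weight 1.0.
    let blocks = vec![Q8Block { scale: 0.5, qs: [2i8; BLOCK] }];
    let x = vec![1f32; BLOCK];
    println!("{}", dot_q8(&blocks, &x)); // 32 * 2 * 0.5 = 32
}
```

Real kernels additionally pin one matrix row range per thread and use SIMD integer instructions for the inner loop; the point here is only that no full-precision copy of the weights is ever materialized.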

## Summary The `candle-transformers/src/models/` directory has grown to contain 70+ flat module entries, mixing full and quantized implementations of the same model families. This makes the codebase harder to navigate...