candle
Minimalist ML framework for Rust
### **High-Performance Core Implementation**
- **GLU**: Classic sigmoid-gated activation `σ(x_left) ⊙ x_right`
- **GeGLU**: GELU-gated variant (transformer standard)
- **ReGLU**:...
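The GLU formula quoted above, `σ(x_left) ⊙ x_right`, can be sketched in plain Rust on a flat `f32` slice. This is only an illustration of the arithmetic, not the implementation from the excerpt and not candle's tensor API; the function names `sigmoid` and `glu` and the choice to split the input into two equal halves are assumptions made for the example.

```rust
/// Logistic sigmoid, σ(x) = 1 / (1 + e^(-x)).
fn sigmoid(x: f32) -> f32 {
    1.0 / (1.0 + (-x).exp())
}

/// Hypothetical helper: GLU over a flat slice.
/// The first half of `x` is treated as x_left (the gate input) and the
/// second half as x_right (the values); the result is σ(x_left) ⊙ x_right.
fn glu(x: &[f32]) -> Vec<f32> {
    assert!(x.len() % 2 == 0, "GLU input length must be even");
    let (left, right) = x.split_at(x.len() / 2);
    left.iter()
        .zip(right)
        .map(|(&l, &r)| sigmoid(l) * r)
        .collect()
}
```

For example, `glu(&[0.0, 0.0, 4.0, -2.0])` gates the values `4.0` and `-2.0` by `σ(0) = 0.5`. A gated variant with a different activation would swap `sigmoid` for that activation in the `map` closure.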
Running any of the quantized examples so far, they all seem to have a 1024-token limit. ``` cargo run --example quantized-qwen3 --release --features cuda,cudnn -- --which 4b --prompt "1802tokens...
- Fix LayerNorm.forward() to use tensor operations instead of scalar operations
- Replace sum_keepdim()/size with mean_keepdim() to preserve gradients
- Use broadcast_add() with epsilon tensor instead of scalar addition
-...
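For reference, the forward computation that the mean and epsilon steps above belong to is the standard layer normalization, `(x - mean) / sqrt(var + eps) * gamma + beta`. The sketch below works on plain `f32` slices for a single feature vector; it does not use candle tensors and does not track gradients, so it shows only the arithmetic, not the fix itself. The function name `layer_norm` and its parameter names are assumptions for illustration.

```rust
/// Hypothetical helper: layer normalization of one feature vector.
/// `gamma` and `beta` are the learned per-feature scale and shift.
fn layer_norm(x: &[f32], gamma: &[f32], beta: &[f32], eps: f32) -> Vec<f32> {
    assert!(!x.is_empty(), "input must not be empty");
    assert_eq!(x.len(), gamma.len(), "gamma must match input length");
    assert_eq!(x.len(), beta.len(), "beta must match input length");

    let n = x.len() as f32;
    // Mean over the normalized dimension.
    let mean = x.iter().sum::<f32>() / n;
    // Biased variance (divide by n), as layer normalization conventionally uses.
    let var = x.iter().map(|v| (v - mean).powi(2)).sum::<f32>() / n;
    // Epsilon is added inside the square root for numerical stability.
    let inv_std = 1.0 / (var + eps).sqrt();

    x.iter()
        .zip(gamma)
        .zip(beta)
        .map(|((&v, &g), &b)| (v - mean) * inv_std * g + b)
        .collect()
}
```

With `gamma` set to ones and `beta` to zeros, the output has approximately zero mean and unit variance.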
Hi, is there no way to get word timestamps using Whisper in candle? The example successfully demonstrates retrieving segment timestamps, but how would one retrieve word timestamps?...
# LayerNorm Gradient Flow Issue in candle-nn
## Summary
LayerNorm in candle-nn does not properly propagate gradients through all parameters during backpropagation, causing only 33% of parameters to receive gradients....
CUDA_COMPUTE_CAP="90,100,121" ??
[`DequantizeLinear`](https://onnx.ai/onnx/operators/onnx__DequantizeLinear.html) is related to [dynamic quantization](https://onnxruntime.ai/docs/performance/model-optimizations/quantization.html#dynamic-quantization) as performed using ONNX Runtime.
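As a rough illustration of what the operator computes, the ONNX specification defines the output as `y = (x - x_zero_point) * x_scale`. The sketch below applies that formula per tensor to `u8` inputs in plain Rust; it is not candle's ONNX implementation, the function name `dequantize_linear` is an assumption for the example, and the per-axis form of the operator is not covered.

```rust
/// Hypothetical helper: per-tensor DequantizeLinear for u8 inputs.
/// Computes y = (x - zero_point) * scale for every element.
fn dequantize_linear(x: &[u8], scale: f32, zero_point: u8) -> Vec<f32> {
    x.iter()
        // Widen to i32 before subtracting so values below the zero point
        // become negative instead of wrapping around.
        .map(|&q| (q as i32 - zero_point as i32) as f32 * scale)
        .collect()
}
```

For instance, with `scale = 0.5` and `zero_point = 128`, the quantized value `128` maps back to `0.0`.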