candle issues

Add high-performance GLU activation variants (GLU, GeGLU, ReGLU) with comprehensive benchmarking

![20250629_1311_GLU Function Graph_simple_compose_01jyxp3a2zevx8b4q9e3wjcvpm](https://github.com/user-attachments/assets/a1cde0fa-2225-4b86-872c-a277532b7ea9)
![20250629_1314_GeGLU and ReGLU Functions_simple_compose_01jyxp8ptne0abjprdmrzhd5b8](https://github.com/user-attachments/assets/b2052191-8926-4ab9-bbee-5146ef184558)
![20250629_1329_Activation Functions Visualized_simple_compose_01jyxq2b5peq9vfb148nhf60nb](https://github.com/user-attachments/assets/84f27919-276e-4d1d-b8e3-d97909a57552)

### **High-Performance Core Implementation**

- **GLU**: Classic sigmoid-gated activation `σ(x_left) ⊙ x_right`
- **GeGLU**: GELU-gated variant (transformer standard)
- **ReGLU**:...
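The gating formula quoted above can be sketched in plain Rust on slices. This is only an illustration and not the code proposed in the issue: it does not use candle tensors, the helper names (`gated`, `glu`, `geglu`, `reglu`) are hypothetical, and it assumes that GeGLU and ReGLU gate the left half in the same way as the GLU formula, with GELU (tanh approximation) and ReLU in place of the sigmoid.

```rust
/// Logistic sigmoid.
fn sigmoid(x: f32) -> f32 {
    1.0 / (1.0 + (-x).exp())
}

/// GELU using the common tanh approximation.
fn gelu(x: f32) -> f32 {
    let c = (2.0f32 / std::f32::consts::PI).sqrt();
    0.5 * x * (1.0 + (c * (x + 0.044715 * x * x * x)).tanh())
}

/// ReLU.
fn relu(x: f32) -> f32 {
    x.max(0.0)
}

/// Hypothetical helper: splits `x` into a left and a right half, applies
/// `gate` to the left half and multiplies element-wise with the right half.
/// Returns `None` when the input length is odd and cannot be split evenly.
fn gated(x: &[f32], gate: fn(f32) -> f32) -> Option<Vec<f32>> {
    if x.len() % 2 != 0 {
        return None;
    }
    let (left, right) = x.split_at(x.len() / 2);
    Some(left.iter().zip(right).map(|(&l, &r)| gate(l) * r).collect())
}

/// GLU as quoted in the issue: sigmoid(x_left) * x_right.
fn glu(x: &[f32]) -> Option<Vec<f32>> {
    gated(x, sigmoid)
}

/// GeGLU, assumed here to be gelu(x_left) * x_right.
fn geglu(x: &[f32]) -> Option<Vec<f32>> {
    gated(x, gelu)
}

/// ReGLU, assumed here to be relu(x_left) * x_right.
fn reglu(x: &[f32]) -> Option<Vec<f32>> {
    gated(x, relu)
}
```

Each function halves the length of its input, so a call such as `glu(&[0.0, 2.0])` yields a single element. A tensor implementation would apply the same split along the last dimension.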

artem1984A

Short context length on Qwen quantized examples.

1

Running any of the quantized examples so far, they all seem to have a 1024-token limit. ``` cargo run --example quantized-qwen3 --release --features cuda,cudnn -- --which 4b --prompt "1802tokens...

AlpineVibrations

Fix LayerNorm gradient flow issue

2

- Fix LayerNorm.forward() to use tensor operations instead of scalar operations
- Replace sum_keepdim()/size with mean_keepdim() to preserve gradients
- Use broadcast_add() with epsilon tensor instead of scalar addition
-...
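For reference, the arithmetic that these bullets rearrange (a mean over the normalized dimension, then an epsilon added to the variance) can be sketched in plain Rust for a single feature vector. This is a minimal sketch under stated assumptions: it is not the candle-nn implementation, it uses no tensors and therefore shows nothing about gradient flow, and the function name `layer_norm` and its slice-based signature are hypothetical.

```rust
/// Hypothetical layer normalization over one non-empty feature vector.
/// `weight` and `bias` are the per-feature scale and shift and are
/// assumed to have the same length as `x`.
fn layer_norm(x: &[f32], weight: &[f32], bias: &[f32], eps: f32) -> Vec<f32> {
    let n = x.len() as f32;
    // Mean over the feature dimension.
    let mean = x.iter().sum::<f32>() / n;
    // Biased variance: the mean of squared deviations.
    let var = x.iter().map(|v| (v - mean).powi(2)).sum::<f32>() / n;
    // Epsilon is added to the variance before the square root.
    let inv_std = 1.0 / (var + eps).sqrt();
    x.iter()
        .zip(weight)
        .zip(bias)
        .map(|((&v, &w), &b)| (v - mean) * inv_std * w + b)
        .collect()
}
```

With a unit `weight` and a zero `bias`, the output has a mean close to zero and a variance close to one.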

tymat

Word Timestamp for whisper

2

Hi, is there no way to get word timestamps using Whisper in candle? The example successfully demonstrates retrieving segment timestamps, but how would one retrieve word timestamps?...

bp7968h

LayerNorm Gradient Flow Issue in candle-nn

Build for multiple arch?

Unsupported ONNX operator: DequantizeLinear