candle issues

failed to build cudarc -- unsupported cuda toolkit version: `11040`

I've gotten this error while trying to build inside the official CUDA Docker image `nvidia/cuda:11.4.3-cudnn8-devel-ubuntu20.04` : ``` error: failed to run custom build command for `cudarc v0.11.1` Caused by: process...

siddthartha

Jina Bert Example fix and more configuration

The Jina Bert example did not show the mean-pooled output and could not run models other than the English one unless they were available locally.

JoanFM

Latest commit on cudarc seems to have broken running the examples codes

28

$ cargo run --example quantized --release --features cuda -- --which 7b-open-chat-3.5 --prompt interactive Updating crates.io index Finished `release` profile [optimized] target(s) in 1.83s Running `target/release/examples/quantized --which 7b-open-chat-3.5 --prompt interactive` avx:...

hololite

Implement `torch.bucketize`

2

Hello all, We are implementing the Idefics 2 model on mistral.rs, but the HF Transformers code [here](https://github.com/huggingface/transformers/blob/c96aca3a8d66d64f868a3e3967be624d79213bef/src/transformers/models/idefics2/modeling_idefics2.py#L181-L182) uses `torch.bucketize` as a critical part of the code. Is it possible to...

EricLBuehler
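
For readers unfamiliar with the operation requested above, here is a minimal pure-Rust sketch of what `torch.bucketize` computes for a sorted 1-D `boundaries` tensor. It is not candle code and not the implementation proposed in the issue; the function name `bucketize` and the use of plain `f32` slices are assumptions made for illustration only.

```rust
/// Sketch of `torch.bucketize` semantics on plain slices (illustrative, not candle API).
/// `boundaries` must be sorted in ascending order.
/// With `right == false`, each value v maps to the first index i with boundaries[i] >= v.
/// With `right == true`, each value v maps to the first index i with boundaries[i] > v.
pub fn bucketize(input: &[f32], boundaries: &[f32], right: bool) -> Vec<usize> {
    input
        .iter()
        .map(|&v| {
            if right {
                // Count boundaries that are <= v.
                boundaries.partition_point(|&b| b <= v)
            } else {
                // Count boundaries that are strictly < v.
                boundaries.partition_point(|&b| b < v)
            }
        })
        .collect()
}
```

Because `partition_point` is a binary search, each lookup costs O(log n) in the number of boundaries. A tensor-level version would need the same logic expressed as a device kernel or composed from existing comparison and sum ops.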

feat(gemm): implement Gemm operator in candle-onnx

Gemm operator is very common in ONNX files generated by Pytorch, such as when using `torch.nn.Linear`. I'm not super proficient in Rust, so appreciate any modification to the current code...

socathie

Example with model via `include_bytes!`?

Re: the use-case I mentioned in #2177 where you can use include_bytes! and SliceSafetensors to ship a model directly embedded in the output binary. Would it be useful to have...

boustrophedon

Whisper microphone example outputs gibberish

5

I am trying to get the Candle Whisper microphone example to work but to no avail. First, I encountered an issue related to the microphone being repeatedly reacquired, which causes...

krzysztofwos

`sort_last_dim` fails on cuda

``` [mlx_core/src/sampler.rs:48:9] &probs = Tensor[dims 32000; f32, cuda:0] thread '' panicked at /src/lib.rs:133:25: run_engine error: DriverError(CUDA_ERROR_INVALID_VALUE, "invalid argument") 0: ::w 1: ::cuda_fwd 2: candle_core::storage::Storage::apply_op1 --- | NVIDIA-SMI 545.23.08 Driver Version:...

lucasavila00

qwen sse api

1

I wrote an example of Qwen's SSE API using Axum.

sunnyregion

Mutable state in `MultiHeadAttention` Structure and its impact on Concurrency

3

Hello Candle team, We found that `&mut self` is used in the [`MultiHeadAttention::forward()`](https://github.com/huggingface/candle/blob/9b1158b3158dae2eafb91e9da126f66bf9e111d6/candle-transformers/src/models/whisper/model.rs#L91C12-L91C12) method because it needs to update `kv_cache`. As a result, everything built upon...

WenqingZong
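
To illustrate the concern raised above, the sketch below shows one possible way to decouple mutable cache state from shared weights: `forward` takes `&self` and the caller passes in a per-request cache. This is not candle's design nor a fix proposed in the issue; `Attention`, `KvCache`, `run_concurrently`, and the toy arithmetic are hypothetical stand-ins chosen only to show the ownership pattern.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for a layer's immutable weights; safe to share across threads.
pub struct Attention {
    pub scale: f32,
}

/// Hypothetical per-request state, owned by the caller rather than by the layer.
#[derive(Default)]
pub struct KvCache {
    keys: Vec<f32>,
}

impl Attention {
    /// `&self` instead of `&mut self`: only the caller-owned cache is mutated.
    pub fn forward(&self, x: f32, cache: &mut KvCache) -> f32 {
        cache.keys.push(x * self.scale);
        cache.keys.iter().sum()
    }
}

/// Runs one sequence per thread against the same shared layer, each with its own cache.
/// Returns the last output of each sequence, in input order.
pub fn run_concurrently(layer: Arc<Attention>, inputs: Vec<Vec<f32>>) -> Vec<f32> {
    let handles: Vec<_> = inputs
        .into_iter()
        .map(|seq| {
            let layer = Arc::clone(&layer);
            thread::spawn(move || {
                let mut cache = KvCache::default();
                let mut last = 0.0;
                for x in seq {
                    last = layer.forward(x, &mut cache);
                }
                last
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

The trade-off in this pattern is that every call site must thread the cache through explicitly, which changes the public signature of `forward`.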
