candle

Minimalist ML framework for Rust

407 candle issues

This commit refactors the previously separate implementations of arithmetic operations (Add, Sub, Mul, Div) between f64 and Tensor types into a single, reusable macro `impl_f64_tensor_ops`.
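
A minimal sketch of that pattern, using a toy `Tensor` type (the real impls live inside candle-core, and Rust's orphan rule prevents writing them outside that crate): one macro expands into both the `f64 <op> Tensor` and `Tensor <op> f64` impls for all four operators.

```rust
use std::ops::{Add, Div, Mul, Sub};

// Toy stand-in for candle's Tensor, just to show the macro shape.
#[derive(Debug, Clone)]
struct Tensor(Vec<f64>);

macro_rules! impl_f64_tensor_ops {
    ($($trait:ident, $fn:ident, $op:tt);+ $(;)?) => {
        $(
            // f64 <op> Tensor
            impl $trait<Tensor> for f64 {
                type Output = Tensor;
                fn $fn(self, rhs: Tensor) -> Tensor {
                    Tensor(rhs.0.iter().map(|x| self $op x).collect())
                }
            }
            // Tensor <op> f64
            impl $trait<f64> for Tensor {
                type Output = Tensor;
                fn $fn(self, rhs: f64) -> Tensor {
                    Tensor(self.0.iter().map(|x| x $op rhs).collect())
                }
            }
        )+
    };
}

impl_f64_tensor_ops!(Add, add, +; Sub, sub, -; Mul, mul, *; Div, div, /);

fn main() {
    let t = Tensor(vec![1.0, 2.0, 3.0]);
    println!("{:?}", 2.0 * t + 1.0); // Tensor([3.0, 5.0, 7.0])
}
```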

First of all, thanks for candle; it's an impressive project, and as a user the speed at which updates are coming is really cool! I'm trying to use a speedyspeech ONNX model...

I only see that candle returns `last_hidden_state`, but not `all_hidden_states` and `attentions`. I want to get the attentions. Can I submit a PR to add this? I originally wanted to define...
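
For reference, the usual shape of such a change, sketched here with hypothetical `EncoderLayer` and `ModelOutput` types rather than candle's actual model code: accumulate each layer's hidden state and attention weights as the forward pass runs.

```rust
use candle_core::{Device, Result, Tensor};

// Hypothetical stand-in for a real transformer layer.
struct EncoderLayer;

impl EncoderLayer {
    // Returns (hidden_state, attention_weights); both are dummies here.
    fn forward(&self, xs: &Tensor) -> Result<(Tensor, Tensor)> {
        let attn = xs.ones_like()?; // placeholder for real attention probs
        Ok((xs.clone(), attn))
    }
}

// Hypothetical output struct mirroring transformers-style outputs.
struct ModelOutput {
    last_hidden_state: Tensor,
    all_hidden_states: Vec<Tensor>,
    attentions: Vec<Tensor>,
}

fn forward(layers: &[EncoderLayer], input: &Tensor) -> Result<ModelOutput> {
    let mut hidden = input.clone();
    // Include the embedding output, as transformers does.
    let mut all_hidden_states = vec![hidden.clone()];
    let mut attentions = Vec::new();
    for layer in layers {
        let (h, attn) = layer.forward(&hidden)?;
        hidden = h;
        all_hidden_states.push(hidden.clone());
        attentions.push(attn);
    }
    Ok(ModelOutput { last_hidden_state: hidden, all_hidden_states, attentions })
}

fn main() -> Result<()> {
    let input = Tensor::zeros((1, 4, 8), candle_core::DType::F32, &Device::Cpu)?;
    let out = forward(&[EncoderLayer, EncoderLayer], &input)?;
    println!("{} hidden states, {} attention maps",
             out.all_hidden_states.len(), out.attentions.len());
    Ok(())
}
```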

Hello, there appears to be a bug in the trocr example, possibly in the way images are tokenized. For example, this image produces 754754.7 instead of 754.7. I have added...

Currently, `QTensor::quantize`:
- Takes a tensor (assume it is on the GPU for this example)
- Copies the data to the CPU
- Quantizes on the CPU
- Copies the...
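
A minimal round-trip that exercises this path might look as follows; this is a sketch, and the `QTensor::quantize`/`dequantize` signatures may differ slightly between candle versions.

```rust
use candle_core::quantized::{GgmlDType, QTensor};
use candle_core::{Device, Result, Tensor};

fn main() -> Result<()> {
    let device = Device::Cpu; // swap for Device::new_cuda(0)? to hit the GPU path
    let xs = Tensor::randn(0f32, 1f32, (256, 256), &device)?;
    // As the issue describes, this currently round-trips through the CPU
    // even when `xs` lives on the GPU.
    let q = QTensor::quantize(&xs, GgmlDType::Q4_0)?;
    let back = q.dequantize(&device)?;
    println!("{:?} -> {:?}", xs.dtype(), back.dtype());
    Ok(())
}
```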

Try to resolve https://github.com/huggingface/candle/issues/2294

outputs: Err(WithBacktrace { inner: Msg("unsupported op_type Split for op NodeProto { input: [\"/model.2/cv1/act/Mul_output_0\", \"onnx::Split_64\"], output: [\"/model.2/Split_output_0\", \"/model.2/Split_output_1\"], name: \"/model.2/Split\", op_type: \"Split\", domain: \"\", attribute: [AttributeProto { name: \"axis\", ref_attr_name: \"\", doc_string:...

I am trying to implement an adaptive avg pool in candle. However, my implementation will likely require an API to get the raw data/storage (stored in a plain/flattened array format)...
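
One possible workaround while such an API is missing: build adaptive average pooling out of existing ops (`narrow`, `mean_keepdim`, `cat`) instead of raw storage access. The `adaptive_avg_pool1d` helper below is hypothetical, not part of candle.

```rust
use candle_core::{Device, Result, Tensor};

/// (N, C, L) -> (N, C, out_len), mimicking PyTorch's adaptive_avg_pool1d.
fn adaptive_avg_pool1d(xs: &Tensor, out_len: usize) -> Result<Tensor> {
    let (_n, _c, len) = xs.dims3()?;
    let mut slices = Vec::with_capacity(out_len);
    for i in 0..out_len {
        // PyTorch-style bin boundaries: [floor(i*L/out), ceil((i+1)*L/out))
        let start = i * len / out_len;
        let end = ((i + 1) * len + out_len - 1) / out_len;
        let mean = xs.narrow(2, start, end - start)?.mean_keepdim(2)?;
        slices.push(mean);
    }
    Tensor::cat(&slices, 2)
}

fn main() -> Result<()> {
    let xs = Tensor::arange(0f32, 12., &Device::Cpu)?.reshape((1, 2, 6))?;
    let pooled = adaptive_avg_pool1d(&xs, 3)?; // means of [0,1], [2,3], [4,5], ...
    println!("{pooled}");
    Ok(())
}
```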

Qwen/Qwen2-1.5B works correctly in the example, but Qwen/Qwen2-7B doesn't. `(base) lyn@A100DEV:~/workspace/candle/candle-examples$ cargo run --release --features cuda --example qwen -- --model 2-7b --prompt "Hello\n" Finished release [optimized] target(s) in...

#### Very barebones implementation of Llama multinode for distributed inference
- Adds support for running Llama model inference across multiple nodes and GPUs.
- A simple TCP server to exchange...
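
For illustration only (this is not the PR's actual code): a length-prefixed framing over `std::net` is one minimal way such a server could exchange raw activation bytes between nodes.

```rust
use std::io::{Read, Result, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

// Write a u64 little-endian length header, then the payload.
fn send_blob(stream: &mut TcpStream, data: &[u8]) -> Result<()> {
    stream.write_all(&(data.len() as u64).to_le_bytes())?;
    stream.write_all(data)
}

// Read the length header, then exactly that many payload bytes.
fn recv_blob(stream: &mut TcpStream) -> Result<Vec<u8>> {
    let mut len = [0u8; 8];
    stream.read_exact(&mut len)?;
    let mut buf = vec![0u8; u64::from_le_bytes(len) as usize];
    stream.read_exact(&mut buf)?;
    Ok(buf)
}

fn main() -> Result<()> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    let server = thread::spawn(move || -> Result<Vec<u8>> {
        let (mut s, _) = listener.accept()?;
        recv_blob(&mut s)
    });
    let mut client = TcpStream::connect(addr)?;
    send_blob(&mut client, &[1, 2, 3, 4])?; // e.g. a flattened activation
    println!("received {:?}", server.join().unwrap()?);
    Ok(())
}
```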