candle
Minimalist ML framework for Rust
This commit refactors the previously separate implementations of arithmetic operations (Add, Sub, Mul, Div) between f64 and Tensor types into a single, reusable macro `impl_f64_tensor_ops`.
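A minimal sketch of that pattern, using a toy `Tensor` type rather than candle's real one; only the macro name comes from the commit, and the delegation through a `scalar_op` helper is an illustrative assumption:

```rust
// Toy stand-in for candle's Tensor, just to show the macro's shape.
#[derive(Debug, Clone, Copy)]
struct Tensor(f64);

impl Tensor {
    // Hypothetical helper: apply a scalar/element binary function.
    fn scalar_op(self, rhs: f64, f: impl Fn(f64, f64) -> f64) -> Tensor {
        Tensor(f(self.0, rhs))
    }
}

// One macro generates both directions (f64 op Tensor and Tensor op f64)
// for each arithmetic trait, instead of four near-identical impl blocks.
macro_rules! impl_f64_tensor_ops {
    ($($op_trait:ident, $op_fn:ident, $op:tt);*) => {
        $(
            // f64 <op> Tensor
            impl std::ops::$op_trait<Tensor> for f64 {
                type Output = Tensor;
                fn $op_fn(self, rhs: Tensor) -> Tensor {
                    rhs.scalar_op(self, |t, s| s $op t)
                }
            }
            // Tensor <op> f64
            impl std::ops::$op_trait<f64> for Tensor {
                type Output = Tensor;
                fn $op_fn(self, rhs: f64) -> Tensor {
                    self.scalar_op(rhs, |t, s| t $op s)
                }
            }
        )*
    };
}

impl_f64_tensor_ops!(Add, add, +; Sub, sub, -; Mul, mul, *; Div, div, /);

fn main() {
    let t = Tensor(2.0);
    println!("{:?} {:?}", 3.0_f64 - t, t / 4.0); // Tensor(1.0) Tensor(0.5)
}
```

Expanding both operand orders from a single macro invocation is what removes the previously duplicated hand-written impl blocks.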
First of all, thanks for candle: it's an impressive project, and as a user the speed at which updates are coming is really cool! I'm trying to use a speedyspeech onnx model...
I only see that candle returns last_hidden_state, but not all_hidden_states and attentions. I want to get the attentions. Can I submit a PR to add this? I originally wanted to define...
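One possible shape for such a change, as a sketch only (the struct and field names below are assumptions, not candle's existing API):

```rust
use candle_core::Tensor;

/// Hypothetical richer output type: keep `last_hidden_state` as today, and
/// optionally collect per-layer hidden states and attention weights when the
/// caller asks for them.
pub struct ModelOutput {
    pub last_hidden_state: Tensor,
    pub all_hidden_states: Option<Vec<Tensor>>,
    pub attentions: Option<Vec<Tensor>>,
}
```

The `Option`s keep the default forward pass cheap: callers that don't request the extra tensors pay nothing for them.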
Hello, there appears to be a bug in the trocr example, possibly with the way images are tokenized. For example, this image produces 754754.7 instead of 754.7. I have added...
Currently, `QTensor::quantize`:
- Takes a tensor (assume it is on the GPU for this example)
- Copies the data to the CPU
- Quantizes on the CPU
- Copies the...
Try to resolve https://github.com/huggingface/candle/issues/2294
outputs: Err(WithBacktrace { inner: Msg("unsupported op_type Split for op NodeProto { input: [\"/model.2/cv1/act/Mul_output_0\", \"onnx::Split_64\"], output: [\"/model.2/Split_output_0\", \"/model.2/Split_output_1\"], name: \"/model.2/Split\", op_type: \"Split\", domain: \"\", attribute: [AttributeProto { name: \"axis\", ref_attr_name: \"\", doc_string:...
I am trying to implement an adaptive avg pool in candle. However, I guess my implementation will require an API to get the raw data/storage (stored in a plain/flattened array format)...
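For reference, this is roughly what adaptive average pooling looks like over a flat array in plain Rust, independent of candle's storage API; the bucket boundaries follow the usual floor/ceil rule:

```rust
/// 1-D adaptive average pooling over a flat slice: the input is split into
/// `output_len` nearly equal buckets and each bucket is averaged.
fn adaptive_avg_pool1d(input: &[f32], output_len: usize) -> Vec<f32> {
    let n = input.len();
    (0..output_len)
        .map(|i| {
            // Bucket i covers floor(i*n/out) .. ceil((i+1)*n/out).
            let start = i * n / output_len;
            let end = ((i + 1) * n + output_len - 1) / output_len;
            let bucket = &input[start..end];
            bucket.iter().sum::<f32>() / bucket.len() as f32
        })
        .collect()
}

fn main() {
    let x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0];
    // Pool 7 values down to 3 outputs: [2.0, 4.0, 6.0].
    println!("{:?}", adaptive_avg_pool1d(&x, 3));
}
```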
Qwen/Qwen2-1.5B works correctly in the example, but Qwen/Qwen2-7B doesn't. `(base) lyn@A100DEV:~/workspace/candle/candle-examples$ cargo run --release --features cuda --example qwen -- --model 2-7b --prompt "Hello\n" Finished release [optimized] target(s) in...
#### Very barebones implementation of Llama multinode for distributed inference
- Adds support for running Llama model inference across multiple nodes and GPUs.
- A simple TCP server to exchange...
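A minimal sketch of the kind of exchange such a setup relies on, using only `std::net` rather than the PR's actual code: one node echoes a length-prefixed buffer of activations back to the sender.

```rust
use std::io::{Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

fn main() -> std::io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;

    // "Remote" node: receive a length-prefixed byte buffer and echo it back.
    let server = thread::spawn(move || -> std::io::Result<()> {
        let (mut stream, _) = listener.accept()?;
        let mut len_buf = [0u8; 8];
        stream.read_exact(&mut len_buf)?;
        let len = u64::from_le_bytes(len_buf) as usize;
        let mut payload = vec![0u8; len];
        stream.read_exact(&mut payload)?;
        stream.write_all(&len_buf)?;
        stream.write_all(&payload)?;
        Ok(())
    });

    // "Local" node: serialize some f32 activations and send them over.
    let activations: Vec<f32> = vec![0.1, 0.2, 0.3, 0.4];
    let bytes: Vec<u8> = activations.iter().flat_map(|x| x.to_le_bytes()).collect();

    let mut stream = TcpStream::connect(addr)?;
    stream.write_all(&(bytes.len() as u64).to_le_bytes())?;
    stream.write_all(&bytes)?;

    // Read the echoed buffer back and decode it.
    let mut len_buf = [0u8; 8];
    stream.read_exact(&mut len_buf)?;
    let mut reply = vec![0u8; u64::from_le_bytes(len_buf) as usize];
    stream.read_exact(&mut reply)?;
    let decoded: Vec<f32> = reply
        .chunks_exact(4)
        .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect();
    println!("round-tripped activations: {:?}", decoded);

    server.join().unwrap()?;
    Ok(())
}
```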