candle
Minimalist ML framework for Rust
Could you please show me an example where a Llama-3 model is used (ideally a GGUF-quantized one) and the initial prompt is more than 4096 tokens long? Or better, 16-64K tokens long (for RAG). Currently...
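For reference, a minimal sketch of what chunked prefill over a GGUF-quantized Llama model could look like with `candle_transformers::models::quantized_llama`. The `prefill_long_prompt` helper and the chunk size are assumptions for illustration, the `from_gguf` signature with a device argument matches recent candle versions, and this alone does not raise whatever maximum context length the model configuration allows:

```rust
use candle_core::quantized::gguf_file;
use candle_core::{Device, Result, Tensor};
use candle_transformers::models::quantized_llama::ModelWeights;

// Sketch: feed a long prompt to a GGUF-quantized Llama model in fixed-size
// chunks instead of one giant forward pass.
fn prefill_long_prompt(
    model_path: &std::path::Path,
    prompt_tokens: &[u32],
    device: &Device,
) -> Result<Tensor> {
    let mut file = std::fs::File::open(model_path)?;
    let content = gguf_file::Content::read(&mut file)?;
    // Assumes a recent candle-transformers where `from_gguf` takes the device.
    let mut model = ModelWeights::from_gguf(content, &mut file, device)?;

    const CHUNK: usize = 512; // arbitrary prefill chunk size
    let mut logits = None;
    for (i, chunk) in prompt_tokens.chunks(CHUNK).enumerate() {
        let input = Tensor::new(chunk, device)?.unsqueeze(0)?;
        // `index_pos` marks where this chunk starts so the KV cache and
        // rotary embeddings stay consistent across chunks.
        logits = Some(model.forward(&input, i * CHUNK)?);
    }
    logits.ok_or_else(|| candle_core::Error::Msg("empty prompt".to_string()))
}
```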
`RmsNorm` switches to a faster implementation if the tensor is contiguous: https://github.com/huggingface/candle/blob/82b641fd2752e3b14db6a9c91faef70e3329f3b5/candle-nn/src/layer_norm.rs#L174-L175 But that implementation does not support the backward pass: https://github.com/huggingface/candle/blob/82b641fd2752e3b14db6a9c91faef70e3329f3b5/candle-nn/src/ops.rs#L640 Maybe it's better to implement `ModuleT` rather than `Module` for `RmsNorm` and...
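A rough sketch of what that suggestion could look like; the `RmsNormT` wrapper is hypothetical and assumes `candle_nn::ops::rms_norm` is the fused kernel in question:

```rust
use candle_core::{DType, Result, Tensor, D};
use candle_nn::ModuleT;

// Hypothetical wrapper: only take the fused (non-differentiable) path when
// not training, otherwise fall back to plain differentiable ops.
struct RmsNormT {
    weight: Tensor,
    eps: f64,
}

impl ModuleT for RmsNormT {
    fn forward_t(&self, xs: &Tensor, train: bool) -> Result<Tensor> {
        if !train && xs.is_contiguous() {
            // Fast fused kernel (assumed to be the op without a backward pass).
            candle_nn::ops::rms_norm(xs, &self.weight, self.eps as f32)
        } else {
            // Differentiable fallback built from basic ops.
            let dtype = xs.dtype();
            let xs = xs.to_dtype(DType::F32)?;
            let variance = xs.sqr()?.mean_keepdim(D::Minus1)?;
            let denom = (variance + self.eps)?.sqrt()?;
            let xs = xs.broadcast_div(&denom)?;
            xs.to_dtype(dtype)?.broadcast_mul(&self.weight)
        }
    }
}
```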
I made some modifications to the llama3 example code so it would run locally, but I encountered an error during execution. I am using a MacBook with an M3...
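Since the report mentions running locally on Apple silicon, one thing worth checking is the device setup. A minimal, hypothetical sketch of picking the Metal backend with a CPU fallback (assuming the binary is built with candle's `metal` feature; the actual error from the report is not reproduced here):

```rust
use candle_core::{Device, Result};

/// Try the Metal backend first (Apple silicon), fall back to CPU otherwise.
fn pick_device() -> Result<Device> {
    match Device::new_metal(0) {
        Ok(device) => Ok(device),
        Err(e) => {
            eprintln!("metal unavailable ({e}), falling back to CPU");
            Ok(Device::Cpu)
        }
    }
}
```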
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.
OS: Windows 11
Model: [maidalun1020/bce-embedding-base_v1](https://huggingface.co/maidalun1020/bce-embedding-base_v1)
Command:
```sh
cargo run --features mkl --example bert --release -- --model-id maidalun1020/bce-embedding-base_v1 --use-pth
```
Candle (took ~15s):
```rust
let s = std::fs::read_to_string("test.txt")?;
// split...
```
Updates the requirements on [imageproc](https://github.com/image-rs/imageproc) to permit the latest version. Changelog sourced from imageproc's changelog:

0.24.0 - 2024-03-16
New features:
- Added BRIEF descriptors
- Added draw_antialiased_polygon
- Added draw_hollow_polygon, draw_hollow_polygon_mut
- Added contour_area...
Hi all. I'm currently working on an implementation of a quantized version of Mixtral 8x22B. I'm using the weights from the following repo: [MaziyarPanahi/Mixtral-8x22B-Instruct-v0.1-GGUF](https://huggingface.co/MaziyarPanahi/Mixtral-8x22B-Instruct-v0.1-GGUF). Unfortunately, the expert-related tensors for each layer...
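One way to see which expert tensors a GGUF file actually exposes is to list its tensor metadata. A small sketch; the name filter and the `blk.N.ffn_gate_exps.weight`-style naming are assumptions about how the conversion lays out MoE weights:

```rust
use candle_core::quantized::gguf_file;

// Sketch: list expert-related tensors present in a GGUF file.
fn list_expert_tensors(path: &std::path::Path) -> candle_core::Result<()> {
    let mut file = std::fs::File::open(path)?;
    let content = gguf_file::Content::read(&mut file)?;
    for (name, info) in content.tensor_infos.iter() {
        // MoE exports often use names like "blk.0.ffn_gate_exps.weight",
        // but the exact layout depends on the conversion script.
        if name.contains("exps") || name.contains("expert") {
            println!("{name}: shape {:?}, dtype {:?}", info.shape, info.ggml_dtype);
        }
    }
    Ok(())
}
```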
Hi! Looking into adding mkl, but currently the doc linked here: https://huggingface.github.io/candle/guide/advanced/mkl.html (via https://huggingface.github.io/candle/guide/installation.html -> "Using mkl" at the bottom) seems to be a dead link. Cheers, Z
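Until that page is restored, the backend is typically enabled through the `mkl` cargo feature, matching the `--features mkl` flag used in the bert report above; a minimal sketch (exact invocation may differ from what the missing guide page describes):

```sh
# Enable the MKL backend when building candle-based binaries (x86 CPUs).
cargo build --release --features mkl
# Or, for an example inside the candle repo:
cargo run --release --features mkl --example bert
```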
I have an ONNX model converted from PyTorch, but I encountered an error when running inference with candle. I have confirmed that this ONNX model works properly in Python. I also checked...
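For comparison, a minimal sketch of the candle-onnx inference path; the input name "input" and the (1, 3, 224, 224) shape are placeholders for whatever the exported graph actually expects:

```rust
use std::collections::HashMap;

use candle_core::{DType, Device, Tensor};

// Sketch: load an ONNX graph and run it on a dummy input with candle-onnx.
fn run_onnx(path: &str) -> candle_core::Result<()> {
    let model = candle_onnx::read_file(path)?;
    // Placeholder input name/shape; use whatever the exported graph expects.
    let input = Tensor::zeros((1, 3, 224, 224), DType::F32, &Device::Cpu)?;
    let mut inputs = HashMap::new();
    inputs.insert("input".to_string(), input);
    let outputs = candle_onnx::simple_eval(&model, inputs)?;
    for (name, tensor) in outputs.iter() {
        println!("{name}: {:?}", tensor.shape());
    }
    Ok(())
}
```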