candle
Minimalist ML framework for Rust
Most models use identical or almost identical copies of RotaryEmbedding (cfg.rope_theta vs hardcoded 10000, rope_theta being f32 or f64, chunk() vs 2 calls to narrow()). A few others (mixformer,...
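To illustrate the kind of consolidation the excerpt above describes, here is a minimal sketch of a single rotary-embedding helper that takes `rope_theta` as a parameter instead of hardcoding 10000. It uses only the Rust standard library and plain slices, not candle tensors; the names `inv_freqs` and `apply_rope` are hypothetical and are not part of candle's API. It assumes the half-split layout (first half of the head dimension paired with the second half), which is what a `chunk()` or two `narrow()` calls would produce.

```rust
/// Inverse frequencies for rotary embeddings: 1 / theta^(2i / head_dim).
/// Hypothetical helper, not candle's API. `head_dim` is assumed to be even.
pub fn inv_freqs(head_dim: usize, rope_theta: f64) -> Vec<f64> {
    (0..head_dim / 2)
        .map(|i| 1.0 / rope_theta.powf(2.0 * i as f64 / head_dim as f64))
        .collect()
}

/// Rotate one head vector `x` for token position `pos`.
/// Element i is paired with element i + len/2 (half-split layout).
pub fn apply_rope(x: &[f64], pos: usize, rope_theta: f64) -> Vec<f64> {
    assert!(x.len() % 2 == 0, "head dimension must be even");
    let half = x.len() / 2;
    let freqs = inv_freqs(x.len(), rope_theta);
    let mut out = vec![0.0; x.len()];
    for i in 0..half {
        // Each pair is rotated by the angle pos * inv_freq[i].
        let (sin, cos) = (pos as f64 * freqs[i]).sin_cos();
        out[i] = x[i] * cos - x[i + half] * sin;
        out[i + half] = x[i] * sin + x[i + half] * cos;
    }
    out
}
```

Passing `rope_theta` explicitly lets a model read it from its config or fall back to 10000 at the call site, so the same function would cover both variants mentioned in the excerpt.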
Win11 Error: DriverError(CUDA_ERROR_NOT_FOUND, "named symbol not found") when loading cast_f32_bf16
I am trying to run falcon locally on my machine, on main branch, through: `cargo run --release --features metal --example falcon -- --prompt "write a hello world rust program"` which...
I am testing different model architectures, and when loading the model weights (e.g. for falcon or mamba architectures) with precision either `bf16` or `f16` I usually get this error: `Candle...
CARGO_PROFILE_RELEASE_BUILD_OVERRIDE_DEBUG=true warning: some crates are on edition 2021 which defaults to `resolver = "2"`, but virtual workspaces default to `resolver = "1"` note: to keep the current resolver, specify `workspace.resolver...
Great framework! Is the usage of Metal already possible on iOS? I'm trying to run the Phi example on iOS and I can only get it to work with a...
Operations on tensors with zero-length dimensions are supported in other libraries such as PyTorch, and would be nice to have support for here. For example, when I multiply a 0-by-K...
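The excerpt above is truncated, so the following is only a sketch of the conventional semantics for matrix multiplication with zero-length dimensions, written in plain Rust without candle. The `Matrix` type and `matmul` function are hypothetical names for illustration. Under the usual convention, a 0-by-K matrix times a K-by-N matrix yields an empty 0-by-N result, and an M-by-0 matrix times a 0-by-N matrix yields an M-by-N matrix of zeros (an empty sum).

```rust
/// Row-major matrix with explicit dimensions, so a zero-length
/// dimension is representable. Hypothetical type, not candle's Tensor.
#[derive(Debug, Clone, PartialEq)]
pub struct Matrix {
    pub rows: usize,
    pub cols: usize,
    pub data: Vec<f32>,
}

impl Matrix {
    pub fn zeros(rows: usize, cols: usize) -> Self {
        Matrix { rows, cols, data: vec![0.0; rows * cols] }
    }
}

/// Naive matrix product. Returns None when the inner dimensions differ.
/// Zero-length dimensions need no special case: the loops simply run
/// zero times and the result keeps the expected shape.
pub fn matmul(a: &Matrix, b: &Matrix) -> Option<Matrix> {
    if a.cols != b.rows {
        return None;
    }
    let mut out = Matrix::zeros(a.rows, b.cols);
    for i in 0..a.rows {
        for j in 0..b.cols {
            let mut acc = 0.0f32;
            for k in 0..a.cols {
                acc += a.data[i * a.cols + k] * b.data[k * b.cols + j];
            }
            out.data[i * b.cols + j] = acc;
        }
    }
    Some(out)
}
```

For instance, `matmul(&Matrix::zeros(0, 3), &Matrix::zeros(3, 4))` returns a matrix with shape 0-by-4 and no elements rather than an error.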
I've noticed that the generation diverges after some tokens in comparison to the HF implementation. Is this expected? Here's how to reproduce: **Transformers** ```python import torch from transformers import AutoTokenizer,...
I have followed the tutorial and set up my first rust example. However, I found that the inference speed is faster compared to torch on GPU (780ms per image vs...
Hello! I am a student in Korea. While trying to write training code for an LLM, I encountered a problem. My code is based on the "llama2-c > training.rs" code, which does it like this....