Odd Nydren
I'm using llama.cpp via https://github.com/utilityai/llama-cpp-rs and I get 51 t/s with a 7B model... Candle gives me 19 t/s with the same model. Agree we need Rust native - but at...
MacBook Pro M1 Max - I get the same error... Error building model: cannot find tensor vision_model.transformer.layers.0.self_attn.q_proj.weight
@haricot makes perfect sense, thank you! Looking over my code again, I found other mistakes too - now fixed - and it works.
Hey, very happy to see mistral.rs - thanks Eric :) I'm new to git so just to check... I did the following: Deleted mistral.rs git clone https://github.com/EricLBuehler/mistral.rs.git git switch best_device_in_examples...
Results: ./phi3v Error: Metal contiguous to_dtype F64 F32 not implemented ---- ./gguf_locally general.architecture: llama general.file_type: 17 general.name: mistralai_mistral-7b-instruct-v0.1 general.quantization_version: 2 llama.attention.head_count: 32 llama.attention.head_count_kv: 8 llama.attention.layer_norm_rms_epsilon: 0.00001 llama.block_count: 32 llama.context_length: 32768...
I'm new to Rust so let me know if I'm doing this wrong... bash -c "RUST_BACKTRACE=1 cargo run --example phi3v --release --features metal" Error: Metal contiguous to_dtype F64 F32 not...
Also... general.architecture: llama general.file_type: 17 general.name: mistralai_mistral-7b-instruct-v0.1 general.quantization_version: 2 llama.attention.head_count: 32 llama.attention.head_count_kv: 8 llama.attention.layer_norm_rms_epsilon: 0.00001 llama.block_count: 32 llama.context_length: 32768 llama.embedding_length: 4096 llama.feed_forward_length: 14336 llama.rope.dimension_count: 128 llama.rope.freq_base: 10000 ---> Text: Hello!...
Ah great - definitely have to try mistralrs-bench! Seed can be specified with llama.cpp to make sure you get exactly the same run with a model...that obviously also requires...
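To illustrate why a fixed seed makes runs comparable, here is a minimal sketch of seeded sampling. It is not llama.cpp's or mistral.rs's sampler: `XorShift64` and `sample_tokens` are hypothetical names made up for this example, and the "tokens" are just random indices into an assumed vocabulary size. The only point is that the same seed reproduces the same sequence.

```rust
/// Tiny deterministic PRNG (xorshift64), used here only for illustration.
struct XorShift64 {
    state: u64,
}

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // xorshift must not start from an all-zero state.
        let state = if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed };
        Self { state }
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.state = x;
        x
    }
}

/// Hypothetical stand-in for a sampler: draws `n` token ids below `vocab`.
fn sample_tokens(seed: u64, vocab: usize, n: usize) -> Vec<usize> {
    let mut rng = XorShift64::new(seed);
    (0..n)
        .map(|_| (rng.next_u64() % vocab as u64) as usize)
        .collect()
}

fn main() {
    let first = sample_tokens(42, 32_000, 8);
    let second = sample_tokens(42, 32_000, 8);
    // Same seed, same draws: the two runs are identical.
    println!("same seed gives same run: {}", first == second);
}
```

With identical seeds the two runs match draw for draw, which is what makes a like-for-like comparison possible.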
git switch master --force git pull cargo run --example phi3v --release --features metal thread 'main' panicked at mistralrs/examples/phi3v/main.rs:85:14: internal error: entered unreachable code ---> bash -c "RUST_BACKTRACE=full cargo run --example...
Sure! Result... thread 'main' panicked at mistralrs/examples/phi3v/main.rs:87:39: Model error: Metal error no metal implementation for bitwise. Response: Text: , Prompt T/s: inf, Completion T/s: NaN stack backtrace: 0: 0x104ca2e54...
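A note on the "Prompt T/s: inf, Completion T/s: NaN" part of that output: one plausible explanation (an assumption, not checked against the mistral.rs source) is that throughput is computed as a floating-point division of token count by elapsed time, and the run failed before any time was recorded. The sketch below uses a hypothetical helper, `tokens_per_sec`, to show how plain f64 division produces exactly those two values.

```rust
/// Hypothetical helper: throughput as a plain f64 division.
/// Not taken from mistral.rs; for illustration only.
fn tokens_per_sec(tokens: usize, elapsed_secs: f64) -> f64 {
    tokens as f64 / elapsed_secs
}

fn main() {
    // Some prompt tokens but zero measured time: positive / 0.0 is infinity.
    let prompt = tokens_per_sec(12, 0.0);
    // No completion tokens and zero measured time: 0.0 / 0.0 is NaN.
    let completion = tokens_per_sec(0, 0.0);
    println!("Prompt T/s: {prompt}, Completion T/s: {completion}");
}
```

If that is what happened, the inf/NaN figures are a side effect of the failed generation rather than a separate problem.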