Compilation fails on macOS due to .zip(devices)
Describe the bug
Log output for building with --features metal:
error[E0425]: cannot find value devices in this scope
--> mistralrs-core/src/pipeline/isq.rs:194:26
|
194 | .zip(devices)
| ^^^^^^^ help: a local variable with a similar name exists: device
warning: unused import: indicatif::ProgressIterator
--> mistralrs-core/src/pipeline/isq.rs:191:21
|
191 | use indicatif::ProgressIterator;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: #[warn(unused_imports)] on by default
For more information about this error, try rustc --explain E0425.
warning: mistralrs-core (lib) generated 1 warning
error: could not compile mistralrs-core (lib) due to 1 previous error; 1 warning emitted
warning: build failed, waiting for other jobs to finish...
vincent@MBPM3MVLB mistral.rs % cargo build --release --features metal
Compiling mistralrs-core v0.2.5 (/Users/vincent/LLM/mistralrs/mistral.rs/mistralrs-core)
error[E0425]: cannot find value devices in this scope
--> mistralrs-core/src/pipeline/isq.rs:194:26
|
194 | .zip(devices)
| ^^^^^^^ help: a local variable with a similar name exists: device
warning: unused import: indicatif::ProgressIterator
--> mistralrs-core/src/pipeline/isq.rs:191:21
|
191 | use indicatif::ProgressIterator;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: #[warn(unused_imports)] on by default
For more information about this error, try rustc --explain E0425.
warning: mistralrs-core (lib) generated 1 warning
error: could not compile mistralrs-core (lib) due to 1 previous error; 1 warning emitted
Latest commit or version
Which commit or version you ran with: b20b818
Proposed solution
I changed line 194 in isq.rs to .zip(devices_and_dtypes) and that resolved the issue. It also seems like the correct fix to me, since that variable is already used in the code path for non-Metal devices.
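For readers who want to see the shape of the fix, here is a minimal sketch, not the actual mistral.rs source. It assumes the failing line sits in a branch that is only compiled with the metal feature, which would explain why the stale name devices went unnoticed on other builds. Only devices_and_dtypes comes from the report above; the Device and DType enums, the plan_quantization function, and the layer names are hypothetical.

```rust
// Hypothetical sketch of the fix; everything except the name
// `devices_and_dtypes` is invented for illustration.

#[derive(Debug, Clone, Copy, PartialEq)]
enum Device {
    Cpu,
    Metal,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum DType {
    F16,
    F32,
}

/// Pair each layer name with the (device, dtype) it should be quantized on.
fn plan_quantization<'a>(
    layers: &[&'a str],
    devices_and_dtypes: &[(Device, DType)],
) -> Vec<(&'a str, Device, DType)> {
    layers
        .iter()
        // The broken build zipped with `devices`, a name that is not in
        // scope (error E0425). Zipping with the variable that does exist
        // lets the branch compile again.
        .zip(devices_and_dtypes)
        .map(|(name, (device, dtype))| (*name, *device, *dtype))
        .collect()
}
```

Calling plan_quantization(&["q_proj"], &[(Device::Metal, DType::F16)]) yields one entry pairing the layer with its device and dtype. Note that zip stops at the shorter of the two inputs, so a length mismatch would silently drop layers rather than fail.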
@vlbosch thanks for reporting this, I just merged #706. Can you please confirm master works now?
@EricLBuehler Thanks for the quick reply! I can confirm that master builds correctly now.
This may be a separate issue, or I may be misunderstanding how ISQ works, but running a model with ISQ HQQ4 doesn't work. First I tried Mistral Large with the following command: 'mistralrs-server --isq HQQ4 -i --throughput plain --model-id /Users/vincent/LLM/Mistral-Large-Instruct-2407 --arch mistral'
RAM usage keeps growing, which forces zsh to kill the process before it has finished quantizing, even on a 128GB M3 Max. Maybe the quantized files should be written to the SSD as an intermediate step and only loaded into RAM after ISQ is finished? Or am I doing something wrong?
I also tried smaller models, but they never generate a response. Gemma gives the following error:
thread 'main' panicked at mistralrs-core/src/pipeline/isq.rs:200:30:
called Result::unwrap() on an Err value: Msg("Metal strided to_dtype F64 F16 not implemented")
@vlbosch can you please open another issue with the strange behavior, showing the GPU/CPU memory usage during ISQ, perhaps with a video? Closing this now, but please create the other issue!