Evan Quiney
MLX supports many computational backends - cpu, cuda, metal and gpu. It should be easy to specify this information as a parallel part of the InstanceMeta, and pass it into...
This is an extension of #964 with some cleanup.
## Describe the bug
In the dashboard, the minimum model size is 4GB. This makes it impossible to attempt to load smaller models on devices with 4GB of memory, and often...
## Motivation
We should ensure all runners are connected before loading the model - this gives us finer-grained control in the future for the workers planning mechanism over the...