Show Whether the CPU or GPU Is Currently in Use
Sometimes ChatMLX generates answers slowly, with CPU usage near 100% and GPU usage at 0%. When that happens, I know GPU acceleration isn't working.
After I quit ChatMLX and restart it, everything works fine again.
So maybe ChatMLX could notify me when performance drops like this.
Can you tell me which model you are using? It might be an issue triggered by MLX.
I didn't notice which model it was at the time, but I only have those models.
I have just tried all of the models above, and they can all use the GPU now.
I removed a model with "vision" in its name yesterday because it didn't work with ChatMLX. Its name began with Phi-3.5, and it was downloaded from the MLX community. Maybe the performance dropped after that model showed an error?