bug: Model recommendation under thread settings is not based on VRAM when GPU is on
Describe the bug
The model recommendation feature under thread settings in Jan is not accounting for VRAM (Video Random Access Memory) when GPU acceleration is enabled.
Steps to reproduce
- Enable GPU acceleration in Jan.
- Navigate to thread settings.
- Observe the model recommendation provided by Jan.
- Compare the VRAM usage of the recommended model with the available VRAM on the system.
- Notice that the model recommendation is not optimized for the available VRAM; it uses RAM for the comparison instead :x:
Expected behavior
When GPU acceleration is enabled, the model recommendation under thread settings should consider the available VRAM on the system and suggest models that can run efficiently without exceeding VRAM limitations.
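For illustration, a minimal sketch of the expected check, assuming GPU acceleration is on. The `SystemResources`, `ModelInfo`, and `recommendModels` names are hypothetical and not Jan's actual API; the point is only that the budget should switch from RAM to VRAM:

```ts
// Hypothetical shapes, not Jan's real types.
interface SystemResources {
  totalRamBytes: number;
  totalVramBytes: number; // total across GPUs when acceleration is enabled
  gpuAccelerationEnabled: boolean;
}

interface ModelInfo {
  id: string;
  requiredMemoryBytes: number; // estimated memory needed to load the model
}

function recommendModels(models: ModelInfo[], res: SystemResources): ModelInfo[] {
  // When the GPU is on, compare against VRAM, not system RAM.
  const budget = res.gpuAccelerationEnabled ? res.totalVramBytes : res.totalRamBytes;
  return models.filter((m) => m.requiredMemoryBytes <= budget);
}
```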
Environment details
- Operating System: Windows 11
Screenshots
The issue is only happening in the recommendation under thread settings.
On the other hand, the Hub recommends differently, but it is not clear which one is correct.
Logs
If the cause of the error is not clear, kindly provide your usage logs:
- `tail -n 50 ~/jan/logs/app.log` if you are using the UI
- `tail -n 50 ~/jan/logs/server.log` if you are using the local API server

Making sure to redact any private information.
Additional context
Failure to optimize model recommendations based on VRAM usage may lead to inefficient resource allocation, performance degradation, and potential crashes when running models on GPU-accelerated systems.
Chaining this to https://github.com/janhq/jan/issues/2216
Good find!
Another issue: in the Hub, the label currently says "RAM" but we check against VRAM instead (when GPU is on).
Jan v0.4.9-346