Hub should autodetect users' RAM or VRAM to Recommend Models
Objective
- As part of a larger epic, we need to autodetect the user's hardware and recommend suitable models
- Our long-term goal is to help the user "run best inference quality for their given hardware".
- [Stretch Goal] Is there a way for us to also autoconfigure inference engine params (e.g. GPU layer offloading)?
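For the stretch goal, one possible shape is a pure heuristic that estimates how many transformer layers fit in VRAM, in the style of llama.cpp's `n_gpu_layers`. Everything here is an illustrative assumption (even layer sizes, a fixed headroom for KV cache and scratch buffers), not a tuned or measured implementation:

```python
def gpu_layers_to_offload(vram_gb: float, n_layers: int, model_gb: float) -> int:
    """Estimate how many layers to offload to the GPU, llama.cpp-style.

    Assumptions (hypothetical, for illustration): model weights are split
    roughly evenly across layers, and ~1.5 GB of VRAM is reserved as
    headroom for the KV cache and scratch buffers.
    """
    if n_layers <= 0 or model_gb <= 0:
        return 0
    per_layer_gb = model_gb / n_layers
    usable_gb = max(vram_gb - 1.5, 0.0)  # keep headroom for KV cache etc.
    return min(n_layers, int(usable_gb // per_layer_gb))
```

For example, a 4 GB card loading a 38 GB model split over 40 layers would offload only 2 layers, while a card with room to spare offloads all of them.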
Questions
- What happens if a user has both a GPU (VRAM) and ample system RAM — which do we prioritize?
- Are there edge cases (e.g. a user with a 4090 but somehow only 8 GB of RAM - perhaps Windows eGPU users?)
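One way to frame both questions is a budget function that prefers VRAM when a discrete GPU is present but also caps by system RAM, so the 4090-with-8-GB-RAM case does not get offered a model it cannot load. The model catalogue and all thresholds below are hypothetical placeholders, not Jan's actual recommendation logic:

```python
# Hypothetical catalogue: (name, weight file size in GB), largest first.
MODELS = [
    ("llama-2-70b.Q4", 38.0),
    ("llama-2-13b.Q4", 7.4),
    ("mistral-7b.Q4", 4.1),
    ("tinyllama-1.1b.Q4", 0.7),
]

def memory_budget_gb(ram_gb: float, vram_gb: float) -> float:
    """Usable budget for model weights, leaving headroom for the OS.

    If a discrete GPU is present, prefer VRAM but also cap by system RAM:
    weights pass through RAM during load, so an eGPU machine with little
    RAM should stay conservative. Percentages are illustrative guesses.
    """
    if vram_gb >= 4:
        return min(vram_gb * 0.9, ram_gb * 0.9)
    return ram_gb * 0.6  # CPU-only: leave RAM for the OS and other apps

def recommend_model(ram_gb: float, vram_gb: float = 0.0):
    """Return the largest catalogue model that fits the budget, or None."""
    budget = memory_budget_gb(ram_gb, vram_gb)
    for name, size_gb in MODELS:
        if size_gb <= budget:
            return name
    return None
```

With these numbers, a 64 GB-RAM / 24 GB-VRAM desktop gets the 13B model, while the 24 GB-VRAM / 8 GB-RAM edge case is held back to the 7B model by the RAM cap.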
Notes on some principles to follow:
- Should avoid requiring elevated system permissions (infer hardware through generic, unprivileged interfaces)
- Windows, Linux, and macOS each need a different detection approach
- Should run automatically for first-time users
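A sketch of how these principles could be satisfied: each branch below uses an unprivileged interface (`/proc/meminfo` on Linux, `sysctl` on macOS, `GlobalMemoryStatusEx` on Windows, and `nvidia-smi` for NVIDIA VRAM), so no elevated permissions are needed. This is an assumption-laden sketch — NVIDIA-only VRAM detection, no AMD/Apple-silicon handling — not a complete solution:

```python
import ctypes
import platform
import subprocess

def total_ram_bytes() -> int:
    """Total physical RAM via unprivileged, per-OS interfaces."""
    system = platform.system()
    if system == "Linux":
        with open("/proc/meminfo") as f:  # world-readable pseudo-file
            for line in f:
                if line.startswith("MemTotal:"):
                    return int(line.split()[1]) * 1024  # value is in kB
    if system == "Darwin":
        out = subprocess.check_output(["sysctl", "-n", "hw.memsize"])
        return int(out)
    if system == "Windows":
        class MEMORYSTATUSEX(ctypes.Structure):
            _fields_ = [
                ("dwLength", ctypes.c_ulong),
                ("dwMemoryLoad", ctypes.c_ulong),
                ("ullTotalPhys", ctypes.c_ulonglong),
                ("ullAvailPhys", ctypes.c_ulonglong),
                ("ullTotalPageFile", ctypes.c_ulonglong),
                ("ullAvailPageFile", ctypes.c_ulonglong),
                ("ullTotalVirtual", ctypes.c_ulonglong),
                ("ullAvailVirtual", ctypes.c_ulonglong),
                ("ullAvailExtendedVirtual", ctypes.c_ulonglong),
            ]
        stat = MEMORYSTATUSEX()
        stat.dwLength = ctypes.sizeof(stat)
        ctypes.windll.kernel32.GlobalMemoryStatusEx(ctypes.byref(stat))
        return stat.ullTotalPhys
    raise RuntimeError(f"unsupported platform: {system}")

def total_vram_mb() -> int:
    """Total NVIDIA VRAM in MB via nvidia-smi; 0 if no NVIDIA GPU/driver."""
    try:
        out = subprocess.check_output(
            ["nvidia-smi", "--query-gpu=memory.total",
             "--format=csv,noheader,nounits"],
            text=True,
        )
        return sum(int(x) for x in out.split())  # sum across GPUs
    except (OSError, subprocess.CalledProcessError, ValueError):
        return 0  # treat "no nvidia-smi" as "no NVIDIA VRAM"
```

Because both functions only read public system information, this can run silently on first launch without any permission prompt.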
Deprecated. We have new designs.