cortex
bug: Nitro CUDA on Windows has low performance on a machine with multiple GPUs (tested using the Jan app)
Describe the bug
My Windows machine has 3 GPUs. With all 3 GPUs enabled, token speed was slow (6-9 tokens/s) and Nitro could not even load tinyllama 1B. With 2 GPUs disabled and only 1 active, performance returned to normal.
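To compare the two configurations without disabling devices in Windows, it may be possible to hide GPUs from the CUDA runtime instead. The sketch below is an assumption about how one might reproduce the single-GPU case, not something taken from Nitro or Jan: `CUDA_VISIBLE_DEVICES` and `CUDA_DEVICE_ORDER` are standard CUDA environment variables, while the helper names `single_gpu_env` and `launch_with_single_gpu` and the executable name in the usage note are hypothetical.

```python
import os
import subprocess
from typing import Mapping, Optional, Sequence


def single_gpu_env(gpu_index: int,
                   base_env: Optional[Mapping[str, str]] = None) -> dict:
    """Return a copy of the environment that exposes one GPU to CUDA.

    Hypothetical helper: it only prepares environment variables and
    does not touch the driver or the device state.
    """
    if gpu_index < 0:
        raise ValueError("gpu_index must be non-negative")
    env = dict(os.environ if base_env is None else base_env)
    # Enumerate devices in PCI bus order so the index is intended to
    # match the order reported by nvidia-smi.
    env["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    # The CUDA runtime reads this when the process initializes CUDA;
    # all other GPUs are hidden from that process.
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


def launch_with_single_gpu(command: Sequence[str],
                           gpu_index: int = 0) -> subprocess.Popen:
    """Start `command` with only one GPU visible (command is an assumption)."""
    return subprocess.Popen(list(command), env=single_gpu_env(gpu_index))
```

For example, `launch_with_single_gpu(["nitro.exe"], gpu_index=0)` would start the server with one GPU visible, assuming the binary is named `nitro.exe` and is on `PATH`. If performance is normal in that setup and degrades once the other GPUs are visible, that would point at multi-GPU handling rather than at the driver or the model.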
Screenshots
- 3 GPUs active
  - Low performance
  - Load tinyllama error
- 1 GPU active only: performance was back to normal
Desktop (please complete the following information):
- OS: Windows 11
- NVIDIA driver: 531.18
- CUDA version: 12.3
- Nitro version: 0.1.27
- GPU:
  - 1x RTX 4070 Ti
  - 2x GTX 1660 Ti