
Add Multi-GPU Support for LlamaCpp Engine

Open nguyenhoangthuan99 opened this issue 1 year ago • 0 comments

Description

We need to implement multi-GPU support for our LlamaCpp wrapper engine to improve performance and allow users to utilize multiple GPUs effectively.

Goals

  • Allow users to choose which available GPUs to use for running the engine
  • Implement load balancing across selected GPUs
  • Maintain compatibility with single-GPU setups

Proposed Implementation

  1. Detect available GPUs on the system
  2. Add a configuration option for users to specify which GPUs to use
  3. Modify the wrapper engine to distribute workload across selected GPUs

Acceptance Criteria

  • [ ] Users can specify which GPUs to use via configuration
  • [ ] The engine correctly utilizes all selected GPUs

Additional Considerations

  • Ensure proper error handling for scenarios where specified GPUs are unavailable
  • Will we add this feature to model.yml for model management?
  • Does this feature work for both the CLI and the API?
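If the option is surfaced in model.yml, it could sit alongside the existing per-model engine settings. The keys below are purely illustrative, not an agreed schema:

```yaml
# Hypothetical model.yml fields -- key names are illustrative only
ngl: 33          # number of layers to offload to GPU
gpus: [0, 2]     # which detected GPUs to use; omit to use all available
```

Keeping the selection in model.yml would let both the CLI and the API pick it up from the same place, which partly answers the last question above.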

nguyenhoangthuan99 · Oct 02 '24 08:10