cortex.cpp
Add Multi-GPU Support for LlamaCpp Engine
Description
We need to implement multi-GPU support in our LlamaCpp wrapper engine so that users can effectively utilize multiple GPUs and improve inference performance.
Goals
- Allow users to choose which available GPUs to use for running the engine
- Implement load balancing across selected GPUs
- Maintain compatibility with single-GPU setups
Proposed Implementation
- Detect available GPUs on the system
- Add a configuration option for users to specify which GPUs to use
- Modify the wrapper engine to distribute workload across selected GPUs
Acceptance Criteria
- [ ] Users can specify which GPUs to use via configuration
- [ ] The engine correctly utilizes all selected GPUs
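If this lands in model.yml (see the open question below), the configuration surface might look something like this. All field names here are illustrative placeholders, not a committed schema; `tensor_split` and `main_gpu` echo the corresponding llama.cpp parameters.

```yaml
# Illustrative sketch only -- field names are not final
gpus: [0, 1]          # GPU indices the engine may use
tensor_split: auto    # or explicit fractions, e.g. [0.25, 0.75]
main_gpu: 0           # device that hosts small tensors and scratch buffers
```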
Additional Considerations
- Ensure proper error handling for scenarios where specified GPUs are unavailable
- Will we add this feature to model.yml for model management?
- Does this feature work for both the CLI and the API?