[Feature]: Allow one instance to run multiple tasks/services
Problem
First, thank you for the great work on this project! I have an on-prem server with one A100 GPU, and I'm using ssh-fleet with auto block enabled. My use case is to serve up to 4 different AI services on the same server. However, it seems that one GPU can currently serve only one instance at a time, which makes it difficult to use the available resources efficiently when the GPU is capable of running multiple lighter services simultaneously.
Solution
It would be very helpful if a single GPU instance could run multiple tasks/services in parallel. For example:
- Allow deploying multiple services to the same instance, even if ssh-fleet marks it as busy.
- Optionally, provide a warning to indicate that the instance is already in use, but still let users proceed with deployment.
- Perhaps add a configuration option for resource sharing, so advanced users can control how tasks are scheduled or limited.
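To make the third point concrete, here is a rough sketch of what such a configuration might look like. Every key below (`resource_sharing`, `max_services`, `warn_on_busy`) is invented for illustration; none of this exists in ssh-fleet today, and the actual naming and structure would be up to the maintainers:

```yaml
# Hypothetical per-instance settings -- not an existing ssh-fleet option.
instances:
  - host: a100-server
    gpus: [0]
    resource_sharing:
      enabled: true       # allow multiple services to share one GPU
      max_services: 4     # cap concurrent deployments on this GPU
      warn_on_busy: true  # warn when the GPU is already in use, but do not block
```

With defaults of `enabled: false`, existing deployments would keep the current one-service-per-GPU behavior unless a user opts in.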
Workaround
No response
Would you like to help us implement this feature by sending a PR?
Yes