[Feature]: Allow one instance to run multiple tasks/services
Problem
First, thank you for the great work on this project! I have an on-prem server with one A100 GPU, and I'm using ssh-fleet with auto block enabled. My use case is to serve up to 4 different AI services on the same server. However, it seems that one GPU can currently serve only one instance at a time, which makes it difficult to use the available resources efficiently when the GPU is capable of running multiple lighter services simultaneously.
Solution
It would be very helpful if a single GPU instance could run multiple tasks/services in parallel. For example:
- Allow deploying multiple services to the same instance, even if ssh-fleet marks it as busy.
- Optionally, provide a warning to indicate that the instance is already in use, but still let users proceed with deployment.
- Perhaps add a configuration option for resource sharing, so advanced users can control how tasks are scheduled or limited.
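To make the third point concrete, here is a rough sketch of what such a configuration might look like. Every key below (`resource_sharing`, `max_services`, `warn_on_busy`) is invented for illustration; none of this exists in ssh-fleet today, and the actual naming and structure would be up to the maintainers:

```yaml
# Hypothetical per-instance settings -- not an existing ssh-fleet option.
instances:
  - host: a100-server
    gpus: [0]
    resource_sharing:
      enabled: true       # allow multiple services to share one GPU
      max_services: 4     # cap concurrent deployments on this GPU
      warn_on_busy: true  # warn when the GPU is already in use, but do not block
```

With defaults of `enabled: false`, existing deployments would keep the current one-service-per-GPU behavior unless a user opts in.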
Workaround
No response
Would you like to help us implement this feature by sending a PR?
Yes