[Feature] Provide a public API to scale services
Use case: scale services up and down programmatically via an API.
This is related to supporting updating the service replicas parameter in configuration without re-creating the service, that is, service configuration in-place updates.
Not sure. This issue is about providing an API that does what the server already does when it auto-scales a running service based on its scaling configuration. That means we already have an internal API for this; we just need to make it public.
@peterschmidt85, we already have a public API that allows specifying replicas. If we extend it to support in-place updates of the service configuration, users could use it to change replicas and implement custom autoscaling on top of it.
After #1958, users can change replicas and scaling parameters via dstack apply or Python/HTTP API:
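import os

from dstack.api.server import APIClient

# Read the server URL, user token, and project name from the environment.
url = os.environ["DSTACK_URL"]
token = os.environ["DSTACK_TOKEN"]
project = os.environ["DSTACK_PROJECT"]

client = APIClient(base_url=url, token=token)

# Fetch the existing run and change only the replica count in its spec.
run = client.runs.get(project, "my-run")
new_run_spec = run.run_spec
new_run_spec.configuration.replicas = 3

# Get a plan for the updated spec and apply it to update the run in place.
plan = client.runs.get_plan(project, new_run_spec)
updated_run = client.runs.apply_plan(project, plan)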
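For the custom autoscaling mentioned earlier in the thread, here is a rough sketch of how a scaling decision could be layered on top of these calls. The helper names `desired_replicas` and `scale_run`, the requests-per-second metric, and the min/max bounds are all hypothetical and not part of dstack; only the `client.runs.get`, `get_plan`, and `apply_plan` calls are taken from the snippet above. Where the load metric comes from is left to the caller.

```python
import math


def desired_replicas(
    current_rps: float,
    target_rps_per_replica: float,
    min_replicas: int,
    max_replicas: int,
) -> int:
    """Hypothetical policy: replicas needed to keep per-replica load near the target."""
    if target_rps_per_replica <= 0:
        raise ValueError("target_rps_per_replica must be positive")
    if min_replicas > max_replicas:
        raise ValueError("min_replicas must not exceed max_replicas")
    needed = math.ceil(current_rps / target_rps_per_replica)
    # Clamp the result to the allowed range.
    return max(min_replicas, min(max_replicas, needed))


def scale_run(client, project: str, run_name: str, replicas: int):
    """Hypothetical wrapper around the same three calls as the snippet above.

    `client` is assumed to be a dstack.api.server.APIClient instance.
    """
    run = client.runs.get(project, run_name)
    new_run_spec = run.run_spec
    new_run_spec.configuration.replicas = replicas
    plan = client.runs.get_plan(project, new_run_spec)
    return client.runs.apply_plan(project, plan)
```

A caller would measure load on its own schedule and then run something like `scale_run(client, project, "my-run", desired_replicas(rps, 100, 1, 5))`. Keeping the policy a pure function separate from the API call makes it easy to test without a running server.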