replicate-python

Deployment functionality

Open vishnubob opened this issue 2 years ago • 2 comments

I would like to be able to spin up and shutdown deployments from the API. From looking over the API and python client, this doesn’t seem possible. Am I missing something or would it be possible to add this functionality?

Thanks!

vishnubob avatar Dec 25 '23 21:12 vishnubob

Hi, @vishnubob. You're correct that Replicate doesn't currently expose any APIs for managing deployments. However, you can configure your deployment with a min / max number of concurrent predictions to handle, and the autoscaler will spin model instances up and down based on inbound requests.

mattt avatar Jan 30 '24 18:01 mattt

Hi @mattt, thanks for your response. I am using Replicate for an interactive photobooth, so my use case is a bit unusual. Since the installation is temporary, I only need the deployment while the installation is available. To reduce latency, I stand up a single-node deployment while the installation is available, and spin down the nodes when I strike. However, it's a complicated installation, and I sometimes forget to spin down the deployments during strike, so I end up paying for idle deployments. Being able to automate the deployment from the software would be a huge win.

For now, I have transitioned this part of the project to Tailscale, which lets me use my own server at home, but if I could automate the deployment, I would switch back to using Replicate.

vishnubob avatar Jan 31 '24 00:01 vishnubob

Closing the loop on this —

Replicate's API now supports creating and modifying deployments. Support for these endpoints was added to the Python client by https://github.com/replicate/replicate-python/pull/258, and is available in recent versions.

mattt avatar Jul 18 '24 12:07 mattt
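
For the photobooth case described above, one way to avoid forgetting to spin down is to tie scaling to a `with` block. The following is only a sketch, not the client's documented recipe: it assumes the client exposes `deployments.update(owner, name, min_instances=..., max_instances=...)`, and `deployment_running` is a hypothetical helper name. Check the call shape against the client version you have installed.

```python
from contextlib import contextmanager


@contextmanager
def deployment_running(client, owner, name, instances=1):
    """Keep a deployment warm for the duration of a `with` block.

    `client` is any object with a `deployments.update(...)` method.
    The keyword names below are assumptions about the client's API.
    """
    # Pin the deployment to a fixed number of instances while in use.
    client.deployments.update(
        owner, name, min_instances=instances, max_instances=instances
    )
    try:
        yield
    finally:
        # Runs even if the body raises: drop the minimum to zero so the
        # autoscaler is free to scale idle instances down.
        client.deployments.update(owner, name, min_instances=0)
```

Usage would look something like `with deployment_running(replicate.Client(), "acme", "photobooth"): run_installation()`, where `"acme"`, `"photobooth"`, and `run_installation` are placeholders. Because the scale-down sits in a `finally` clause, it runs whether the installation software exits normally or crashes.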