Cloud serving option
I'd love it if this were on ramalama's roadmap:
ramalama serve-remote-gcp <model_path>
I'd imagine in this case it would look for GCP credentials in an environment variable and then use the GCP SDK to create a templated VM primed for running the model (big GPU, etc.), probably with ramalama auto-installed on that VM.
I'd love if it came with a timer and cost estimator too :-) like "Your ramalama remote is now available at <temp_GCP_world_addressable_IP>:8080, it will run for one hour and should cost about $1.56"
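To make the cost message concrete, here is a rough sketch of how the estimate could be computed. Everything in it is an assumption for illustration: `RemoteSession` and its fields are hypothetical names, the hourly rate is simply passed in (a real tool would need to look up provider pricing), and the estimate ignores disk, egress, and similar charges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RemoteSession:
    """Hypothetical description of a time-boxed remote serving session."""

    address: str             # public IP or hostname of the VM
    port: int                # port the model server listens on
    hourly_rate_usd: float   # assumed on-demand price, supplied by the caller
    duration_minutes: int    # how long the VM runs before teardown

    def estimated_cost_usd(self) -> float:
        # Linear estimate only: hourly rate times hours, rounded to cents.
        return round(self.hourly_rate_usd * self.duration_minutes / 60, 2)

    def banner(self) -> str:
        return (
            f"Your ramalama remote is now available at {self.address}:{self.port}, "
            f"it will run for {self.duration_minutes} minutes "
            f"and should cost about ${self.estimated_cost_usd():.2f}"
        )


# 203.0.113.10 is a documentation-only example address.
session = RemoteSession("203.0.113.10", 8080, hourly_rate_usd=1.56, duration_minutes=60)
print(session.banner())
```

The timer itself (actually tearing the VM down after `duration_minutes`) is left out here, since that depends on the provider.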
We'd probably also need a nice way to make the API accessible only with a token in the request header, with that token available as an exported env var as well.
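A minimal sketch of the token idea, using only the Python standard library. The env var name `RAMALAMA_REMOTE_TOKEN` and both function names are made up for illustration and are not existing ramalama options. It assumes the token arrives as an `Authorization: Bearer <token>` header.

```python
import hmac
import os
import secrets


def generate_token() -> str:
    # 32 random bytes, URL-safe encoded, suitable for use in a header.
    return secrets.token_urlsafe(32)


def is_authorized(headers: dict[str, str], expected_token: str) -> bool:
    """Return True only if the request carries the expected bearer token."""
    auth = headers.get("Authorization", "")
    scheme, _, presented = auth.partition(" ")
    if scheme.lower() != "bearer" or not presented:
        return False
    # Constant-time comparison to avoid leaking token contents via timing.
    return hmac.compare_digest(presented.encode(), expected_token.encode())


# Reuse a token from the environment if present, otherwise create one
# and show the user how to export it (hypothetical variable name).
token = os.environ.get("RAMALAMA_REMOTE_TOKEN") or generate_token()
print(f"export RAMALAMA_REMOTE_TOKEN={token}")
```

The server side would call `is_authorized` on every request and reject anything that fails with a 401.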
Why would this be specific to GCP? Wouldn't this be useful for Amazon Cloud, Azure, IBM Cloud, OCI ...
Nothing specific to a cloud provider, it was just an example. Perhaps it would have been clearer if I had written it as:
ramalama serve-remote <cloud_provider> <model_path>
If we were to do this, I would go with some syntax like:
ramalama serve --cloud <cloud_provider> <model_path>
But I think supporting multiple cloud vendors would be a HUGE amount of work, and we would need to embed lots of knowledge about starting containers in the cloud:
- starting the remote VM
- somehow getting the MODEL pulled to the remote VM
- starting a podman.service in the remote VM.
- connecting to the remote podman.service to pull and start a container to run the model.
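One way to keep the per-vendor knowledge contained would be a small provider interface with one method per step in the list above. This is only a sketch under that assumption: `CloudProvider`, `DryRunProvider`, and `serve_remote` are hypothetical names, nothing here exists in ramalama, and the dry-run provider just records the steps instead of calling any cloud SDK.

```python
from typing import Protocol


class CloudProvider(Protocol):
    """Hypothetical per-vendor interface, one method per step."""

    def start_vm(self) -> str: ...
    def pull_model(self, host: str, model_path: str) -> None: ...
    def start_podman_service(self, host: str) -> None: ...
    def run_container(self, host: str, model_path: str) -> None: ...


class DryRunProvider:
    """Records each step instead of talking to a real cloud."""

    def __init__(self) -> None:
        self.log: list[str] = []

    def start_vm(self) -> str:
        self.log.append("start remote VM")
        return "203.0.113.10"  # documentation-only example address

    def pull_model(self, host: str, model_path: str) -> None:
        self.log.append(f"pull {model_path} to {host}")

    def start_podman_service(self, host: str) -> None:
        self.log.append(f"start podman.service on {host}")

    def run_container(self, host: str, model_path: str) -> None:
        self.log.append(f"run container serving {model_path} on {host}")


def serve_remote(provider: CloudProvider, model_path: str) -> str:
    """Run the four steps in order and return the address of the remote VM."""
    host = provider.start_vm()
    provider.pull_model(host, model_path)
    provider.start_podman_service(host)
    provider.run_container(host, model_path)
    return host


provider = DryRunProvider()
host = serve_remote(provider, "<model_path>")
print("\n".join(provider.log))
```

Each real vendor would then be a separate class implementing the same four methods, which could also be a natural way to split the work into issues.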
No movement on this one. As we add more services, I would like to get to the point where we can access remote servers via IP addresses.
I agree it would be a lot of work. But I think the ability to ad-hoc, with the ease of local ramalama serve, fire up a remote endpoint under user control is powerful stuff.
I also think your list of points above probably covers the main steps that need doing. I'd like to contribute to this project, so I wonder if we could start to form those initial steps into a plan of how the maintainers would like to see the work done? Then we could break out some issues that I (and maybe others) could work on.
A friendly reminder that this issue had no activity for 30 days.
Sorry, this one got lost in the flow. @bentito, we would love to see an example of how this would work.
A friendly reminder that this issue had no activity for 30 days.
Since we never got additional information, closing. If new information on the requirements comes in, we can reopen the issue.