Cloud serving option

Open bentito opened this issue 10 months ago • 5 comments

I'd love if this was on ramalama's roadmap:

ramalama serve-remote-gcp <model_path>

I'd imagine it would look for GCP credentials via an environment variable, then use the GCP SDK to create a templated VM primed for running the model (big GPU, etc.), probably with ramalama auto-installed on that VM.

I'd love if it came with a timer and cost estimator too :-) like "Your ramalama remote is now available at <temp_GCP_world_addressable_IP>:8080, it will run for one hour and should cost about $1.56"
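The timer and cost message could be derived from an hourly rate and a run duration. A minimal sketch in Python, assuming hypothetical helpers `estimate_cost` and `format_remote_banner` and a flat hourly price supplied by the caller (real cloud pricing varies by region, machine type, and billing granularity). The address is a placeholder from the documentation IP range, not a real endpoint:

```python
from decimal import Decimal, ROUND_HALF_UP


def estimate_cost(hourly_rate_usd: str, hours: str) -> Decimal:
    """Return the estimated cost in USD, rounded to whole cents."""
    cost = Decimal(hourly_rate_usd) * Decimal(hours)
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def format_remote_banner(address: str, port: int, hourly_rate_usd: str, hours: str) -> str:
    """Build the one-line message shown once the remote endpoint is up."""
    cost = estimate_cost(hourly_rate_usd, hours)
    return (
        f"Your ramalama remote is now available at {address}:{port}, "
        f"it will run for {hours} hour(s) and should cost about ${cost}"
    )


# Hypothetical values: a $1.56/hour machine running for one hour.
print(format_remote_banner("203.0.113.10", 8080, "1.56", "1"))
```

`Decimal` is used instead of `float` so that the cents shown to the user are not affected by binary rounding.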

We'd probably also need a nice way to make the API accessible only with a token in the request header, with that token supplied through an exported env var as well.
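One way the token check might look, as a rough sketch rather than anything ramalama implements. The env var name `RAMALAMA_REMOTE_TOKEN` and both helper functions are hypothetical, and the `Authorization: Bearer <token>` header format is an assumption about how the token would be sent:

```python
import hmac
import os
from typing import Mapping, Optional

# Hypothetical name, not an existing ramalama setting.
TOKEN_ENV_VAR = "RAMALAMA_REMOTE_TOKEN"


def expected_token(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Read the shared token from the environment; None if unset or empty."""
    return env.get(TOKEN_ENV_VAR) or None


def is_authorized(headers: Mapping[str, str], token: Optional[str]) -> bool:
    """Check an 'Authorization: Bearer <token>' header against the token."""
    if not token:
        return False  # refuse every request when no token is configured
    scheme, _, presented = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "bearer":
        return False
    # compare_digest avoids leaking how many leading characters matched
    return hmac.compare_digest(presented.strip().encode(), token.encode())
```

A server would call `is_authorized(request_headers, expected_token())` before handling each request. The header lookup here is case-sensitive, so a real server would rely on its framework's case-insensitive header mapping.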

bentito avatar Feb 26 '25 22:02 bentito

Why would this be specific to GCP? Wouldn't this be useful for Amazon Cloud, Azure, IBM Cloud, OCI ...

rhatdan avatar Feb 27 '25 14:02 rhatdan

Nothing specific to one cloud provider; GCP was just an example. It might have been clearer if I had written it as:

ramalama serve-remote <cloud_provider> <model_path>

bentito avatar Feb 27 '25 17:02 bentito

If we were to do this, I would go with some syntax like:

ramalama serve --cloud <cloud_provider> <model_path>

But I think supporting multiple cloud vendors would be a HUGE amount of work, and we would need to embed lots of knowledge about starting containers in the cloud:

  • Starting the remote VM.
  • Somehow getting the MODEL pulled to the remote VM.
  • Starting a podman.service in the remote VM.
  • Connecting to the remote podman.service to pull and start a container to run the model.
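
To make the proposed syntax and the steps above concrete, here is a dry-run sketch in Python. Everything in it is hypothetical: the `--cloud` option does not exist in ramalama, the provider names are placeholders, and `plan_remote_serve` only lists the steps in order without contacting any cloud:

```python
import argparse
from typing import List

# Hypothetical provider names; none of this exists in ramalama today.
PROVIDERS = ("gcp", "aws", "azure")


def build_parser() -> argparse.ArgumentParser:
    """Parse 'ramalama serve [--cloud <cloud_provider>] <model_path>'."""
    parser = argparse.ArgumentParser(prog="ramalama")
    sub = parser.add_subparsers(dest="command", required=True)
    serve = sub.add_parser("serve")
    serve.add_argument("--cloud", choices=PROVIDERS, default=None)
    serve.add_argument("model_path")
    return parser


def plan_remote_serve(provider: str, model_path: str) -> List[str]:
    """Return the four steps from the list above as an ordered dry-run plan."""
    return [
        f"start a remote VM on {provider}",
        f"pull {model_path} to the remote VM",
        "start podman.service in the remote VM",
        "connect to the remote podman.service, then pull and start the container",
    ]


# Placeholder model path, for illustration only.
args = build_parser().parse_args(["serve", "--cloud", "gcp", "/models/example.gguf"])
for number, step in enumerate(plan_remote_serve(args.cloud, args.model_path), start=1):
    print(f"{number}. {step}")
```

Keeping the plan as plain data would let each step be reviewed, and later implemented per provider, without changing the command-line surface.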

rhatdan avatar Feb 28 '25 14:02 rhatdan

No movement on this one. As we add more services, I would like to get to the point where we can access remote servers via IP addresses.

rhatdan avatar Apr 02 '25 11:04 rhatdan

I agree it would be a lot of work. But I think the ability to fire up an ad hoc remote endpoint under user control, with the ease of a local ramalama serve, is powerful stuff.

I also think the points in your list above are probably the main steps that need doing. I'd like to contribute to this project, so I wonder if we could start to shape those initial steps into a plan for how the maintainers would like to see the work done? Then we could break out some issues that I (and maybe others) could work on.

bentito avatar Apr 02 '25 13:04 bentito

A friendly reminder that this issue had no activity for 30 days.

github-actions[bot] avatar Jul 25 '25 00:07 github-actions[bot]

Sorry, this one got lost in the flow. @bentito, we would love to see an example of how this would work.

rhatdan avatar Jul 25 '25 10:07 rhatdan

A friendly reminder that this issue had no activity for 30 days.

github-actions[bot] avatar Aug 26 '25 00:08 github-actions[bot]

Closing, since we never got additional information. If new information on the requirements comes in, we can reopen the issue.

rhatdan avatar Aug 26 '25 12:08 rhatdan