FastChat
how to serve Gradio app on the cloud and run inference locally
Hi!
I'm building my own LLM, and I would like to serve it with FastChat. My idea is to deploy the Gradio app on AWS or GCP and run the LLM inference locally on my own cluster. Is this possible, and how could I set up something like this? Where would I need to run the controller?
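To make the setup concrete, here is a rough sketch of what I have in mind, using FastChat's standard entry points (the IPs, ports, and model path are placeholders):

```shell
# On the cloud VM (AWS/GCP): run the controller and the Gradio web server
python3 -m fastchat.serve.controller --host 0.0.0.0 --port 21001
python3 -m fastchat.serve.gradio_web_server --controller-url http://localhost:21001

# On my local cluster: run a model worker that registers with the remote controller
python3 -m fastchat.serve.model_worker \
    --model-path /path/to/my-model \
    --controller-address http://<cloud-vm-ip>:21001 \
    --worker-address http://<local-cluster-ip>:21002 \
    --host 0.0.0.0 --port 21002
```

My main concern is the worker address: the controller needs to reach the worker back over HTTP, and my cluster sits behind NAT. Is this layout right, or should the controller live on the cluster instead?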