Yi Xu

12 comments by Yi Xu

Here are my services:

```
$ kubectl get svc | grep modeldb
modeldb-staging-f8780e-backend               ClusterIP   100.71.14.248    8085/TCP,8086/TCP,3000/TCP   150m
modeldb-staging-f8780e-graphql               ClusterIP   100.67.143.195   3000/TCP                     150m
modeldb-staging-f8780e-postgresql            ClusterIP   100.68.84.251    5432/TCP                     150m
modeldb-staging-f8780e-postgresql-headless   ClusterIP   None...
```

Ok, changing `--name modeldb` for the release did the trick. I've port-forwarded the webapp to my `localhost:3000`, and I think the only remaining problem is that on the 'Repositories'...
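For reference, the port-forward step can be sketched like this; the service name comes from my `kubectl get svc` output above, and I'm assuming the webapp is the one served on the backend service's port 3000:

```shell
# Forward local port 3000 to port 3000 on the backend service.
# Adjust the service name to match your own Helm release name.
kubectl port-forward svc/modeldb-staging-f8780e-backend 3000:3000
```

After this, the webapp should be reachable at `http://localhost:3000` for as long as the command is running.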

Ok cool, but let me make sure I have everything working first 😅 In addition to removing the double dash, I had to move the `{{- if .Values.env }}` on...
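In case it helps anyone hitting the same templating issue, the shape I ended up with looks roughly like the sketch below; the container name and image value are illustrative, not from the actual chart. The point is that the `{{- if .Values.env }}` guard has to wrap the entire `env:` key so the key is omitted when no env values are set:

```yaml
# deployment.yaml (sketch, not the real chart)
containers:
  - name: webapp                      # illustrative container name
    image: "{{ .Values.image }}"      # illustrative value
    {{- if .Values.env }}
    env:
      {{- toYaml .Values.env | nindent 6 }}
    {{- end }}
```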

Hi @Stealthwriter , we currently don't have any plans on supporting runpod. LLM Engine works on k8s, so if runpod supports k8s, then it can _almost_ work - we'd need...

Hi @Stealthwriter , thanks for reaching out. Yes, you can basically route any API changes through to the underlying inference framework(s) we use, assuming they support the fields you need....

You can think of LLM Engine as adding 1) a set of higher-level abstractions (e.g. APIs are expressed in terms of Completions and Fine-tunes) and 2) autoscaling via k8s. TGI...

Hi @JimmyMa99, thanks for reaching out! You're welcome to submit a PR to add the InternLM model to LLM Engine. We're in the process of creating documentation for this, but...

We're currently wrapping up some testing for a self-contained `helm install` on your own EKS cluster. Once that's ready, we'll ship the docs too.
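Once that lands, the install flow should look roughly like the following; the repo URL, chart name, and release name here are placeholders until the chart and docs actually ship:

```shell
# Sketch only: repo URL and chart/release names are placeholders.
helm repo add llm-engine <CHART_REPO_URL>
helm repo update
helm install llm-engine llm-engine/llm-engine \
  --namespace llm-engine --create-namespace \
  -f values.yaml   # your EKS-specific overrides
```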

Btw just want to clarify and acknowledge that https://github.com/scaleapi/llm-engine/pull/153 solves part but not all of the ask - it shows you how to deploy a self-hosted model for an existing...

Hi @urimerhav, thanks for reaching out. Here are some answers to your questions:

> Originally Meta released it with a bug that caused max length to be 2048 while the...