
Query server for model availability of OpenAI-compatible servers

surak opened this issue 1 year ago • 7 comments

Validations

  • [X] I believe this is a way to improve. I'll try to join the Continue Discord for questions
  • [X] I'm not able to find an open issue that requests the same enhancement

Problem

OpenAI-compatible inference servers expose a /v1/models endpoint that lists their models and some of their capabilities, e.g.:

    curl --header "Authorization: Bearer MY_OPENAI_KEY" https://api.openai.com/v1/models

This returns a JSON list of the models and some of their capabilities. Compatible servers implement the same.
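
For reference, the response has roughly this shape (abbreviated here; servers add their own extra fields):

    {
      "object": "list",
      "data": [
        {
          "id": "gpt-4",
          "object": "model",
          "created": 1687882411,
          "owned_by": "openai"
        }
      ]
    }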

This would make the configuration of such servers much easier: instead of writing config.py manually, one could just query the server for its models...
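
As a rough illustration of what the extension could do with that endpoint, here is a minimal TypeScript sketch, assuming a standard /v1/models response; the ModelEntry shape is made up for illustration and is not Continue's actual config schema:

    // Minimal sketch: query an OpenAI-compatible server for its models and
    // turn the result into config entries. ModelEntry is a made-up shape,
    // not Continue's real config schema.
    interface ModelEntry {
      title: string;
      provider: "openai";
      model: string;
      apiBase: string;
    }

    async function discoverModels(
      apiBase: string, // e.g. "https://api.openai.com/v1"
      apiKey?: string
    ): Promise<ModelEntry[]> {
      const res = await fetch(`${apiBase}/models`, {
        headers: apiKey ? { Authorization: `Bearer ${apiKey}` } : {},
      });
      if (!res.ok) {
        throw new Error(`GET ${apiBase}/models failed with status ${res.status}`);
      }
      // OpenAI-compatible servers return { "object": "list", "data": [...] }
      const body = (await res.json()) as { data: { id: string }[] };
      return body.data.map((m) => ({
        title: m.id,
        provider: "openai",
        model: m.id,
        apiBase,
      }));
    }

Each returned entry could then be offered in the model dropdown instead of being typed into the config file by hand.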

Solution

No response

surak commented Nov 07 '23

To be sure, the use case here is to populate the dropdown if you're using the OpenAI class with a personal API token?

sestinj commented Nov 07 '23

That, or any OpenAI-compatible model server, where models can have completely different names.

But yes, the autopopulate part is correct.

surak commented Nov 07 '23

I tried using Continue with a small team, and it was horrendous to set up, especially since we had to switch models a few times. A way to automatically set things up would be great. E.g., when using TGI, there's a /info endpoint that gives the model name, context window, and more (see the example below). This could be used to automatically configure the prompt template, special tokens, and so on, with the user only selecting TGI and entering the URL.
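
For illustration, a TGI /info call looks something like this (the server URL is a placeholder, and the exact fields vary by TGI version; the names below are from TGI's OpenAPI schema):

    curl https://my-tgi-server:8080/info

    {
      "model_id": "mistralai/Mistral-7B-Instruct-v0.1",
      "max_input_length": 4095,
      "max_total_tokens": 4096,
      "version": "1.1.0"
    }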

If I can help let me know

simon376 commented Nov 09 '23

@simon376 Thanks for the honesty here. Model setup is something we're planning to spend most of next week on, so if you don't mind, I'd actually love to hear about the entirety of your experience. Is GitHub issues the best way to communicate, or would Discord or even a quick call be better? Whatever is most comfortable. I'll have many questions and can hopefully share some early ideas with you.

Our goal is, like you said, to get this to a point where you wouldn't need to worry about the config file at all.

sestinj commented Nov 09 '23

I will message you on Discord when I find the time. My choice of words was a bit harsh, tbh 😅 maybe I was a bit ill-prepared.

simon376 commented Nov 10 '23

No worries! "horrendous" was properly descriptive in my opinion : )

Many of the noted improvements are now available in the pre-release. We have yet to autodetect available models, but I still think this is a good idea.

sestinj commented Nov 16 '23

I am using the pre-release version v0.9.80 now.

At some point I actually got it running! Wow!

Now I can't get it to work again. I am writing a guide for my users and tried doing it step by step, but couldn't get it to probe the server.

Some things I noticed when trying it:

  • The "autodetect" option defaults to localhost:8000 and opens config.json. There could be a step where you actually add the URL of the inference server (this is currently hidden in the advanced part).
  • There's no parameter to add an API key. I would argue that if the server probed beforehand returns an error complaining about an API key, a further field should open; ONLY THEN should the user be presented with the list of models from said server (see the sketch below).
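
To make the second point concrete, here is a hedged TypeScript sketch of that probe-first flow; promptForApiKey and showModelPicker are hypothetical UI helpers, not Continue's actual API:

    // Sketch of the proposed flow: probe the server anonymously, ask for an
    // API key only if the server rejects the probe, and only then show the
    // model list. The two declared functions are hypothetical UI helpers.
    declare function promptForApiKey(): Promise<string>;
    declare function showModelPicker(modelIds: string[]): Promise<string>;

    async function autodetectModels(apiBase: string): Promise<string> {
      let headers: Record<string, string> = {};
      let res = await fetch(`${apiBase}/models`, { headers });
      if (res.status === 401 || res.status === 403) {
        // The anonymous probe was rejected: reveal the API key field and retry
        headers = { Authorization: `Bearer ${await promptForApiKey()}` };
        res = await fetch(`${apiBase}/models`, { headers });
      }
      if (!res.ok) throw new Error(`Probe failed with status ${res.status}`);
      const body = (await res.json()) as { data: { id: string }[] };
      return showModelPicker(body.data.map((m) => m.id));
    }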

If I add a model with the name AUTODETECT and no title, like this:

    {
      "model": "AUTODETECT",
      "apiKey": "HERE_GOES_MY_TOKEN",
      "apiBase": "https://helmholtz-blablador.fz-juelich.de:8000/v1",
      "completionOptions": {},
      "provider": "openai"
    }

I end up with a model named "custom" in the model list, which points to... I don't know what. Its responses don't match my models.

surak commented Mar 08 '24