
idea: get the current loaded model(s) through API endpoint

Open GPTLocalhost opened this issue 1 year ago • 6 comments

Problem Statement

Is it possible to get the current loaded model(s) in Jan through API endpoint? The current "http://localhost:1337/v1/models" API shows all "available" models.

Feature Idea

Provide a new API endpoint if necessary.
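For illustration, here is one shape such an endpoint (or an extended /v1/models response) could take. The `state` field and its values below are purely hypothetical, not part of the current Jan API; this is only a sketch of the client-side filtering that a loaded-models endpoint would enable:

```python
# Hypothetical sketch: the "state" field and its values are assumptions,
# not part of the current Jan /v1/models response schema.
sample_response = {
    "object": "list",
    "data": [
        {"id": "gpt-oss-20b", "object": "model", "state": "loaded"},
        {"id": "mistral-7b-instruct", "object": "model", "state": "downloaded"},
    ],
}

def loaded_models(response: dict) -> list[str]:
    """Return the ids of models marked as currently loaded."""
    return [m["id"] for m in response["data"] if m.get("state") == "loaded"]

print(loaded_models(sample_response))  # -> ['gpt-oss-20b']
```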

GPTLocalhost avatar Nov 01 '24 02:11 GPTLocalhost

Jan is beautifully built. The following is copied from OpenAI docs for your quick reference. In my experience integrating other LLM Servers/Gateways, the API returns "currently" available models. As soon as this functionality is enhanced in Jan, I'd be happy to add a demo for Jan on GPTLocalhost so that users have another reason to give Jan a try. Thank you for your consideration.

Lists the currently available models, and provides basic information about each one such as the owner and availability.

GPTLocalhost avatar Nov 18 '24 01:11 GPTLocalhost

UPDATE:

Jan should expose the current loaded models endpoint (with better API path) /inferences/server/models

louis-jan avatar Apr 12 '25 03:04 louis-jan

/inferences/server/models

Thanks for the update. May I confirm whether it has been implemented? I tried the latest version (v0.5.16) but found the above API does not exist.

GPTLocalhost avatar Apr 12 '25 09:04 GPTLocalhost

Hi @GPTLocalhost I will take a look and sneak the update into the upcoming release. For now it's supported but not exposed for public access (API key restricted by cortex.cpp).
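A minimal sketch of calling the restricted endpoint with an API key, using only the Python standard library. Note the `Bearer` scheme is an assumption here; adjust to whatever cortex.cpp actually expects:

```python
import json
import urllib.request

def loaded_models_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build a request for the (currently restricted) loaded-models endpoint.

    The "Bearer" auth scheme is an assumption, not confirmed by the thread.
    """
    return urllib.request.Request(
        f"{base_url}/inferences/server/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )

# To actually send it (requires a running Jan server and a valid key):
# with urllib.request.urlopen(loaded_models_request("http://localhost:1337", key)) as r:
#     print(json.loads(r.read()))
```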

louis-jan avatar Apr 12 '25 11:04 louis-jan

It should be supported via this endpoint. Unfortunately, there's a bug introduced when we filter out undownloaded models from the hub. cc @samhvw8

curl http://localhost:1337/inferences/server/models -H "Authorization:...

louis-jan avatar Jun 22 '25 09:06 louis-jan

Thank you for the update. Unfortunately, our attempt yesterday was unsuccessful. Once the bug is resolved, we intend to produce a demo showcasing how to use Jan with Microsoft Word as planned.

Additionally, we're curious about any future plans to integrate Apple Intelligence within Jan. We already have a similar demo as below and would appreciate it if Jan could access Apple's Foundation Language Models.

https://youtu.be/BBr2gPr-hwA

GPTLocalhost avatar Jun 22 '25 11:06 GPTLocalhost

It should be supported via this endpoint. Unfortunately, there's a bug introduced when we filter out undownloaded models from the hub. cc @samhvw8

curl http://localhost:1337/inferences/server/models -H "Authorization:...

@samhvw8 can you verify if this endpoint works? If so, pls close this issue. If not, please turn this into a bug and add it to your backlog 🙏

freelerobot avatar Jul 02 '25 06:07 freelerobot

@freelerobot yes, it's a bug right now; let me fix it soon

samhvw8 avatar Jul 03 '25 00:07 samhvw8

Hi @GPTLocalhost, can I check with you whether this issue is still present?

LazyYuuki avatar Aug 04 '25 07:08 LazyYuuki

"http://localhost:1337/v1/models" works now, with API key. Thank you very much. However, I tested "/completions" using gpt-oss-20b and it shows the following. Is there any way to filter out the tags and reasoning traces?

<|channel|>analysis<|message|>We need to rewrite a simple test message. It's a very short text: "This is a test message." We need to rewrite it to be more simple, clear, better flow. But it's already simple. Maybe rewrite as "This is a test." Or "Here is a test message." But the instruction: rewrite the following text to make it more simple, clear, with better flow. So we can produce a simple, clear sentence. Let's produce: "This is a test message." Already simple. Maybe "This is a test." The content might remain same meaning. Provide alternative? Provide rewrite. We'll deliver.<|start|>assistant<|channel|>final<|message|>This is a test.

GPTLocalhost avatar Aug 09 '25 13:08 GPTLocalhost

hey @GPTLocalhost if you go through the API layer, it is up to you to parse that information, because llama.cpp upstream doesn't properly implement the reasoning_content portion yet.
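Until then, one client-side workaround is to extract only the `final` channel from the marker format shown in the pasted output above. This is a sketch written against that exact marker syntax and may break if the upstream output format changes:

```python
import re

def extract_final_message(raw: str) -> str:
    """Pull the text of the 'final' channel out of gpt-oss style output.

    Falls back to the whole string when no channel markers are present.
    """
    match = re.search(
        r"<\|channel\|>final<\|message\|>(.*?)(?=<\|\w+\|>|$)",
        raw,
        flags=re.DOTALL,
    )
    return match.group(1).strip() if match else raw.strip()
```

Run against the pasted completion above, this returns just "This is a test." with the analysis channel and markers stripped.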

But thanks for the update, I will close this bug now

LazyYuuki avatar Aug 09 '25 13:08 LazyYuuki

I see. Thank you for the info.

GPTLocalhost avatar Aug 09 '25 13:08 GPTLocalhost