
bug: Ollama API incorrectly handled by GenAIScript (causes 404s)

sammcj opened this issue 1 year ago · 3 comments

When following the introduction docs, using GenAIScript with any Ollama model fails.

This seems to stem from two issues:

  1. GenAIScript makes invalid calls to the Ollama host: it is not using the correct Ollama API path prefix (/api).

The standard way of setting the Ollama host is through the OLLAMA_HOST environment variable, for example:

OLLAMA_HOST=http://localhost:11434

When an application makes a call to the Ollama API, you would expect to see requests to $OLLAMA_HOST/api/<API Method>, for example:

[GIN] 2024/11/02 - 22:29:56 | 200 |  1.194048589s |   192.168.0.213 | POST     "/api/pull"

However, when GenAIScript makes calls to Ollama, it appears to hit the base URL without the /api path:

[GIN] 2024/11/02 - 22:29:56 | 404 |         4.8µs |   192.168.0.213 | POST     "/chat/completions"

This should be /api/generate, for example:

curl http://localhost:11434/api/generate -d '{
  "model": "qwen2.5:7b-instruct-q8_0",
  "prompt": "Why is the sky blue?"
}'
  • Ollama API docs https://github.com/ollama/ollama/blob/main/docs/api.md
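
For completeness, the failing POST in the log above was a chat-style request, so on the native API it would presumably map to /api/chat rather than /chat/completions. A sketch, assuming a local Ollama on the default port and the same model as above:

```shell
# Native Ollama chat endpoint (note the /api prefix).
# Assumes Ollama is listening locally on the default port 11434.
curl http://localhost:11434/api/chat -d '{
  "model": "qwen2.5:7b-instruct-q8_0",
  "messages": [{"role": "user", "content": "Why is the sky blue?"}],
  "stream": false
}'
```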

  2. GenAIScript does not actually use the native Ollama API; it uses the OpenAI-compatible endpoint instead.

Looking at the traces generated from the failed requests I see:

-   model: ollama:qwen2.5:7b-instruct-q8_0
-   source: env: OLLAMA_API_...
-   provider: ollama
-   temperature: 0.8
-   base: https://ollama.my.internal.domain
-   type: openai

This suggests GenAIScript is not using the Ollama API, but instead the OpenAI compatible API.

The Ollama API lives at http(s)://ollama-hostname:port/api. It is the recommended endpoint because it's the native API and supports all functionality.

Ollama also provides an OpenAI-compatible API, with only basic functionality, at http(s)://ollama-hostname:port/v1 for applications that only support OpenAI.

This endpoint is only recommended as a last resort for applications without native Ollama support, and it does not provide all features; see https://github.com/ollama/ollama/blob/main/docs/openai.md


While using the OpenAI-compatible endpoint is not ideal, it should work for basic generation tasks; however, the correct API path of /v1 must be used, e.g.

curl https://ollama.my.internal.domain/v1/chat/completions -H "Content-Type: application/json" -d '{
  "messages": [{"role": "user", "content": "Why is the sky blue?"}],
  "model": "qwen2.5:7b-instruct-q8_0"
}'

Environment for reference:

OLLAMA_HOST=https://ollama.my.internal.domain
GENAISCRIPT_DEFAULT_MODEL="ollama:qwen2.5:7b-instruct-q8_0"
GENAISCRIPT_DEFAULT_SMALL_MODEL="ollama:qwen2.5:7b-instruct-q8_0"

sammcj avatar Nov 02 '24 22:11 sammcj

Hey there, OLLAMA_HOST does not work for me either, but I used OLLAMA_API_BASE instead and it worked. Also, note that GenAIScript seems to be using the OpenAI-style API that Ollama serves on the /v1/... routes, not /api/... So for your config, try something along the lines of:

OLLAMA_API_BASE=https://<domain or ip>:<port>/v1
GENAISCRIPT_DEFAULT_MODEL="ollama:qwen2.5:7b-instruct-q8_0"
GENAISCRIPT_DEFAULT_SMALL_MODEL="ollama:qwen2.5:7b-instruct-q8_0"

For me it is OLLAMA_API_BASE=http://192.168.1.42:42069/v1, hope that helped
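
A quick sanity check for such a base URL (host and port here are the hypothetical ones from the example above) is to hit the /v1 chat route directly and look for a 200 rather than the 404 in the original report:

```shell
# Prints the HTTP status code only; a correct /v1 base should return 200.
curl -s -o /dev/null -w "%{http_code}\n" \
  http://192.168.1.42:42069/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen2.5:7b-instruct-q8_0", "messages": [{"role": "user", "content": "ping"}]}'
```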

DimitriGilbert avatar Nov 03 '24 12:11 DimitriGilbert

Thanks for the details. We'll look into this.

pelikhan avatar Nov 03 '24 21:11 pelikhan

part 1 fixed in 1.71.0 @sammcj @DimitriGilbert

pelikhan avatar Nov 04 '24 15:11 pelikhan

@sammcj do you know models that don't work with the OpenAI company layer?

pelikhan avatar Nov 07 '24 16:11 pelikhan

OpenAI company layer

Do you mean OpenAI compatible API?

If so, all models work with it, but the native API is better as it supports all features, like hot model loading, setting the context size, etc.
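
For example, the native /api/chat route accepts per-request options such as num_ctx (context window size) and a keep_alive duration controlling how long the model stays loaded; as far as I can tell, neither is exposed on the /v1 routes. A sketch assuming a local Ollama on the default port:

```shell
# Native-API-only knobs: context window size and model keep-alive duration.
curl http://localhost:11434/api/chat -d '{
  "model": "qwen2.5:7b-instruct-q8_0",
  "messages": [{"role": "user", "content": "Why is the sky blue?"}],
  "options": { "num_ctx": 8192 },
  "keep_alive": "10m",
  "stream": false
}'
```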

sammcj avatar Nov 07 '24 22:11 sammcj

I see. Thanks for the clarification.

pelikhan avatar Nov 08 '24 04:11 pelikhan

Native Ollama support is tracked in #825

pelikhan avatar Dec 06 '24 21:12 pelikhan

@sammcj I'm closing this as it seems that Ollama OpenAI support is pretty robust. Is there a particular feature you are missing?

pelikhan avatar Dec 06 '24 21:12 pelikhan