
JSON mode fails with "Expected a Completed Response"

Open meni3a opened this issue 1 month ago • 7 comments

Code:

import ollama from 'ollama'

const response = await ollama.chat({
  model: 'llama3:8b',
  format: 'json',
  messages: [{ role: 'user', content: 'What color is the sky at different times of the day? Respond using a JSON' }],
})
console.log(response.message.content)

Error:

     throw new Error("Expected a completed response.");
              ^

Error: Expected a completed response.
    at Ollama.processStreamableRequest 

When I remove the format: 'json' option, everything works as expected.

meni3a avatar May 05 '24 13:05 meni3a

Sometimes small LLMs struggle to produce proper JSON, especially when you just ask "give me a JSON" without further instructions. Try providing some examples in the request:

What color is the sky at different times of the day? Respond using a JSON.

Response example: 
{
   "morning": color_a,
   "day": color_b,
   "evening": color_c
}

You can read about some prompting techniques here: https://www.promptingguide.ai/techniques/fewshot (just a random link from the internet).
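
For instance, here's a minimal sketch of that few-shot idea with ollama-js (the model name and prompt text are just placeholders, not anything from the thread):

import ollama from 'ollama'

// Embed a response example directly in the prompt so the model has a
// concrete shape to imitate, then request JSON output as before.
const prompt = `What color is the sky at different times of the day? Respond using a JSON.

Response example:
{
  "morning": "color_a",
  "day": "color_b",
  "evening": "color_c"
}`

const response = await ollama.chat({
  model: 'llama3:8b',
  format: 'json',
  messages: [{ role: 'user', content: prompt }],
})
console.log(response.message.content)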

For single-request tasks, try using ollama.generate: set the system property to instruct the model to behave in a more restricted way, and set the temperature to 0.
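
For example, a minimal sketch of that approach (the system prompt and model name here are placeholders):

import ollama from 'ollama'

// Restrict the model with a system prompt and set temperature to 0
// so the output is as deterministic as possible.
const response = await ollama.generate({
  model: 'llama3:8b',
  system: 'You are a JSON generator. Reply with a single valid JSON object and nothing else.',
  prompt: 'What color is the sky at different times of the day?',
  options: { temperature: 0 },
})
console.log(response.response)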

Pixelycia avatar May 06 '24 19:05 Pixelycia

@Pixelycia Thanks for responding. In this case, the issue doesn't seem to be with the model itself. Using the Ollama API directly, the model works as expected. For example:

curl http://localhost:11434/api/generate -d '{
  "model": "llama3-gradient",
  "prompt": "What color is the sky at different times of the day? Respond using JSON",
  "format": "json",
  "stream": false
}'

Result:

{
    "model": "llama3-gradient",
    "created_at": "2024-05-07T10:36:49.427165Z",
    "response": "{ \"morning\": \"Light blue\",\n\"daytime\": \"Bright blue\",\n\"afternoon\": \"Golden yellow\",\n\"evening\": \"Orange\",\n\"night\": \"Dark purple\"} ",
    "done": false
}

Only when I use ollama-js do I get the error I mentioned above: "Expected a completed response".

meni3a avatar May 07 '24 10:05 meni3a

Yes, the problem is in how an LLM "learns" things. At the training stage there are no rules telling the model "this JSON is valid" but "this one is not"; it learns unsupervised, just because it saw some examples. The more often a model sees examples like the previous one during training, the better it learns the pattern, but that does not mean it learns it perfectly well, especially for a model with 8 billion parameters compared to ChatGPT's 175 billion: very roughly speaking, it can hold ~21x less information in its weights.

If you do not set the temperature (i.e. randomness) to a lower value and do not write a request that restricts the output, then, because of how probability works, you can get a different answer each time you run the model. It can look like it works perfectly fine in some cases and not in others; in other words, when you used ollama-js you got "unlucky" and received an invalid JSON that could not be parsed (say, one with an extra curly bracket; nobody taught the model that this is incorrect, it learned by itself as best it could).

So help the model give the right answer: the easiest solution is to provide extra examples in your request. Alternatively, you can try specialised models that were trained to solve specific tasks like this; they can generate a more reliable answer without extra tweaking.

It's a hard topic, and it's worth knowing (at least the basics of) how LLMs work: what they can do, what they can't, and how to turn that "can't" into what you want. But don't expect miracles; they are not perfect yet.

Pixelycia avatar May 07 '24 12:05 Pixelycia

Just double-checked with a few-shot prompt like this:

What color is the sky at different times of the day? Respond using JSON format.
Response example: 
{
  date_time: color_a,
}

I got a valid JSON response with { "format": "json" }:

{
    "sunrise": "pinkish-orange",
    "morning": "blue",
    "afternoon": "slightly hazy blue",
    "evening": "reddish-purple",
    "night": "black"
}

You probably got "Expected a completed response" because the model generated (or, conversely, failed to generate) some "special tokens" inside the response, which prevents ollama from handling the request properly. As you can see, we can solve this issue with prompt engineering.

What else you can do: use the default format and parse the JSON manually after you get the response.
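
A rough sketch of that manual-parsing route (the model name and prompt are placeholders, and the regex is just a simple heuristic):

import ollama from 'ollama'

// Ask for JSON in the prompt but skip format: 'json', then parse the
// reply ourselves so a malformed response fails loudly in our own
// code instead of inside ollama-js.
const response = await ollama.chat({
  model: 'llama3:8b',
  messages: [{ role: 'user', content: 'What color is the sky at different times of the day? Respond using JSON.' }],
})

try {
  // Models sometimes wrap JSON in extra text, so grab the first {...} span.
  const match = response.message.content.match(/\{[\s\S]*\}/)
  const data = JSON.parse(match ? match[0] : response.message.content)
  console.log(data)
} catch (err) {
  console.error('Model did not return parseable JSON:', response.message.content)
}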

Pixelycia avatar May 07 '24 12:05 Pixelycia

Another suggestion: try to avoid mentioning JSON in the prompt, because LLMs are usually good at Python. Try asking: What color is the sky at different times of the day? Respond using python dictionary format. It usually performs MUCH better, and the result often works fine with a JSON parse function.

Pixelycia avatar May 07 '24 13:05 Pixelycia

@Pixelycia It's indeed possible to solve this with a little bit of code engineering, but I'm trying to take advantage of Ollama's format: 'json' option. Every time I put format: 'json' in my ollama-js project I get an error, and when I use the model directly through the CLI or the API, I get the response successfully. It's the exact same model instance (running on my local machine) with the exact same prompt. I tried it many times and got the same results. I don't think it is related to the LLM being inconsistent; it looks like an issue with ollama-js.

meni3a avatar May 08 '24 07:05 meni3a

I cloned ollama-js and deleted line 92 in the browser.ts file:

throw new Error('Expected a completed response.')

And now it works perfectly.

It seems like there is an issue with this condition when using format: 'json'.

@samwillis I would appreciate it if you could look at this part of the code: src/browser.ts, in the processStreamableRequest method, line 92:

 if (!message.value.done && (message.value as any).status !== 'success') {
     throw new Error('Expected a completed response.')
 }

I see that message.value.done always equals false when the format is JSON.
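
Until that's fixed, a possible workaround sketch (model name and prompt are placeholders): request a stream and assemble it yourself, since the streaming path hands back an async iterator and never runs the non-streaming "completed response" check.

import ollama from 'ollama'

// With stream: true, ollama-js yields chunks directly, so the final
// chunk reporting done: false no longer triggers the throw.
const stream = await ollama.chat({
  model: 'llama3:8b',
  format: 'json',
  stream: true,
  messages: [{ role: 'user', content: 'What color is the sky at different times of the day? Respond using JSON' }],
})

let content = ''
for await (const part of stream) {
  content += part.message.content
}
console.log(content)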

meni3a avatar May 08 '24 10:05 meni3a