[Bug]: function calling not working for ollama/gemma:7b
What happened?
A bug happened!
It returns this result:
{
"id": "chatcmpl-10202250-6285-41fd-896e-ff11d7982d69",
"choices": [
{
"finish_reason": "stop",
"index": 0,
"message": {
"content": null,
"role": "assistant",
"tool_calls": [
{
"id": "call_a789c755-cbfc-4abf-b9e7-434c64608211",
"function": {
"arguments": "{\n \"name\": \"get_current_weather\",\n \"description\": \"Get the current weather in a given location\",\n \"parameters\": {\n \"type\": \"object\",\n \"properties\": {\n \"location\": {\n \"type\": \"string\",\n \"description\": \"The city and state, e.g. San Francisco, CA\"\n },\n \"unit\": {\n \"type\": \"string\",\n \"enum\": [\"celsius\", \"fahrenheit\"]\n }\n },\n \"required\": [\"location\"]\n }\n}",
"name": ""
},
"type": "function"
}
]
}
}
],
"created": 1708994428,
"model": "ollama/gemma:7b",
"object": "chat.completion",
"system_fingerprint": null,
"usage": {
"prompt_tokens": 119,
"completion_tokens": 132,
"total_tokens": 251
}
}
Relevant log output
First we start the CLI:
$ litellm --model ollama/gemma:7b --api_base http://localhost:11434
Then we call it via curl:
$ curl --location 'http://192.168.0.27:8000/v1/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
"model": "ollama/gemma",
"messages": [
{"role": "user", "content": "What'\''s the weather like in San Francisco"}
],
"functions": [
{
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"]
}
},
"required": ["location"]
}
}
]
}'
{"id":"chatcmpl-1ad50ff9-1817-4271-9bd0-43cbb7dc839f","choices":[{"finish_reason":"stop","index":0,"message":{"content":null,"role":"assistant","tool_calls":[{"id":"call_097a25c8-6c60-49ad-a5be-8549e6f00eb0","function":{"arguments":"{\n \"name\": \"get_current_weather\",\n \"description\": \"Get the current weather in a given location\",\n \"parameters\": {\n \"type\": \"object\",\n \"properties\": {\n \"location\": {\n \"type\": \"string\",\n \"description\": \"The city and state, e.g. San Francisco, CA\"\n },\n \"unit\": {\n \"type\": \"string\",\n \"enum\": [\"celsius\", \"fahrenheit\"]\n }\n },\n \"required\": [\"location\"]\n }\n}","name":""},"type":"function"}]}}],"created":1708995801,"model":"ollama/gemma:7b","object":"chat.completion","system_fingerprint":null,"usage":{"prompt_tokens":123,"completion_tokens":132,"total_tokens":255}}
What's the error? @code959437957
this looks like it worked:
{"id":"chatcmpl-1ad50ff9-1817-4271-9bd0-43cbb7dc839f","choices":[{"finish_reason":"stop","index":0,"message":{"content":null,"role":"assistant","tool_calls":[{"id":"call_097a25c8-6c60-49ad-a5be-8549e6f00eb0","function":{"arguments":"{\n "name": "get_current_weather",\n "description": "Get the current weather in a given location",\n "parameters": {\n "type": "object",\n "properties": {\n "location": {\n "type": "string",\n "description": "The city and state, e.g. San Francisco, CA"\n },\n "unit": {\n "type": "string",\n "enum": ["celsius", "fahrenheit"]\n }\n },\n "required": ["location"]\n }\n}","name":""},"type":"function"}]}}],"created":1708995801,"model":"ollama/gemma:7b","object":"chat.completion","system_fingerprint":null,"usage":{"prompt_tokens":123,"completion_tokens":132,"total_tokens":255}}
I don't know whether this is a problem with litellm or with the Gemma model.
The response should look like this:
{
"id": "chatcmpl-123",
"...": "...",
"choices": [{
"index": 0,
"message": {
"role": "assistant",
"content": null,
"function_call": {
"name": "get_current_weather",
"arguments": "{ \"location\": \"Boston, MA\"}"
}
},
"finish_reason": "function_call"
}]
}
Do you see it? ChatGPT responds with the correct function call, where arguments contains just the location: Boston, MA.
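To make the difference concrete, here is a minimal sketch of how a client consumes the expected shape shown above (pure illustration; the dict below simply mirrors the expected response):

import json

# Expected OpenAI-style shape: the function name is populated and arguments carries only the call's arguments.
expected = {
    "choices": [{
        "message": {
            "role": "assistant",
            "content": None,
            "function_call": {
                "name": "get_current_weather",
                "arguments": "{ \"location\": \"Boston, MA\"}",
            },
        },
        "finish_reason": "function_call",
    }]
}

call = expected["choices"][0]["message"]["function_call"]
args = json.loads(call["arguments"])   # {'location': 'Boston, MA'} -- just the arguments, nothing else
print(call["name"], args)              # 'get_current_weather' {'location': 'Boston, MA'}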
It may be a missing feature of the Google Gemma model; see the original thread: https://huggingface.co/google/gemma-7b/discussions/38
Actually, I see this with any function-capable model on the latest LiteLLM version @krrishdholakia. I tried several (though admittedly not Gemma 😉).
We always get the wrongly nested result described above.
Please also see https://github.com/ShishirPatil/gorilla/issues/247#issuecomment-2007773594
Looking at this on openai's site - https://platform.openai.com/docs/api-reference/chat/create
{
"id": "chatcmpl-abc123",
"object": "chat.completion",
"created": 1699896916,
"model": "gpt-3.5-turbo-0125",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": null,
"tool_calls": [
{
"id": "call_abc123",
"type": "function",
"function": {
"name": "get_current_weather",
"arguments": "{\n\"location\": \"Boston, MA\"\n}"
}
}
]
},
"logprobs": null,
"finish_reason": "tool_calls"
}
],
"usage": {
"prompt_tokens": 82,
"completion_tokens": 17,
"total_tokens": 99
}
}
I believe this format is being followed, unless I'm missing something.
If I'm wrong, can someone share a formatted litellm response vs. the expected one?
cc: @ChristianWeyer @code959437957
Not quite, @krrishdholakia. This is what I see via LiteLLM:
"tool_calls" : [
{
"function" : {
"arguments" : "{\n \"name\": \"get_current_weather\", \n \"arguments\": {\"location\": \"Boston, MA\"}\n}\n",
"name" : ""
},
"id" : "call_7e88f79b-b4d7-4f42-8c0d-363414ff6e08",
"type" : "function"
}
]
name is empty.
And the actual response is nested inside arguments.
Subtle ☺️.
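Until this is fixed, a client-side workaround is to detect the nesting and unwrap it. A minimal sketch, assuming the arguments string always carries the real name plus an arguments (or parameters) object, as in the responses above:

import json

def unwrap_tool_call(tool_call):
    # Workaround sketch: the proxy currently returns an empty name and nests the real
    # call inside the arguments string, so pull name/arguments back out of it.
    raw = json.loads(tool_call["function"]["arguments"])
    name = tool_call["function"]["name"] or raw.get("name", "")
    args = raw.get("arguments") or raw.get("parameters") or {}
    return name, args

tool_call = {
    "function": {
        "arguments": "{\n \"name\": \"get_current_weather\", \n \"arguments\": {\"location\": \"Boston, MA\"}\n}\n",
        "name": "",
    },
    "id": "call_7e88f79b-b4d7-4f42-8c0d-363414ff6e08",
    "type": "function",
}
print(unwrap_tool_call(tool_call))  # ('get_current_weather', {'location': 'Boston, MA'})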
Maybe the issue is here @krrishdholakia:
https://github.com/BerriAI/litellm/blob/4913ad41db9f10261790917ccdf73dbb535c0366/litellm/llms/ollama.py#L221
https://github.com/BerriAI/litellm/blob/4913ad41db9f10261790917ccdf73dbb535c0366/litellm/llms/ollama.py#L318
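For illustration, the parsing at those two spots would roughly need to split the model's JSON into name and arguments rather than passing the whole object through. This is a sketch only, not the actual litellm code, and it assumes the model emits a JSON object with name and arguments keys:

import json
import uuid

def to_openai_tool_call(response_text):
    # Sketch: map the raw JSON the model emits onto the OpenAI tool_call shape,
    # with the function name and its arguments split into their own fields.
    parsed = json.loads(response_text)
    return {
        "id": f"call_{uuid.uuid4()}",
        "type": "function",
        "function": {
            "name": parsed.get("name", ""),
            "arguments": json.dumps(parsed.get("arguments", {})),
        },
    }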
BTW: I also think that it should be
"finish_reason" : "tool_calls"
With LiteLLM it is "stop".
I think this was resolved in v1.35.34+ by PR https://github.com/BerriAI/litellm/pull/1526 as discussed in related issue https://github.com/BerriAI/litellm/issues/3333 . Requires using the ollama_chat/ prefix in place of ollama/. Streaming responses remain broken.
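For example (a sketch; it assumes a local Ollama at the default port with gemma:7b pulled):

# Sketch: with v1.35.34+, use the ollama_chat/ prefix instead of ollama/.
import litellm

response = litellm.completion(
    model="ollama_chat/gemma:7b",
    api_base="http://localhost:11434",
    messages=[{"role": "user", "content": "What's the weather like in San Francisco"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["location"],
            },
        },
    }],
)
print(response.choices[0].message.tool_calls)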
Thanks for the heads-up!
That PR does not fix the wrong finish_reason issue, however.
... still wondering why my PR has not been accepted ... @krrishdholakia
Hey @ChristianWeyer Which PR are you referring to? I might've missed it.
We have finish reason mapping here - https://github.com/BerriAI/litellm/blob/918367cc7bdc9e8e01477243ebc963709ac8178e/litellm/utils.py#L188
This: https://github.com/BerriAI/litellm/pull/2597