Stuck in Function Call Loop
LocalAI version: docker image: localai/localai:v2.22.0-aio-gpu-nvidia-cuda-11
Environment, CPU architecture, OS, and Version: docker on debian, intel i9, nvidia gpu
Describe the bug When using functions, the AI gets stuck in a loop of function calls. It seems it does not understand the tool result. As documented everywhere, after receiving a [TOOL_RESULTS] block the model should process the result and answer the user in the assistant role, not run the same function call again and again...
I'm not sure whether this is an issue with LocalAI, the model, or the chat template. I thought that maybe the tool_call_id is missing, so the model cannot connect the tool result to the function call.
Any ideas?
To Reproduce
Use this API call against v1/chat/completions:
{
  "messages": [
    {
      "role": "system",
      "content": "You are an assistant that helps to transform text into special language."
    },
    {
      "role": "user",
      "content": "Transform this text: ExampleText"
    },
    {
      "role": "assistant",
      "content": "",
      "tool_calls": [
        {
          "id": "06b05978-b3a4-463e-b21f-127bdabb4953",
          "index": 0,
          "type": "function",
          "function": {
            "name": "TestOpenAi_MyToolClass_TransformToSpecialLanguage",
            "arguments": "{\"text\":\"ExampleText\"}"
          }
        }
      ]
    },
    {
      "name": "TestOpenAi_MyToolClass_TransformToSpecialLanguage",
      "role": "tool",
      "content": "nqwprpok--ExampleText-nqwprpok",
      "tool_call_id": "06b05978-b3a4-463e-b21f-127bdabb4953"
    }
  ],
  "model": "gpt-4",
  "response_format": {
    "type": "text"
  },
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "TestOpenAi_MyToolClass_TransformToSpecialLanguage",
        "parameters": {
          "type": "object",
          "properties": {
            "text": {
              "type": "string",
              "description": "The text to transform"
            }
          },
          "required": ["text"],
          "additionalProperties": false
        },
        "strict": true
      }
    }
  ],
  "tool_choice": "auto"
}
The response is the same function call again:
{
  "created": 0,
  "object": "chat.completion",
  "id": "d925ce6d-11f6-4e79-a8c4-5fe4a321a3f6",
  "model": "gpt-4",
  "choices": [{
    "index": 0,
    "finish_reason": "tool_calls",
    "message": {
      "role": "assistant",
      "content": "",
      "tool_calls": [{
        "index": 0,
        "id": "d925ce6d-11f6-4e79-a8c4-5fe4a321a3f6",
        "type": "function",
        "function": {
          "name": "TestOpenAi_MyToolClass_TransformToSpecialLanguage",
          "arguments": "{\"text\":\"ExampleText\"}"
        }
      }]
    }
  }],
  "usage": {
    "prompt_tokens": 190,
    "completion_tokens": 27,
    "total_tokens": 217
  }
}
Expected behavior The response should come from the assistant role, processing the tool/function result.
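For context, the request above follows the standard OpenAI chat format for tool results. A minimal sketch of the client-side wiring this report assumes (the helper name is illustrative, not a LocalAI API; the key point is that the tool message's tool_call_id must equal the id of the assistant's tool call):

```python
import json

def build_followup_messages(history, tool_call, tool_output):
    """Append the assistant tool call and its matching tool result,
    wired together via tool_call_id, as the OpenAI chat format expects."""
    return history + [
        {"role": "assistant", "content": "", "tool_calls": [tool_call]},
        {
            "role": "tool",
            "name": tool_call["function"]["name"],
            "content": tool_output,
            "tool_call_id": tool_call["id"],  # must match the call's id
        },
    ]

call = {
    "id": "06b05978-b3a4-463e-b21f-127bdabb4953",
    "index": 0,
    "type": "function",
    "function": {
        "name": "TestOpenAi_MyToolClass_TransformToSpecialLanguage",
        "arguments": json.dumps({"text": "ExampleText"}),
    },
}
msgs = build_followup_messages(
    [{"role": "user", "content": "Transform this text: ExampleText"}],
    call,
    "nqwprpok--ExampleText-nqwprpok",
)
```

This is exactly the shape of the reproduction payload, so the client-side wiring looks correct; the question is whether the chat template preserves the link when rendering the prompt.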
Logs
12:31PM DBG Request received: {"model":"gpt-4","language":"","translate":false,"n":0,"top_p":null,"top_k":null,"temperature":null,"max_tokens":null,"echo":false,"batch":0,"ignore_eos":false,"repeat_penalty":0,"repeat_last_n":0,"n_keep":0,"frequency_penalty":0,"presence_penalty":0,"tfz":null,"typical_p":null,"seed":null,"negative_prompt":"","rope_freq_base":0,"rope_freq_scale":0,"negative_prompt_scale":0,"use_fast_tokenizer":false,"clip_skip":0,"tokenizer":"","file":"","response_format":{"type":"text"},"size":"","prompt":null,"instruction":"","input":null,"stop":null,"messages":[{"role":"system","content":"You are an assistant that helps to transform text into special language."},{"role":"user","content":"Transform this text: ExampleText"},{"role":"assistant","content":"","tool_calls":[{"index":0,"id":"06b05978-b3a4-463e-b21f-127bdabb4953","type":"function","function":{"name":"TestOpenAi_MyToolClass_TransformToSpecialLanguage","arguments":"{\"text\":\"ExampleText\"}"}}]},{"role":"tool","name":"TestOpenAi_MyToolClass_TransformToSpecialLanguage","content":"nqwprpok--ExampleText-nqwprpok"}],"functions":null,"function_call":null,"tools":[{"type":"function","function":{"name":"TestOpenAi_MyToolClass_TransformToSpecialLanguage","description":"","strict":true,"parameters":{"additionalProperties":false,"properties":{"text":{"description":"The text to transform","type":"string"}},"required":["text"],"type":"object"}}}],"tool_choice":"auto","stream":false,"mode":0,"step":0,"grammar":"","grammar_json_functions":null,"backend":"","model_base_name":""}
12:31PM DBG guessDefaultsFromFile: template already set name=gpt-4
12:31PM DBG Configuration read: &{PredictionOptions:{Model:gpt-4.gguf Language: Translate:false N:0 TopP:0xc0014bbac8 TopK:0xc0014bbad0 Temperature:0xc0014bbad8 Maxtokens:0xc0014bbb08 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 RepeatLastN:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc0014bbb00 TypicalP:0xc0014bbaf8 Seed:0xc0014bbb20 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:gpt-4 F16:0xc0014bbac0 Threads:0xc0014bbab8 Debug:0xc001686940 Roles:map[] Embeddings:0xc0014bbb19 Backend: TemplateConfig:{Chat:{{.Input -}}
ChatMessage:{{if eq .RoleName "user" -}}
[INST] {{.Content }} [/INST]
{{- else if .FunctionCall -}}
[TOOL_CALLS] {{toJson .FunctionCall}} [/TOOL_CALLS]
{{- else if eq .RoleName "tool" -}}
[TOOL_RESULTS] {{.Content}} [/TOOL_RESULTS]
{{- else -}}
{{ .Content -}}
{{ end -}} Completion:{{.Input}}
Edit: Functions:[AVAILABLE_TOOLS] [{{range .Functions}}{"type": "function", "function": {"name": "{{.Name}}", "description": "{{.Description}}", "parameters": {{toJson .Parameters}} }}{{end}} ] [/AVAILABLE_TOOLS]{{.Input }} UseTokenizerTemplate:false JoinChatMessagesByCharacter:0xc0014c9ed0 Video: Image: Audio:} KnownUsecaseStrings:[] KnownUsecases:<nil> PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: ResponseFormat: ResponseFormatMap:map[type:text] FunctionsConfig:{DisableNoAction:true GrammarConfig:{ParallelCalls:true DisableParallelNewLines:true MixedMode:false NoMixedFreeString:false NoGrammar:true Prefix: ExpectStringsAfterJSON:false PropOrder: SchemaType:} NoActionFunctionName: NoActionDescriptionName: ResponseRegex:[] JSONRegexMatch:[(?s)\[TOOL\_CALLS\](.*)] ReplaceFunctionResults:[{Key:(?s)^[^{\[]* Value:} {Key:(?s)[^}\]]*$ Value:} {Key:(?s)\[TOOL\_CALLS\] Value:} {Key:(?s)\[\/TOOL\_CALLS\] Value:}] ReplaceLLMResult:[] CaptureLLMResult:[] FunctionNameKey: FunctionArgumentsKey:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc0014bbaf0 MirostatTAU:0xc0014bbae8 Mirostat:0xc0014bbae0 NGPULayers:0xc0014bbb10 MMap:0xc0014bba68 MMlock:0xc0014bbb19 LowVRAM:0xc0014bbb19 Grammar: StopWords:[<|im_end|> <dummy32000> </tool_call> <|eot_id|> <|end_of_text|> </s> [/TOOL_CALLS] [/ACTIONS]] Cutstrings:[] ExtractRegex:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc0014bba70 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: FlashAttention:false NoKVOffloading:false RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: 
SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} TTSConfig:{Voice: VallE:{AudioPath:}} CUDA:false DownloadFiles:[] Description: Usage:}
12:31PM DBG Response needs to process functions
12:31PM DBG Parameters: &{PredictionOptions:{Model:gpt-4.gguf Language: Translate:false N:0 TopP:0xc0014bbac8 TopK:0xc0014bbad0 Temperature:0xc0014bbad8 Maxtokens:0xc0014bbb08 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 RepeatLastN:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc0014bbb00 TypicalP:0xc0014bbaf8 Seed:0xc0014bbb20 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:gpt-4 F16:0xc0014bbac0 Threads:0xc0014bbab8 Debug:0xc001686940 Roles:map[] Embeddings:0xc0014bbb19 Backend: TemplateConfig:{Chat:{{.Input -}}
ChatMessage:{{if eq .RoleName "user" -}}
[INST] {{.Content }} [/INST]
{{- else if .FunctionCall -}}
[TOOL_CALLS] {{toJson .FunctionCall}} [/TOOL_CALLS]
{{- else if eq .RoleName "tool" -}}
[TOOL_RESULTS] {{.Content}} [/TOOL_RESULTS]
{{- else -}}
{{ .Content -}}
{{ end -}} Completion:{{.Input}}
Edit: Functions:[AVAILABLE_TOOLS] [{{range .Functions}}{"type": "function", "function": {"name": "{{.Name}}", "description": "{{.Description}}", "parameters": {{toJson .Parameters}} }}{{end}} ] [/AVAILABLE_TOOLS]{{.Input }} UseTokenizerTemplate:false JoinChatMessagesByCharacter:0xc0014c9ed0 Video: Image: Audio:} KnownUsecaseStrings:[] KnownUsecases:<nil> PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: ResponseFormat: ResponseFormatMap:map[type:text] FunctionsConfig:{DisableNoAction:true GrammarConfig:{ParallelCalls:true DisableParallelNewLines:true MixedMode:false NoMixedFreeString:false NoGrammar:true Prefix: ExpectStringsAfterJSON:false PropOrder: SchemaType:} NoActionFunctionName: NoActionDescriptionName: ResponseRegex:[] JSONRegexMatch:[(?s)\[TOOL\_CALLS\](.*)] ReplaceFunctionResults:[{Key:(?s)^[^{\[]* Value:} {Key:(?s)[^}\]]*$ Value:} {Key:(?s)\[TOOL\_CALLS\] Value:} {Key:(?s)\[\/TOOL\_CALLS\] Value:}] ReplaceLLMResult:[] CaptureLLMResult:[] FunctionNameKey: FunctionArgumentsKey:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc0014bbaf0 MirostatTAU:0xc0014bbae8 Mirostat:0xc0014bbae0 NGPULayers:0xc0014bbb10 MMap:0xc0014bba68 MMlock:0xc0014bbb19 LowVRAM:0xc0014bbb19 Grammar:realvalue ::= root-0
space ::= " "?
freestring ::= (
[^\x00] |
"\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
)* space
string ::= "\"" (
[^"\\] |
"\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
)* "\"" space
root-0-arguments ::= "{" space "\"text\"" space ":" space string "}" space
root-0-name ::= "\"TestOpenAi_MyToolClass_TransformToSpecialLanguage\""
root-0 ::= "{" space "\"arguments\"" space ":" space root-0-arguments "," space "\"name\"" space ":" space root-0-name "}" space
root ::= arr | realvalue
arr ::=
"[" (
realvalue
("," realvalue)*
)? "]"
mixedstring ::= freestring | freestring arr | freestring realvalue | realvalue | arr StopWords:[<|im_end|> <dummy32000> </tool_call> <|eot_id|> <|end_of_text|> </s> [/TOOL_CALLS] [/ACTIONS]] Cutstrings:[] ExtractRegex:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc0014bba70 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: FlashAttention:false NoKVOffloading:false RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} TTSConfig:{Voice: VallE:{AudioPath:}} CUDA:false DownloadFiles:[] Description: Usage:}
12:31PM DBG templated message for chat: You are an assistant that helps to transform text into special language.
12:31PM DBG templated message for chat: [INST] Transform this text: ExampleText [/INST]
12:31PM DBG templated message for chat: [TOOL_CALLS] [{"index":0,"id":"06b05978-b3a4-463e-b21f-127bdabb4953","type":"function","function":{"name":"TestOpenAi_MyToolClass_TransformToSpecialLanguage","arguments":"{\"text\":\"ExampleText\"}"}}] [/TOOL_CALLS]
12:31PM DBG templated message for chat: [TOOL_RESULTS] nqwprpok--ExampleText-nqwprpok [/TOOL_RESULTS]
12:31PM DBG Prompt (before templating): You are an assistant that helps to transform text into special language.[INST] Transform this text: ExampleText [/INST][TOOL_CALLS] [{"index":0,"id":"06b05978-b3a4-463e-b21f-127bdabb4953","type":"function","function":{"name":"TestOpenAi_MyToolClass_TransformToSpecialLanguage","arguments":"{\"text\":\"ExampleText\"}"}}] [/TOOL_CALLS][TOOL_RESULTS] nqwprpok--ExampleText-nqwprpok [/TOOL_RESULTS]
12:31PM DBG Template found, input modified to: [AVAILABLE_TOOLS] [{"type": "function", "function": {"name": "TestOpenAi_MyToolClass_TransformToSpecialLanguage", "description": "", "parameters": {"additionalProperties":false,"properties":{"text":{"description":"The text to transform","type":"string"}},"required":["text"],"type":"object"} }} ] [/AVAILABLE_TOOLS]You are an assistant that helps to transform text into special language.[INST] Transform this text: ExampleText [/INST][TOOL_CALLS] [{"index":0,"id":"06b05978-b3a4-463e-b21f-127bdabb4953","type":"function","function":{"name":"TestOpenAi_MyToolClass_TransformToSpecialLanguage","arguments":"{\"text\":\"ExampleText\"}"}}] [/TOOL_CALLS][TOOL_RESULTS] nqwprpok--ExampleText-nqwprpok [/TOOL_RESULTS]
12:31PM DBG Prompt (after templating): [AVAILABLE_TOOLS] [{"type": "function", "function": {"name": "TestOpenAi_MyToolClass_TransformToSpecialLanguage", "description": "", "parameters": {"additionalProperties":false,"properties":{"text":{"description":"The text to transform","type":"string"}},"required":["text"],"type":"object"} }} ] [/AVAILABLE_TOOLS]You are an assistant that helps to transform text into special language.[INST] Transform this text: ExampleText [/INST][TOOL_CALLS] [{"index":0,"id":"06b05978-b3a4-463e-b21f-127bdabb4953","type":"function","function":{"name":"TestOpenAi_MyToolClass_TransformToSpecialLanguage","arguments":"{\"text\":\"ExampleText\"}"}}] [/TOOL_CALLS][TOOL_RESULTS] nqwprpok--ExampleText-nqwprpok [/TOOL_RESULTS]
12:31PM DBG Grammar: realvalue ::= root-0
space ::= " "?
freestring ::= (
[^\x00] |
"\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
)* space
string ::= "\"" (
[^"\\] |
"\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
)* "\"" space
root-0-arguments ::= "{" space "\"text\"" space ":" space string "}" space
root-0-name ::= "\"TestOpenAi_MyToolClass_TransformToSpecialLanguage\""
root-0 ::= "{" space "\"arguments\"" space ":" space root-0-arguments "," space "\"name\"" space ":" space root-0-name "}" space
root ::= arr | realvalue
arr ::=
"[" (
realvalue
("," realvalue)*
)? "]"
mixedstring ::= freestring | freestring arr | freestring realvalue | realvalue | arr
12:31PM DBG Model already loaded in memory: gpt-4
12:31PM DBG Checking model availability (gpt-4)
12:31PM DBG Model 'gpt-4' already loaded
12:31PM DBG GRPC(gpt-4-127.0.0.1:46763): stdout {"timestamp":1729341086,"level":"INFO","function":"launch_slot_with_data","line":896,"message":"slot is processing task","slot_id":0,"task_id":399}
12:31PM DBG GRPC(gpt-4-127.0.0.1:46763): stdout {"timestamp":1729341086,"level":"INFO","function":"update_slots","line":1795,"message":"kv cache rm [p0, end)","slot_id":0,"task_id":399,"p0":0}
12:31PM DBG GRPC(gpt-4-127.0.0.1:46763): stdout {"timestamp":1729341087,"level":"INFO","function":"print_timings","line":327,"message":"prompt eval time = 71.69 ms / 190 tokens ( 0.38 ms per token, 2650.37 tokens per second)","slot_id":0,"task_id":399,"t_prompt_processing":71.688,"num_prompt_tokens_processed":190,"t_token":0.37730526315789475,"n_tokens_second":2650.373842205111}
12:31PM DBG GRPC(gpt-4-127.0.0.1:46763): stdout {"timestamp":1729341087,"level":"INFO","function":"print_timings","line":341,"message":"generation eval time = 615.57 ms / 27 runs ( 22.80 ms per token, 43.86 tokens per second)","slot_id":0,"task_id":399,"t_token_generation":615.571,"n_decoded":27,"t_token":22.79892592592593,"n_tokens_second":43.861715382953385}
12:31PM DBG GRPC(gpt-4-127.0.0.1:46763): stdout {"timestamp":1729341087,"level":"INFO","function":"print_timings","line":351,"message":" total time = 687.26 ms","slot_id":0,"task_id":399,"t_prompt_processing":71.688,"t_token_generation":615.571,"t_total":687.259}
12:31PM DBG GRPC(gpt-4-127.0.0.1:46763): stdout {"timestamp":1729341087,"level":"INFO","function":"update_slots","line":1606,"message":"slot released","slot_id":0,"task_id":399,"n_ctx":8192,"n_past":216,"n_system_tokens":0,"n_cache_tokens":217,"truncated":false}
12:31PM DBG ParseTextContent: [{"arguments":{"text":"ExampleText"},"name":"TestOpenAi_MyToolClass_TransformToSpecialLanguage"}]
12:31PM DBG CaptureLLMResult: []
12:31PM DBG LLM result: [{"arguments":{"text":"ExampleText"},"name":"TestOpenAi_MyToolClass_TransformToSpecialLanguage"}]
12:31PM DBG LLM result(processed): [{"arguments":{"text":"ExampleText"},"name":"TestOpenAi_MyToolClass_TransformToSpecialLanguage"}]
12:31PM DBG LLM result: [{"arguments":{"text":"ExampleText"},"name":"TestOpenAi_MyToolClass_TransformToSpecialLanguage"}]
12:31PM DBG Replacing (?s)^[^{\[]* with
12:31PM DBG Replacing (?s)[^}\]]*$ with
12:31PM DBG Replacing (?s)\[TOOL\_CALLS\] with
12:31PM DBG Replacing (?s)\[\/TOOL\_CALLS\] with
12:31PM DBG LLM result(function cleanup): [{"arguments":{"text":"ExampleText"},"name":"TestOpenAi_MyToolClass_TransformToSpecialLanguage"}]
12:31PM DBG Function return: [{"arguments":{"text":"ExampleText"},"name":"TestOpenAi_MyToolClass_TransformToSpecialLanguage"}] [map[arguments:map[text:ExampleText] name:TestOpenAi_MyToolClass_TransformToSpecialLanguage]]
12:31PM DBG Text content to return:
12:31PM DBG Response: {"created":1729341086,"object":"chat.completion","id":"d925ce6d-11f6-4e79-a8c4-5fe4a321a3f6","model":"gpt-4","choices":[{"index":0,"finish_reason":"tool_calls","message":{"role":"assistant","content":"","tool_calls":[{"index":0,"id":"d925ce6d-11f6-4e79-a8c4-5fe4a321a3f6","type":"function","function":{"name":"TestOpenAi_MyToolClass_TransformToSpecialLanguage","arguments":"{\"text\":\"ExampleText\"}"}}]}}],"usage":{"prompt_tokens":190,"completion_tokens":27,"total_tokens":217}}
12:31PM INF Success ip=127.0.0.1 latency=691.053453ms method=POST status=200 url=/v1/chat/completions
You have it on tool_choice auto; the model might not know it has completed its task.
> you have it on tool choice auto, the model might not know it's completed its task.
Do you think it is a model issue? Of course the workaround of removing the tool works for this scenario, but not with multiple tools.
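One workaround along these lines: the OpenAI API defines tool_choice: "none", which forbids further tool calls while still letting the model see the tool definitions and the tool result. Assuming LocalAI honors the field the same way, the follow-up request could be built like this (a sketch; the helper is illustrative):

```python
def followup_request(model, messages, tools):
    """Build the second request, sent after a tool result: keep the tools
    visible for context, but forbid another call with tool_choice="none"."""
    return {
        "model": model,
        "messages": messages,
        "tools": tools,
        "tool_choice": "none",  # model must now answer in plain text
    }

req = followup_request("gpt-4", [{"role": "user", "content": "hi"}], [])
```

This only helps for single-step tool use; if the model legitimately needs a second, different tool call, "none" would block it too.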
Same error with the Qwen2.5-Coder-3B-Instruct-Q4_K_M model. This model supports tools and works as expected in ollama.
Can "tool_choice": "auto" be configured in the model.yaml file?
Here is my model file config:
context_size: 4096
name: Qwen2.5-Coder-3B-Instruct-Q4_K_M
threads: 4
parameters:
  model: Qwen2.5-Coder-3B-Instruct-Q4_K_M.gguf
stopwords:
  - "<|im_end|>"
  - "<dummy32000>"
  - "</tool_call>"
  - "<|eot_id|>"
  - "<|end_of_text|>"
function:
  disable_no_action: true
  grammar:
    disable: true
    #prefix: '<tool_call>\n'
    # parallel_calls: true
  return_name_in_function_response: true
  json_regex_match:
    - "(?s)<tool_call>(.*?)</tool_call>"
    - "(?s)<tool_call>(.*?)"
  replace_llm_results:
    - key: "(?s)<scratchpad>.*</scratchpad>"
      value: ""
  replace_function_results:
    - key: '(?s)^[^{\[]*'
      value: ""
    - key: '(?s)[^}\]]*$'
      value: ""
    - key: "'([^']*?)'"
      value: "_DQUOTE_${1}_DQUOTE_"
    - key: '\\"'
      value: "__TEMP_QUOTE__"
    - key: "\'"
      value: "'"
    - key: "_DQUOTE_"
      value: '"'
    - key: "__TEMP_QUOTE__"
      value: '"'
    - key: "(?s)<scratchpad>.*</scratchpad>"
      value: ""
template:
  chat: |
    {{.Input -}}
    <|im_start|>assistant
  chat_message: |
    <|im_start|>{{if eq .RoleName "assistant"}}assistant{{else if eq .RoleName "system"}}system{{else if eq .RoleName "tool"}}tool{{else if eq .RoleName "user"}}user{{end}}
    {{- if .FunctionCall }}
    <tool_call>
    {{- else if eq .RoleName "tool" }}
    <tool_response>
    {{- end }}
    {{- if .Content}}
    {{.Content }}
    {{- end }}
    {{- if .FunctionCall}}
    {{toJson .FunctionCall}}
    {{- end }}
    {{- if .FunctionCall }}
    </tool_call>
    {{- else if eq .RoleName "tool" }}
    </tool_response>
    {{- end }}<|im_end|>
  completion: |
    {{.Input}}
  function: |-
    <|im_start|>system
    You are a function calling AI model.
    Here are the available tools:
    <tools>
    {{range .Functions}}
    {'type': 'function', 'function': {'name': '{{.Name}}', 'description': '{{.Description}}', 'parameters': {{toJson .Parameters}} }}
    {{end}}
    </tools>
    You should call the tools provided to you sequentially
    Please use <scratchpad> XML tags to record your reasoning and planning before you call the functions as follows:
    <scratchpad>
    {step-by-step reasoning and plan in bullet points}
    </scratchpad>
    For each function call return a json object with function name and arguments within <tool_call> XML tags as follows:
    <tool_call>
    {"arguments": <args-dict>, "name": <function-name>}
    </tool_call><|im_end|>
    {{.Input -}}
    <|im_start|>assistant
and here is the request body (sent via curl):
{
  "model": "Qwen2.5-Coder-3B-Instruct-Q4_K_M",
  "language": "",
  "translate": false,
  "n": 0,
  "top_p": 0.9,
  "top_k": null,
  "temperature": 0.2,
  "max_tokens": 100,
  "echo": false,
  "batch": 0,
  "ignore_eos": false,
  "repeat_penalty": 0,
  "repeat_last_n": 0,
  "n_keep": 0,
  "frequency_penalty": 0,
  "presence_penalty": 0,
  "tfz": null,
  "typical_p": null,
  "seed": null,
  "negative_prompt": "",
  "rope_freq_base": 0,
  "rope_freq_scale": 0,
  "negative_prompt_scale": 0,
  "use_fast_tokenizer": false,
  "clip_skip": 0,
  "tokenizer": "",
  "file": "",
  "size": "",
  "prompt": null,
  "instruction": "",
  "input": null,
  "stop": null,
  "messages": [
    {
      "role": "user",
      "content": "55 + 55 =?"
    }
  ],
  "functions": [
    {
      "name": "add",
      "description": "Add two numbers",
      "strict": false,
      "parameters": {
        "properties": {
          "a": { "description": "first parameter for add", "type": "number" },
          "b": { "description": "second parameter for add", "type": "number" }
        },
        "required": ["a", "b"],
        "type": "object"
      }
    },
    {
      "name": "multiply",
      "description": "Multiply two numbers",
      "strict": false,
      "parameters": {
        "properties": {
          "a": { "description": "first parameter for multiply", "type": "number" },
          "b": { "description": "second parameter for multiply", "type": "number" }
        },
        "required": ["a", "b"],
        "type": "object"
      }
    },
    {
      "name": "subtract",
      "description": "Subtract two numbers",
      "strict": false,
      "parameters": {
        "properties": {
          "a": { "description": "first parameter for subtract", "type": "number" },
          "b": { "description": "second parameter for subtract", "type": "number" }
        },
        "required": ["a", "b"],
        "type": "object"
      }
    },
    {
      "name": "divide",
      "description": "Divide two numbers",
      "strict": false,
      "parameters": {
        "properties": {
          "a": { "description": "first parameter for divide", "type": "number" },
          "b": { "description": "second parameter for divide", "type": "number" }
        },
        "required": ["a", "b"],
        "type": "object"
      }
    }
  ],
  "function_call": null,
  "stream": false,
  "mode": 0,
  "step": 0,
  "grammar": "",
  "grammar_json_functions": null,
  "backend": "",
  "model_base_name": ""
}
Response: {"created":1735285317,"object":"chat.completion","id":"c55e5c22-3d71-4f74-bcd4-19704ba6b620","model":"Qwen2.5-Coder-3B-Instruct-Q4_K_M","choices":[{"index":0,"finish_reason":"function_call","message":{"role":"assistant","content":"","function_call":{"arguments":"{"a": 55, "b": 55}","name":"add"}}}],"usage":{"prompt_tokens":468,"completion_tokens":28,"total_tokens":496}}
I can confirm the same problem with gemma-3-27b-it-qat and the Qwen3 models (30b and 32b).
More precisely, this is always a problem with storing things (e.g. adding an item to a shopping cart, storing something in a graph database, etc.). Retrieving information works well (e.g. reading shopping list items, reading the graph database). For example, when I say "add cheese to the shopping cart", it will add something like 10 instances of cheese. But when I say "retrieve items from the shopping cart", it will call one function and then give the answer.
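Until the root cause is fixed, one client-side mitigation for the "10 instances of cheese" symptom is to refuse to re-execute a tool call whose name and arguments are identical to one already executed in the current turn. A sketch (note this deliberately breaks legitimate repeated writes, so it's a stopgap, not a fix):

```python
import json

def is_repeat_call(executed, call):
    """Return True if an identical (name, arguments) pair was already
    executed this turn; otherwise record it and return False."""
    args = json.loads(call["function"]["arguments"])
    # Canonicalize argument order so {"a":1,"b":2} == {"b":2,"a":1}
    key = (call["function"]["name"], json.dumps(args, sort_keys=True))
    if key in executed:
        return True
    executed.add(key)
    return False

seen = set()
first = {"function": {"name": "add_to_cart", "arguments": '{"item": "cheese"}'}}
again = {"function": {"name": "add_to_cart", "arguments": '{"item": "cheese"}'}}
```

When `is_repeat_call` returns True, the client can stop executing and instead send the conversation back with no further tool execution, forcing a plain-text answer.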
I doubt that it's coming from tool_choice. I cut the whole function definition when responding to the tool call and also do not append tool_choice when sending the second call. Still, it looped for me too when I asked it to summarize a text and it decided to call a websearch for "how to summarize a text". It kept looping until I cut it off.
What I did was the typical approach explained in the OpenAI documentation:
- I sent the prompt with tool definitions and tool_choice auto.
- Then I retrieved the tool call from the response and parsed the id and arguments.
- After doing the websearch in my code, I sent back another request with the conversation history and the tool call information, along with the id and the response gathered from the websearch. No tool definitions are sent back to the AI, yet it still seems to read what it sent before and build another tool call from it.
I think for some reason it interprets the old user message as another tool call instead of looking for the tool call id and result.
I've been using multiple models; gemma3-12b-it is my main model and the issue occurred there.
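A general safety net, independent of the root cause, is to cap the number of tool-call rounds per user turn so a confused model cannot loop forever. A sketch with a simplified response shape (`call_model` and `execute_tool` stand in for whatever client and tool runner you use; a real response nests `finish_reason` under `choices[0]`):

```python
MAX_TOOL_ROUNDS = 5

def run_turn(call_model, execute_tool, messages):
    """Drive the tool loop, bailing out after MAX_TOOL_ROUNDS rounds."""
    for _ in range(MAX_TOOL_ROUNDS):
        reply = call_model(messages)
        if reply.get("finish_reason") != "tool_calls":
            return reply  # normal assistant answer, loop is over
        msg = reply["message"]
        messages.append(msg)  # keep the tool call in the history
        for call in msg["tool_calls"]:
            messages.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": execute_tool(call),
            })
    raise RuntimeError("tool-call loop exceeded MAX_TOOL_ROUNDS")

def looping_model(_msgs):
    """Fake model that always asks for the same tool, like the bug here."""
    return {
        "finish_reason": "tool_calls",
        "message": {"role": "assistant", "content": "", "tool_calls": [
            {"id": "x", "function": {"name": "f", "arguments": "{}"}}
        ]},
    }
```

With the looping model above, `run_turn` raises after five rounds instead of hammering the backend indefinitely.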
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days.
This issue was closed because it has been stalled for 5 days with no activity.