
chatml-function-calling chat format fails to generate multi calls to the same tool

Open jeffmaury opened this issue 1 year ago • 1 comment

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • [x] I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • [x] I carefully followed the README.md.
  • [x] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • [x] I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

The response should contain multiple calls to the provided tool, one per city in the question.
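For the script below (two cities in one question), the message in the response should carry two tool calls, roughly like this (a sketch using the OpenAI v1 response objects that create_chat_completion_openai_v1 returns; ids and coordinate values are illustrative):

# Expected shape of response.choices[0].message.tool_calls (illustrative):
# [
#   ChatCompletionMessageToolCall(id="...", type="function",
#       function=Function(name="get_current_weather",
#                         arguments='{"latitude": -33.87, "longitude": 151.21}')),  # Sydney
#   ChatCompletionMessageToolCall(id="...", type="function",
#       function=Function(name="get_current_weather",
#                         arguments='{"latitude": 48.85, "longitude": 2.35}')),     # Paris
# ]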

Current Behavior

Only a single tool call is generated.

Environment and Context

Please provide detailed information about your computer setup. This is important in case the issue is not reproducible except for under certain specific conditions.

  • Physical (or virtual) hardware you are using: Windows 11 Pro on a laptop

Failure Information (for bugs)

Please help provide information about the failure if this is a bug. If it is not a bug, please remove the rest of this template.

Steps to Reproduce

Please provide detailed steps for reproducing the issue. We are not sitting in front of your screen, so the more detail the better.

Download the model file from https://huggingface.co/MaziyarPanahi/Mistral-7B-Instruct-v0.3-GGUF/resolve/main/Mistral-7B-Instruct-v0.3.Q4_K_M.gguf, then run:


from llama_cpp.llama import Llama

llm = Llama(
    model_path="Mistral-7B-Instruct-v0.3.Q4_K_M.gguf",
    chat_format='chatml-function-calling',
)

SYSTEM_MESSAGE="""
You are a helpful assistant.
You can call functions with appropriate input when necessary.
You can call the same function several times.
"""

weather = {
  "type": "function",
  "function": {
    "name": "get_current_weather",
    "description": "Get the current weather in a given latitude and longitude",
    "parameters": {
      "type": "object",
      "properties": {
        "latitude": {
          "type": "number",
          "description": "The latitude of a place",
        },
        "longitude": {
          "type": "number",
          "description": "The longitude of a place",
        },
      },
      "required": ["latitude", "longitude"],
    },
  },
}

question = "What's the weather like in the following cities: Sydney and Paris ?"
messages = [
  {"role": "system", "content": SYSTEM_MESSAGE},
  {"role": "user", "content": question}
]
response = llm.create_chat_completion_openai_v1(
    messages, tools=[weather], tool_choice='auto'
)
print(response)
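To make the failure easy to verify, the tool calls can be counted like this (a sketch, not part of the original report; attribute access follows the OpenAI v1 objects that create_chat_completion_openai_v1 returns):

import json

# Count and inspect the generated tool calls. With tool_choice='auto' this
# currently prints one call instead of the expected two.
tool_calls = response.choices[0].message.tool_calls or []
print(f"{len(tool_calls)} tool call(s) generated")
for call in tool_calls:
    print(call.function.name, json.loads(call.function.arguments))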

Failure Logs

N/A

jeffmaury · Sep 23 '24 20:09

See #1503; although it was done mainly for built-in chat templates, it should also work for the chatml-function-calling format. Currently it only works when selecting the tool specifically, not with 'auto', though I'm sure that can be made to work as well.
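For example, selecting the tool explicitly rather than passing tool_choice='auto' would look like this (a sketch based on the OpenAI tool_choice format, reusing the script from the report above):

response = llm.create_chat_completion_openai_v1(
    messages,
    tools=[weather],
    # Name the tool explicitly instead of tool_choice='auto'; per the comment
    # above, multiple calls to the same tool currently only work in this mode.
    tool_choice={"type": "function", "function": {"name": "get_current_weather"}},
)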

CISC · Sep 24 '24 12:09