
Please provide examples for function calling (aka "tools")

Open stippi2 opened this issue 9 months ago • 5 comments

Thanks for making this cookbook and Phi-3 available!

In short:

  1. I would love an example of a system message that declares some tools with OpenAPI function definitions, and explains how to invoke them.
  2. How are function results even returned to the model? There seem to be no dedicated roles for that. (Why would the model realize that it needs to compile the results of a function call into a coherent message to the actual user?)

For a bit more context:

I would like to use Phi-3 in an application with function calling. In my Phi-3 evaluation, I struggle to come up with a system message that successfully explains function calling to the model. I have Phi-3 running on LM Studio on a PC and on Ollama on a Mac. When I craft a system message that declares tools and instructs the model how to invoke them, the best I could achieve was that the model replied along the lines of "If I wouldn't run in a simulation, I would call the tool X with the parameters Y to answer this request", instead of just doing it. In many cases, I get large blobs of text after the initial reply that look like training data. This "training data" looks very similar to what I am trying to do with my system message. Sometimes it contains sections that I didn't think of and that are not just a rephrasing of my attempt at a system prompt.

I've also worked with LangChain and the experimental Ollama function calling package. The system message it generates is very simple. As long as there is a single tool defined, invoking it works reliably. But when multiple tools are defined and the model needs to decide whether it even needs a tool at all, the generated output becomes very unreliable, often with rambling text appended that looks like training data. Or it generates a function call, but then continues with hallucinated results for the call instead of waiting for them in the next message.

stippi2 avatar May 22 '24 15:05 stippi2

Hi @stippi2

Can I play the following back to ensure I understand what you're asking?

Example: System message to demonstrate how you can declare tools using OpenAPI and instruct your Phi-3 model on how to invoke these functions effectively.

We will define a simple tool: `greet`, which returns a greeting. Our goal is to provide clear instructions for the model so it understands when and how to use such functions in its responses.

Here's an example of what your OpenAPI definition might look like:

openapi: "3.0.0"
info:
  title: Tool Interactions API
  version: "1.0.0"
servers:
  - url: http://myapp.com/docs/swagger#!/tools
paths:
  /greet/{name}:
    get:
      summary: Greet a person by name
      description: This tool generates a greeting message for the provided name parameter.
      parameters:
        - name: name
          in: path
          required: true
          schema:
            type: string
          example: "Hello"
      responses:
        "200":
          description: Successful response containing a greeting message.
          content:
            application/json:
              schema:
                type: object
                properties:
                  message:
                    type: string

Now, let's craft a system message to guide the Phi-3 model on how it should handle this API request and invoke the greet tool.

System Message:

[START]
{
  "api": {
    "type": "tool",
    "name": "greet",
    "parameters": [
      {
        "key": "name",
        "value": "Hello"
      }
    ],
    "responseFormat": "application/json",
    "functionCall": true,
    "functionName": "invoke_tool",
    "dataType": "object",
    "expectedResponseSchema": {
      "type": "object",
      "properties": {
        "message": {
          "type": "string"
        }
      },
      "required": ["message"]
    }
  }
}
[END]

In this system message, we're informing the Phi-3 model that it should invoke a tool called greet with the parameter "Hello". We also specify the expected response format and schema using OpenAPI definitions. The [START] and [END] tokens mark the beginning and end of the instruction block for clarity.

Phi-3, LM Studio and Ollama should be able to understand this message based on their understanding of your APIs' OpenAPI specifications. When you invoke Phi-3 using the provided system message:

invoke_tool with name = Hello

Phi-3 will correctly call the greet tool, and it is expected that it would return a response like this in JSON format (assuming there's an implementation for greeting):

{
  "message": "Hello, nice to meet you!"
}

Phi-3 should then produce the corresponding human-readable message, based on the Phi model's training:

Hi! I got a request using the `greet` tool with name = Hello. The response from the tool is: "Hello, nice to meet you!".

Remember that you might need additional setup for your applications (e.g., API server configuration) or adjustments in Phi-3's training and model logic if you run into issues when invoking functions.

leestott avatar May 22 '24 16:05 leestott

Thanks for responding so quickly! My mention of "OpenAPI" was completely misleading, as I had it confused with JSON Schema. Sorry about that.

In my existing application, I am using the OpenAI API and declare a tools array. I would like to evaluate Phi-3 as an alternative model to use in this application. The runtimes I can try all have no native support for tools. So I am trying to craft my own system message for explaining the available tools to the model.

I understand this is the prompt format that is fed into the Phi-3 model:

<|system|>You are a helpful AI assistant.<|end|><|user|>Can you introduce yourself?<|end|><|assistant|>
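For reference, here is a minimal sketch (Python) of how I assemble such a prompt string from a list of chat messages. The role tags follow the Phi-3 chat template quoted above; the helper name is just my own, and LM Studio or Ollama may already apply a template like this internally.

# Minimal sketch: flatten a chat history into the Phi-3 prompt format quoted above.
def build_phi3_prompt(system_prompt, turns):
    # turns: list of {"role": "user" | "assistant", "content": str}
    prompt = f"<|system|>{system_prompt}<|end|>"
    for turn in turns:
        prompt += f"<|{turn['role']}|>{turn['content']}<|end|>"
    # End with the assistant tag so the model continues as the assistant.
    return prompt + "<|assistant|>"

print(build_phi3_prompt(
    "You are a helpful AI assistant.",
    [{"role": "user", "content": "Can you introduce yourself?"}],
))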

Your example of a system message has me confused:

  1. What is the meaning of the [START] and [END] tags?
  2. Why is nothing explained to the model?

Given this array of tools in OpenAI tools format:

[
  {
    "type": "function",
    "function": {
      "name": "add_alarm",
      "description": "Add an alarm to the active timers. Displayed as an alarm for the given time.",
      "parameters": {
        "type": "object",
        "properties": {
          "time": {
            "type": "string",
            "description": "The exact time when the timer should go off, in the format 'YYYY-MM-DD HH:MM:SS'."
          },
          "title": {
            "type": "string",
            "description": "Optional title of the timer."
          }
        },
        "required": ["time"]
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "add_countdown",
      "description": "Add a countdown timer to the active timers. Displayed as counting down to zero.",
      "parameters": {
        "type": "object",
        "properties": {
          "duration": {
            "type": "string",
            "description": "A duration in ISO 8601 format."
          },
          "title": {
            "type": "string",
            "description": "Optional title of the timer."
          }
        },
        "required": ["duration"]
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "delete_timer",
      "description": "Cancel one of the active timers",
      "parameters": {
        "type": "object",
        "properties": {
          "id": {
            "type": "string"
          }
        },
        "required": ["id"]
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "evaluate_expression",
      "description": "Evaluate a mathematical expression in the mathjs syntax",
      "parameters": {
        "type": "object",
        "properties": {
          "expression": {
            "type": "string"
          }
        },
        "required": ["expression"]
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "get_current_weather",
      "description": "Get the current weather in a given location",
      "parameters": {
        "type": "object",
        "properties": {
          "latitude": {
            "type": "number"
          },
          "longitude": {
            "type": "number"
          }
        },
        "required": ["latitude", "longitude"]
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "get_weather_forecast",
      "description": "Get the weather forecast for 5 days with data every 3 hours in a given location",
      "parameters": {
        "type": "object",
        "properties": {
          "latitude": {
            "type": "number"
          },
          "longitude": {
            "type": "number"
          }
        },
        "required": ["latitude", "longitude"]
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "get_places_info",
      "description": "Get information about nearby places using the Google Places API",
      "parameters": {
        "type": "object",
        "properties": {
          "latitude": {
            "type": "number"
          },
          "longitude": {
            "type": "number"
          },
          "radius": {
            "type": "number",
            "description": "The radius in meters around the given location"
          },
          "query": {
            "type": "string",
            "description": "A text query like the name of a nearby place"
          },
          "fields": {
            "type": "array",
            "items": {
              "type": "string"
            },
            "description": "A list of fields to retrieve for each place. Available fields are 'formattedAddress', 'regularOpeningHours', 'currentOpeningHours', 'types', 'rating' and 'websiteUri'"
          },
          "maxResults": {
            "type": "number"
          }
        },
        "required": ["latitude", "longitude", "query", "fields"]
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "play_on_spotify",
      "description": "Start playing tracks, an album, artist or a playlist on Spotify. Calling this function replaces the current playlist!",
      "parameters": {
        "type": "object",
        "properties": {
          "trackIds": {
            "type": "array",
            "items": {
              "type": "string"
            },
            "description": "Optional. An array of track IDs"
          },
          "contextUri": {
            "type": "string",
            "description": "Optional. The Spotify URI of an album, artist, or playlist."
          }
        },
        "required": []
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "find_artists_and_play_top_songs_on_spotify",
      "description": "Searches for 'queries' on Spotify and plays top songs of the found artist(s). Calling this function replaces the current playlist! Pass multiple artists to one tool invocation to play a mix of top songs from different artists.",
      "parameters": {
        "type": "object",
        "properties": {
          "queries": {
            "type": "array",
            "items": {
              "type": "string"
            },
            "description": "One or more queries to find artists by."
          }
        },
        "required": ["queries"]
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "find_on_spotify",
      "description": "Find tracks, artists, albums or playlists on Spotify",
      "parameters": {
        "type": "object",
        "properties": {
          "query": {
            "type": "string"
          },
          "types": {
            "type": "array",
            "items": {
              "type": "string"
            },
            "description": "Types to search across. Valid types are: 'track', 'artist', 'album', 'playlist', 'show', and 'episode'."
          },
          "limit": {
            "type": "integer",
            "description": "The maximum number of items to return"
          }
        },
        "required": ["query", "types"]
      }
    }
  }
]

My actual application contains a lot more tools, and I realize that Phi-3 may be overwhelmed and that I may have to reduce their number. But I include this longer list intentionally so you can see that the tools are diverse and that the model needs to understand that only one tool is relevant, or perhaps no tool at all.
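For context, my current naive attempt simply dumps this tools array into the system message together with a short instruction of my own invention; nothing here is an official Phi-3 convention, and "tools.json" is just a hypothetical file holding the array shown above.

import json

# Naive sketch of what I do today: embed the OpenAI-style tools array verbatim
# into the system message. The instruction text is my own guess, not a format
# Phi-3 was trained on.
with open("tools.json") as f:
    tools = json.load(f)

system_prompt = (
    "You are a helpful assistant. To use a tool, reply with exactly one JSON object "
    'of the form {"name": <tool name>, "arguments": {...}} and nothing else. '
    "Available tools:\n" + json.dumps(tools, indent=2)
)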

What would be the final prompt where these tools are embedded into a system message that Phi-3 has been trained on? How is a result from a tool appended to the prompt, such that the model understands it needs to phrase that result into a message for the user?

stippi2 avatar May 22 '24 16:05 stippi2

@stippi2 OK, so this comes down purely to the system message (metaprompt).

Creating a final prompt for Phi-3 with embedded tools involves structuring commands and responses in a way that clearly indicates when a tool function is being invoked.

The system should parse these invocations, execute the corresponding action (tool usage), and then seamlessly integrate the results back into user interactions.

Here's an example of how this can be achieved with a custom metaprompt/system message: see https://learn.microsoft.com/azure/ai-services/openai/concepts/system-message

System message for Tools Embedded

You are a virtual assistant capable of handling various tasks through a series of tools designed to interact with different APIs and systems. Based on the user's input, identify which tool function should be used and execute it accordingly. Below is an overview of available tools:

1. `get_weather_forecast`: Retrieves weather forecast for 5 days in a given location (latitude, longitude).
2. `find_on_spotify`: Finds tracks, artists, albums or playlists on Spotify using queries and types.
3. ... [rest of the tools as listed] ...

To use a tool function:
- Specify the name of the tool followed by its parameters within backticks (`). For example, `get_weather_forecast({latitude: 40.7128, longitude: -74.0060})`.
- Ensure that you provide all required parameters in a valid JSON format if needed.

Upon receiving user input or task requirements, determine the appropriate tool to use and execute it. Integrate the results from tools into your responses to the users as follows:

1. If a tool function successfully executes and returns data (e.g., weather forecast), append its result directly in the response message using clear formatting (e.g., "The 5-day weather forecast for New York City is:").
2. When invoking tools with parameters, ensure you describe the task context clearly to maintain consistency and understanding. For example: "Based on your location at [latitude, longitude], here's today's weather."
3. If multiple tool functions could be relevant or no specific tool is identified for a given request, carefully choose one based on user needs, or suggest alternatives if necessary.

Here are some examples of how to format interactions:

Example 1 (Weather Forecast):
- User Input: "What's the weather like in Tokyo today?"
- Assistant Response: "Based on your location at [Tokyo coordinates], here's today's weather forecast."

Example Tool Invocation: `get_weather_forecast({latitude: 35.6895, longitude: 139.6917})`
- Assistant Response Integration: "Based on your location in Tokyo (35.6895, 139.6917), the weather forecast for today includes a high of 24°C and no precipitation."

Example 2 (Spotify Search):
- User Input: "Find me some popular tracks by Ed Sheeran"
- Assistant Response Invoking Tool: `find_on_spotify({query: 'Ed Sheeran', types: ['artist']})`
- Assistant Response Integration: "Based on your request for Ed Sheeran, here are the top 5 popular tracks by him."

Remember to maintain a friendly and engaging tone in all interactions while providing accurate information. Continue learning from user feedback to improve responses over time.

This prompt format provides clear instructions for tool usage and response integration, ensuring that Phi-3 can effectively utilize the tools within its capabilities without getting overwhelmed.
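On the application side, detecting and dispatching these backtick invocations could look roughly like the Python sketch below. The regex, the dispatch table and the tool implementations are illustrative assumptions, not part of Phi-3 or any SDK.

import json
import re

# Matches invocations of the form `tool_name({...})` in the model output.
TOOL_CALL = re.compile(r"`(\w+)\((\{.*?\})\)`", re.DOTALL)

def dispatch_tool_call(model_output, tools):
    # tools: dict mapping a tool name to a Python callable taking keyword arguments.
    match = TOOL_CALL.search(model_output)
    if match is None:
        return None  # The model answered directly; no tool was requested.
    name, raw_args = match.group(1), match.group(2)
    if name not in tools:
        return {"error": f"unknown tool: {name}"}
    # Note: the example invocations above use unquoted keys, which is not strict
    # JSON; in practice, instruct the model to emit strict JSON or parse leniently.
    args = json.loads(raw_args)
    return {"tool": name, "result": tools[name](**args)}

Whatever this returns then has to be handed back to the model in a follow-up turn so it can phrase the final answer for the user.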

leestott avatar May 22 '24 16:05 leestott

Thanks a lot for that! I hope it's OK if I try to clarify how the prompt sent to the model evolves over the turns of a conversation with the user when tool invocations are involved.

Assume your long system message is stored in a variable named systemPrompt. Somewhere in the system prompt, I would integrate a section about available real-time data, for example the user's current location (latitude and longitude) and the current time and date.

The initial prompt could be (line-breaks for clarity):

<|system|>${systemPrompt}<|end|>
<|user|>Hi, what time is it?<|end|>
<|assistant|>

Since the time can be found in the system prompt, I would expect Phi-3 to reply with something like:

It's half past ten.

The user could follow up with a question about the weather:

<|system|>${systemPrompt}<|end|>
<|user|>Hi, what time is it?<|end|>
<|assistant|>It's half past ten.<|end|>
<|user|>And what's the weather like?<|end|>
<|assistant|>

From your explanation above, I would expect the assistant replies with:

`get_current_weather({latitude: 35.6895, longitude: 139.6917})`

In my app, I detect the tool invocation, make the API call to the weather service, and now I want to forward the JSON response to the model. What would the prompt look like now? Do I need to use the "user" role?

<|system|>${systemPrompt}<|end|>
<|user|>Hi, what time is it?<|end|>
<|assistant|>It's half past ten.<|end|>
<|user|>And what's the weather like?<|end|>
<|assistant|>`get_current_weather({latitude: 35.6895, longitude: 139.6917})`<|end|>
<|user|>???<|end|>
<|assistant|>

What do I insert for the ??? question marks? Do I indeed use the "user" role? If so, how does the model know it needs to rephrase that result, given that the user seemingly already has this information? Is this concatenation of the message turns into one prompt even how it works at all?
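For illustration, my current guess (and it is only a guess, I don't know whether this matches how Phi-3 was trained) is to label the tool result inside a user turn, roughly like this, with example weather values made up for the sake of the sketch:

<|system|>${systemPrompt}<|end|>
<|user|>Hi, what time is it?<|end|>
<|assistant|>It's half past ten.<|end|>
<|user|>And what's the weather like?<|end|>
<|assistant|>`get_current_weather({latitude: 35.6895, longitude: 139.6917})`<|end|>
<|user|>Result of get_current_weather: {"temperature": 24, "unit": "celsius", "condition": "clear"}. Please answer my previous question based on this result.<|end|>
<|assistant|>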

Thanks!

stippi2 avatar May 23 '24 07:05 stippi2

I will update the content to cover this in the next few weeks, thanks.

kinfey avatar May 23 '24 09:05 kinfey