
Ability to "reply" to a tool-response with a prompt carrying those tool results

Open · simonw opened this issue on Apr 20, 2025

Part of tools, #898

simonw avatar Apr 20 '25 03:04 simonw

This may be the harder design problem (than #935 and #936). The way these are represented in different LLM APIs may differ quite a bit. Let's figure that out:

Anthropic's example looks like this. Note the presence of a toolu_01A09q90qw90lq917835lq9 ID in the tool request which is later reflected in the tool_use_id of the tool_result message:

{
    "model": "claude-3-7-sonnet-20250219",
    "max_tokens": 1024,
    "tools": [
        {
            "name": "get_weather",
            "description": "Get the current weather in a given location",
            "input_schema": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The unit of temperature, either \"celsius\" or \"fahrenheit\""
                    }
                },
                "required": ["location"]
            }
        }
    ],
    "messages": [
        {
            "role": "user",
            "content": "What is the weather like in San Francisco?"
        },
        {
            "role": "assistant",
            "content": [
                {
                    "type": "text",
                    "text": "<thinking>I need to use get_weather, and the user wants SF, which is likely San Francisco, CA.</thinking>"
                },
                {
                    "type": "tool_use",
                    "id": "toolu_01A09q90qw90lq917835lq9",
                    "name": "get_weather",
                    "input": {
                        "location": "San Francisco, CA",
                        "unit": "celsius"
                    }
                }
            ]
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "tool_result",
                    "tool_use_id": "toolu_01A09q90qw90lq917835lq9",
                    "content": "15 degrees"
                }
            ]
        }
    ]
}

OpenAI's docs don't include a full JSON example, but there's this Python code from https://platform.openai.com/docs/guides/function-calling?api-mode=responses&lang=python#function-calling-steps

input_messages.append(tool_call)  # append model's function call message
input_messages.append({                               # append result message
    "type": "function_call_output",
    "call_id": tool_call.call_id,
    "output": str(result)
})

response_2 = client.responses.create(
    model="gpt-4.1",
    input=input_messages,
    tools=tools,
)
print(response_2.output_text)

I just realized that the Chat Completion and Responses APIs may differ here. I think I'll implement this just as Responses in the https://github.com/simonw/llm-openai-plugin plugin.

simonw avatar Apr 20 '25 03:04 simonw

Here's a Gemini example: https://ai.google.dev/gemini-api/docs/function-calling?example=meeting#step_4_create_user_friendly_response_with_function_result_and_call_the_model_again

// Create a function response part
const function_response_part = {
  name: tool_call.name,
  response: { result }
}

// Append function call and result of the function execution to contents
contents.push({ role: 'model', parts: [{ functionCall: tool_call }] });
contents.push({ role: 'user', parts: [{ functionResponse: function_response_part }] });

// Get the final response from the model
const final_response = await ai.models.generateContent({
  model: 'gemini-2.0-flash',
  contents: contents,
  config: config
});

console.log(final_response.text);

simonw avatar Apr 20 '25 03:04 simonw

The challenge of matching tool call IDs to tool response IDs may become a lot easier if I implement this design change first:

  • https://github.com/simonw/llm/issues/938#issuecomment-2816986647

simonw avatar Apr 20 '25 04:04 simonw

I've got most of the prerequisites for this in the tools/ branch now: https://github.com/simonw/llm/commits/f8cd7be60097161da1968335ba78e3e3942899a3/

Here's where I'm at:

import llm
model = llm.get_model("gpt-4.1-mini")

def get_weather(city: str) -> str:
    """Get the weather for a given city."""
    return f"The weather in {city} is fine."

response = model.prompt(
    "Weather in San Francisco?",
    tools=[llm.Tool.function(get_weather)]
)
response.tool_calls()

Output:

[ToolCall(name='get_weather', arguments={'city': 'San Francisco'})]

simonw avatar May 10 '25 18:05 simonw

Problems to solve:

  1. Executing the functions. I realize now that I forgot to stash the actual function in prompt.tools - so right now we don't have a useful way to turn that get_weather name into a reference to what we need to execute (see the sketch after this list).
  2. Where in the code does the execution happen?
  3. Different models have different ways of sending replies. An abstraction for those?
  4. Is this a place where we want response.reply(tool_results) which then sparks a new prompt? How does our llm.Conversation abstraction hear about those?
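
On problem 1, one option (just a sketch, not necessarily how this will end up in LLM) is to keep the callable itself around in a registry keyed by name, so a ToolCall can be resolved back to something executable:

def get_weather(city: str) -> str:
    """Get the weather for a given city."""
    return f"The weather in {city} is fine."

# Hypothetical: stash the implementation alongside the tool definition so the
# name in a ToolCall can be looked up and executed later.
tools_by_name = {fn.__name__: fn for fn in [get_weather]}

def execute_tool_call(tool_call):
    implementation = tools_by_name[tool_call.name]
    # ToolCall.arguments is a dict, e.g. {'city': 'San Francisco'}
    return implementation(**tool_call.arguments)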

simonw avatar May 10 '25 18:05 simonw

A Response has an optional .conversation property referencing a Conversation or AsyncConversation.

I believe this is None for prompts that started directly using model.prompt(...).

On that basis, I think this mechanism needs to work independently of conversations - but should append to them if a conversation is in play.

simonw avatar May 10 '25 19:05 simonw

At some point we will want code that executes tools in a loop - you pass in a prompt with some tools and it keeps executing the requested tool calls and feeding the results back until the model finishes or hits a loop limit (maybe default to ten, but allow users to set it to None for no limit).
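
Roughly, that loop could look like this - a sketch only, where execute_tool_call() and the tool_results= parameter are assumptions rather than the final API:

def chain(conversation, prompt, tools, chain_limit=10):
    # Run the prompt, execute any requested tools, feed the results back,
    # and repeat until the model stops asking for tools or the limit hits.
    response = conversation.prompt(prompt, tools=tools)
    rounds = 0
    while True:
        calls = response.tool_calls()
        if not calls:
            return response
        rounds += 1
        # chain_limit=None means no limit
        if chain_limit is not None and rounds > chain_limit:
            raise RuntimeError("chain_limit exceeded")
        results = [execute_tool_call(call) for call in calls]
        # Assumed parameter: some way to send tool results back as the next prompt
        response = conversation.prompt(tool_results=results, tools=tools)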

On top of this we will want to build UIs that stream tokens and then show what tools were executed and then stream more tokens.

simonw avatar May 10 '25 19:05 simonw

I need an abstraction like ToolCall for a result that then gets sent to the model - I'm going to create something called ToolResult. Needs to handle:

Anthropic:

{
    "type": "tool_result",
    "tool_use_id": "toolu_01A09q90qw90lq917835lq9",
    "content": "15 degrees"
}

OpenAI:

[
  {
    "tool_call_id": "call_001",
    "output": "70 degrees and sunny."
  }
]

Gemini: https://ai.google.dev/api/caching#FunctionResponse - also had o4-mini-high try to figure out the curl pattern since the docs only covered Python and JavaScript: https://chatgpt.com/share/681fa97d-2214-8006-a073-d1be126109bd

{
  "id": string,
  "name": string,
  "response": {
    object
  }
}

Gemini id docs say:

Optional. The id of the function call this response is for. Populated by the client to match the corresponding function call id.
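
Based on those three formats, something like this could work - a sketch, where the field names on ToolResult are my guesses and the provider dicts are copied from the examples above:

from dataclasses import dataclass
from typing import Optional

@dataclass
class ToolResult:
    name: str                            # needed by Gemini
    output: str                          # stringified tool output
    tool_call_id: Optional[str] = None   # Anthropic tool_use_id / OpenAI call id

def anthropic_tool_result(result: ToolResult) -> dict:
    return {
        "type": "tool_result",
        "tool_use_id": result.tool_call_id,
        "content": result.output,
    }

def openai_tool_result(result: ToolResult) -> dict:
    return {"tool_call_id": result.tool_call_id, "output": result.output}

def gemini_function_response(result: ToolResult) -> dict:
    return {
        "id": result.tool_call_id,
        "name": result.name,
        "response": {"result": result.output},
    }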

simonw avatar May 10 '25 19:05 simonw

https://github.com/simonw/llm/blob/614941dbe5f4ef56ba4ca2ef4b9321c163ca301e/llm/models.py#L174-L178

simonw avatar May 10 '25 19:05 simonw

I guess Prompt is going to grow a tool_results: List[ToolResult] property then.

simonw avatar May 10 '25 19:05 simonw

I decided to use the verb chain() for this - for the thing where you end up with a chain of prompts and responses due to tool calls in the middle.

I was a tiny bit nervous about the overlap with LangChain's use of the term Chain, but they are using it as a top-level noun where I'm using it for a verb so I think I can get away without confusing things too much. A LangChain chain is more of a DAG workflow. My thing is restricted to prompts-tools-prompts sequences.

I tried adding a model.chain() method but then realized that this stuff needs to take place in a Conversation in order to be able to take advantage of the code I've written in the past for building message arrays and sending over the full history of the conversation. So I prototyped a conversation.chain(...) method and it seems to work!

About to push the first prototype of that.

simonw avatar May 12 '25 00:05 simonw

git diff | llm -s 'describe change'

The change introduces support for chaining multiple LLM responses, especially to enable tool calls to be executed in sequence within a conversation. The key modifications are:

  1. In llm/default_plugins/openai_models.py:

    • Added a tool_call_id field to the ToolCall creation to track individual tool calls.
  2. In llm/models.py:

    • Extended the ToolCall data class to include an optional tool_call_id field.
    • Added a new method chain to the Conversation class that prepares a ChainResponse object. This method accepts prompt parameters, tools, tool results, and other options to initiate chained interactions.
    • Defined a new class _BaseChainResponse which handles iterating over responses and executing any returned tool calls automatically:
      • It runs the initial prompt.
      • For each response, it checks for tool calls, executes them using the provided tool implementations, collects their outputs, and sends the results back to the model in a new prompt.
      • This process repeats until no more tool calls are present or a chain limit is reached.
      • It supports both synchronous iteration of messages and obtaining the full aggregated text output.
    • Introduced ChainResponse as a subclass of _BaseChainResponse for clarity and future extensibility.

Overall, this change adds orchestration logic to enable multi-step chains of LLM responses interleaved with tool executions, enhancing the conversation model's ability to handle complex workflows involving external tools.

simonw avatar May 12 '25 00:05 simonw

Against my llm-ollama fork:

import llm
model = llm.get_model("qwen3:4b")

def get_weather(city: str) -> str:
    """Get the weather for a given city."""
    return f"The weather in {city} is fine."

for s in model.conversation().chain("Weather in San Fran", tools=[llm.Tool.function(get_weather)]):
    print(s, end="", flush=True)

Output:

<think>
Okay, the user asked for the weather in San Francisco. I called the get_weather function with the city parameter set to "San Fran". The response from the tool said the weather is fine. Now I need to present this information back to the user in a clear and friendly way. Let me make sure to mention the city and the weather condition. Maybe add a sentence like "The weather in San Fran is fine." That should cover it. I should check if there's any additional info needed, but since the tool response is straightforward, this should be sufficient.
</think>

The weather in San Fran is fine.

simonw avatar May 12 '25 00:05 simonw

Tried with OpenAI and it went into an infinite loop, had to implement my chain limit to see that.

LLM_OPENAI_SHOW_RESPONSES=1 python

Then:

import llm
model = llm.get_model("gpt-4.1-mini")

def get_weather(city: str) -> str:
    """Get the weather for a given city."""
    return f"The weather in {city} is fine."

for s in model.conversation().chain("Weather in San Fran", tools=[llm.Tool.function(get_weather)]):
    print(s, end="", flush=True)

It was looping because I forgot to implement the bit in the OpenAI plugin where the tool results are sent back to OpenAI.

simonw avatar May 12 '25 00:05 simonw

Oops, I implemented Anthropic when I should have been implementing OpenAI. Anthropic code in build_messages() looks something like this:

        if prompt.tool_results:
            messages.append(
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "tool_result",
                            "tool_use_id": tool_result.tool_call_id,
                            "content": tool_result.output,
                        }
                        for tool_result in prompt.tool_results
                    ],
                }
            )

simonw avatar May 12 '25 00:05 simonw

... and then I implemented the OpenAI Responses API:

for tool_result in prompt.tool_results:
    messages.append({
        "type": "function_call_output",
        "call_id": tool_result.tool_call_id,
        "output": tool_result.output,
    })

But I should have implemented the Chat API - https://platform.openai.com/docs/guides/function-calling?api-mode=chat&lang=python#function-calling-steps
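
For the Chat Completions API the shape (per those docs) is a message with role "tool" keyed by tool_call_id - so presumably something more like:

for tool_result in prompt.tool_results:
    messages.append({
        "role": "tool",
        "tool_call_id": tool_result.tool_call_id,
        "content": tool_result.output,
    })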

simonw avatar May 12 '25 00:05 simonw

Got to this error:

openai.BadRequestError: Error code: 400 - {'error': {'message': "Invalid parameter: messages with role 'tool' must be a response to a preceeding message with 'tool_calls'.", 'type': 'invalid_request_error', 'param': 'messages.[3].role', 'code': None}}

So I need to ensure older conversations have their tool calls copied in properly.

simonw avatar May 12 '25 00:05 simonw

Got a demo script working like this (upgraded from a bad LLM-generated script):

import os
import json
from openai import OpenAI
import openai

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))


# 3️⃣ Dummy tool implementation
def get_weather(location: str) -> dict:
    """
    A stand-in for a real weather API. Returns
    consistent dummy data for demonstration.
    """
    return {
        "location": location,
        "temperature": "20°C",
        "description": "Sunny with light breeze",
    }


# 4️⃣ First pass: ask the model, let it decide to call get_weather
response = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Weather in San Francisco?"},
    ],
    functions=[
        {
            "name": "get_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city to get the weather for",
                    }
                },
                "required": ["location"],
            },
        }
    ],
    function_call="auto",
)  # let the model choose to call get_weather

message = response.choices[0].message

print(message)

# 5️⃣ If the model wants to call our function, execute it…
if message.function_call:
    args = json.loads(message.function_call.arguments)
    weather_data = get_weather(**args)

    # 6️⃣ …then send the function’s result back into the conversation
    follow_up = client.chat.completions.create(
        model="gpt-4.1-mini",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Weather in San Francisco?"},
            message,  # includes the function_call from the assistant
            {
                "role": "function",
                "name": "get_weather",
                "content": json.dumps(weather_data),
            },
        ],
    )

    # 7️⃣ Finally, display the assistant’s answer
    print(follow_up.choices[0].message.content)
else:
    # If no function was called, just print the text
    print(message.content)

Run it like this:

OPENAI_LOG=debug uv run --with openai python demo.py

Gave me this:

[2025-05-11 17:46:36 - openai._base_client:453 - DEBUG] Request options: {'method': 'post', 'url': '/chat/completions', 'files': None, 'json_data': {'messages': [{'role': 'system', 'content': 'You are a helpful assistant.'}, {'role': 'user', 'content': 'Weather in San Francisco?'}], 'model': 'gpt-4.1-mini', 'function_call': 'auto', 'functions': [{'name': 'get_weather', 'description': 'Get the current weather in a given location', 'parameters': {'type': 'object', 'properties': {'location': {'type': 'string', 'description': 'The city to get the weather for'}}, 'required': ['location']}}]}}
[2025-05-11 17:46:36 - openai._base_client:952 - DEBUG] Sending HTTP Request: POST https://api.openai.com/v1/chat/completions
[2025-05-11 17:46:37 - httpx:1025 - INFO] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
[2025-05-11 17:46:37 - openai._base_client:991 - DEBUG] HTTP Response: POST https://api.openai.com/v1/chat/completions "200 OK" Headers([('date', 'Mon, 12 May 2025 00:46:37 GMT'), ('content-type', 'application/json'), ('transfer-encoding', 'chunked'), ('connection', 'keep-alive'), ('access-control-expose-headers', 'X-Request-ID'), ('openai-organization', 'user-r3e61fpak04cbaokp5buoae4'), ('openai-processing-ms', '713'), ('openai-version', '2020-10-01'), ('x-envoy-upstream-service-time', '717'), ('x-ratelimit-limit-requests', '30000'), ('x-ratelimit-limit-tokens', '150000000'), ('x-ratelimit-remaining-requests', '29999'), ('x-ratelimit-remaining-tokens', '149999982'), ('x-ratelimit-reset-requests', '2ms'), ('x-ratelimit-reset-tokens', '0s'), ('x-request-id', 'req_640fda1fe641188c6c85e99d7b6df841'), ('strict-transport-security', 'max-age=31536000; includeSubDomains; preload'), ('cf-cache-status', 'DYNAMIC'), ('set-cookie', '__cf_bm=....Chs; path=/; expires=Mon, 12-May-25 01:16:37 GMT; domain=.api.openai.com; HttpOnly; Secure; SameSite=None'), ('x-content-type-options', 'nosniff'), ('set-cookie', '_cfuvid=mH5IuTS4yFwmGbUDICggJHqzC9.YPRCMILUpIIWWJeM-1747010797231-0.0.1.1-604800000; path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None'), ('server', 'cloudflare'), ('cf-ray', '93e5e66579477c23-LAX'), ('content-encoding', 'br'), ('alt-svc', 'h3=":443"; ma=86400')])
[2025-05-11 17:46:37 - openai._base_client:999 - DEBUG] request_id: req_640fda1fe641188c6c85e99d7b6df841
ChatCompletionMessage(content=None, refusal=None, role='assistant', annotations=[], audio=None, function_call=FunctionCall(arguments='{"location":"San Francisco"}', name='get_weather'), tool_calls=None)
[2025-05-11 17:46:37 - openai._base_client:453 - DEBUG] Request options: {'method': 'post', 'url': '/chat/completions', 'files': None, 'json_data': {'messages': [{'role': 'system', 'content': 'You are a helpful assistant.'}, {'role': 'user', 'content': 'Weather in San Francisco?'}, {'content': None, 'refusal': None, 'role': 'assistant', 'annotations': [], 'function_call': {'arguments': '{"location":"San Francisco"}', 'name': 'get_weather'}}, {'role': 'function', 'name': 'get_weather', 'content': '{"location": "San Francisco", "temperature": "20\\u00b0C", "description": "Sunny with light breeze"}'}], 'model': 'gpt-4.1-mini'}}
[2025-05-11 17:46:37 - openai._base_client:952 - DEBUG] Sending HTTP Request: POST https://api.openai.com/v1/chat/completions
[2025-05-11 17:46:38 - httpx:1025 - INFO] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
[2025-05-11 17:46:38 - openai._base_client:991 - DEBUG] HTTP Response: POST https://api.openai.com/v1/chat/completions "200 OK" Headers({'date': 'Mon, 12 May 2025 00:46:38 GMT', 'content-type': 'application/json', 'transfer-encoding': 'chunked', 'connection': 'keep-alive', 'access-control-expose-headers': 'X-Request-ID', 'openai-organization': 'user-r3e61fpak04cbaokp5buoae4', 'openai-processing-ms': '887', 'openai-version': '2020-10-01', 'x-envoy-upstream-service-time': '893', 'x-ratelimit-limit-requests': '30000', 'x-ratelimit-limit-tokens': '150000000', 'x-ratelimit-remaining-requests': '29999', 'x-ratelimit-remaining-tokens': '149999957', 'x-ratelimit-reset-requests': '2ms', 'x-ratelimit-reset-tokens': '0s', 'x-request-id': 'req_77d2e9f133161ff4c1fc0a36de8af3ae', 'strict-transport-security': 'max-age=31536000; includeSubDomains; preload', 'cf-cache-status': 'DYNAMIC', 'x-content-type-options': 'nosniff', 'server': 'cloudflare', 'cf-ray': '93e5e66b7e7e7c23-LAX', 'content-encoding': 'br', 'alt-svc': 'h3=":443"; ma=86400'})
[2025-05-11 17:46:38 - openai._base_client:999 - DEBUG] request_id: req_77d2e9f133161ff4c1fc0a36de8af3ae
The current weather in San Francisco is sunny with a light breeze, and the temperature is around 20°C.

simonw avatar May 12 '25 00:05 simonw

Here's the JSON body I needed to see:

{
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Weather in San Francisco?"
    },
    {
      "content": null,
      "refusal": null,
      "role": "assistant",
      "annotations": [],
      "function_call": {
        "arguments": "{\"location\":\"San Francisco\"}",
        "name": "get_weather"
      }
    },
    {
      "role": "function",
      "name": "get_weather",
      "content": "{\"location\": \"San Francisco\", \"temperature\": \"20\\u00b0C\", \"description\": \"Sunny with light breeze\"}"
    }
  ],
  "model": "gpt-4.1-mini"
}

simonw avatar May 12 '25 00:05 simonw

So to get my code working I need to add that assistant message with the previous function call.

simonw avatar May 12 '25 00:05 simonw

Actually found docs here: https://platform.openai.com/docs/api-reference/chat/create#chat-create-messages

[Screenshot of the chat/create messages documentation showing the assistant tool_calls format]

So I need to do this:

{
  "role": "assistant",
  "tool_calls": [
    {"function": ..., "id": ..., "type": "function"}
  ]
}
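
Filled out with placeholder values (mine, not from the docs), the assistant message and the "tool" reply that has to follow it look something like:

# Placeholder values throughout - the important part is that the "tool"
# message's tool_call_id matches the id inside the assistant message's tool_calls.
messages_to_append = [
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": "call_abc123",  # placeholder call ID
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "arguments": '{"city": "San Francisco"}',
                },
            }
        ],
    },
    {
        "role": "tool",
        "tool_call_id": "call_abc123",  # must match the id above
        "content": "The weather in San Francisco is fine.",
    },
]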

simonw avatar May 12 '25 00:05 simonw

Got this error:

openai.BadRequestError: Error code: 400 - {'error': {'message': "An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'. The following tool_call_ids did not have response messages: call_f1JCjCgvNcpzPBxG8UGGikWu", 'type': 'invalid_request_error', 'param': 'messages.[3].role', 'code': None}}

The messages I was sending looked like this:

[{'content': 'Weather in San Fran', 'role': 'user'},
 {'content': '', 'role': 'assistant'},
 {'role': 'assistant',
  'tool_calls': [{'function': {'arguments': '{"city": "San Francisco"}',
                               'name': 'get_weather'},
                  'id': 'call_f1JCjCgvNcpzPBxG8UGGikWu',
                  'type': 'function'}]},
 {'content': '', 'role': 'user'},
 {'content': 'The weather in San Francisco is fine.',
  'role': 'tool',
  'tool_call_id': 'call_f1JCjCgvNcpzPBxG8UGGikWu'}]

I think the error is caused by {'content': '', 'role': 'user'}.

simonw avatar May 12 '25 00:05 simonw

OK, I now have a working tool implementation against both OpenAI and Ollama - at least for the streaming, synchronous case.

simonw avatar May 12 '25 01:05 simonw

Playing with this dangerous example (exec!):

import llm
model = llm.get_model("gpt-4.1-mini")

def exec_python(code: str) -> str:
    """Evaluate Python code and return anything output using print"""
    import io
    import sys
    # Redirect stdout to capture print outputs
    old_stdout = sys.stdout
    captured_output = io.StringIO()
    sys.stdout = captured_output
    try:
        # Execute the code
        exec(code)
        # Get the captured output
        output = captured_output.getvalue()
    finally:
        # Restore the original stdout
        sys.stdout = old_stdout
    return output

prompt = "Write and execute code to print a 40x40 ascii-art mandelbrot, do not import anything extra"

conversation = model.conversation()
for s in conversation.chain(prompt, tools=[llm.Tool.function(exec_python)]).details():
    print(s, end="", flush=True)

I added that .details() method to output debug information on tool calls and their responses.

simonw avatar May 12 '25 01:05 simonw

This is fun:

import llm
model = llm.get_model("gpt-4.1-mini")

def search_images(q: str) -> str:
    """Search for images on my blog for the given single word query."""
    import httpx

    response = httpx.get("https://simonwillison.net/dashboard/search-image-srcs.json?search=" + q)
    response.raise_for_status()
    return response.json()

prompt = "Described some pelican images used on my blog"

conversation = model.conversation()
for s in conversation.chain(prompt, tools=[llm.Tool.function(search_images)]).details():
    print(s, end="", flush=True)

Output:

Tool call requested: search_images(q=pelican)

{"title": "Search image SRCs", "queries": [{"sql": "SEL...

Then:

Here are some pelican images used on your blog along with their descriptions and links:

  1. A pelican image titled "Trying out QvQ - Qwen's new visual reasoning model," showing pelicans on bicycles. Link: https://simonwillison.net/e/8684 Images:

    • https://static.simonwillison.net/static/2024/count-pelicans-easy.jpg
    • https://static.simonwillison.net/static/2024/pelicans-on-bicycles-veo2.jpg
  2. A pelican image used in "You can now run prompts against images, audio and video in your terminal using LLM." Link: https://simonwillison.net/e/8582 Image: https://static.simonwillison.net/static/2024/pelican.jpg

  3. An image titled "Initial impressions of GPT-4.5" featuring a pelican. Link: https://simonwillison.net/e/8810 Image: https://static.simonwillison.net/static/2025/pelican-gpt45.jpg

  4. Image from the entry "First impressions of the new Amazon Nova LLMs (via a new llm-bedrock plugin)" showing pelicans. Link: https://simonwillison.net/e/8645 Image: https://static.simonwillison.net/static/2024/pelicans.jpg

  5. Two pelicans image used in "Notes on Google's Gemma 3." Link: https://simonwillison.net/e/8847 Images:

    • https://static.simonwillison.net/static/2025/two-pelicans.jpg
    • https://static.simonwillison.net/static/2025/gemma-3-pelican.svg
  6. "GPT-4.1: Three new million token input models from OpenAI, including their cheapest model yet" with pelicans. Link: https://simonwillison.net/e/8857 Images:

    • https://static.simonwillison.net/static/2025/gpt-4.1-pelican.jpg
    • https://static.simonwillison.net/static/2025/two-pelicans.jpg
  7. A pelican bicycle illustration in "I can now run a GPT-4 class model on my laptop." Link: https://simonwillison.net/e/8647 Image: https://static.simonwillison.net/static/2024/pelican-bicycle-llama.svg

  8. "Gemini 2.0 Flash: An outstanding multi-modal LLM with a sci-fi streaming mode" featuring pelicans. Link: https://simonwillison.net/e/8678 Images:

    • https://static.simonwillison.net/static/2024/pelicans.jpg
    • https://static.simonwillison.net/static/2024/pelican-bbox.jpg
  9. "I built an automaton called Squadron" with pelican images. Link: https://simonwillison.net/e/8844 Images:

    • https://static.simonwillison.net/static/2025/two-pelicans.jpg
    • https://static.simonwillison.net/static/2025/notes-pelican.jpg
  10. "Qwen2.5-Coder-32B is an LLM that can code well that runs on my Mac" with a pelican illustration. Link: https://simonwillison.net/e/8587 Image: https://static.simonwillison.net/static/2024/qwen-pelican.svg

There are several other pelican images used in various blog entries covering topics like Midjourney, DALL-E image generation, GPT tokenizers, Gemini 2.5 Pro, and more. Let me know if you want details about any specific image or entry!

simonw avatar May 12 '25 01:05 simonw

Looking at this code:

conversation = model.conversation()
for s in conversation.chain(prompt, tools=[llm.Tool.function(search_images)]).details():
    print(s, end="", flush=True)

I think I want a model.chain() method which actually just creates a conversation and forwards on to it.

And I want to be able to say tools=[search_images] and have that get automatically wrapped in llm.Tool.function().
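
Something like this could handle both - a sketch of the idea as a standalone helper, not the final implementation, assuming llm.Tool is the class that llm.Tool.function() returns instances of:

import llm

def model_chain(model, prompt, *, tools=None, **kwargs):
    # Hypothetical helper mirroring the proposed model.chain(): create a fresh
    # conversation, wrap bare callables in llm.Tool.function(), and forward on.
    wrapped = [
        tool if isinstance(tool, llm.Tool) else llm.Tool.function(tool)
        for tool in (tools or [])
    ]
    return model.conversation().chain(prompt, tools=wrapped, **kwargs)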

simonw avatar May 12 '25 01:05 simonw

Got this simpler version working instead:

import llm
model = llm.get_model("gpt-4.1-mini")

def search_images(q: str) -> str:
    """Search for images on my blog for the given single word query."""
    import httpx

    response = httpx.get("https://simonwillison.net/dashboard/search-image-srcs.json?search=" + q)
    response.raise_for_status()
    return response.json()

prompt = "Described first three pelican images used on my blog"

for s in model.chain(prompt, tools=[search_images]):
    print(s, end="", flush=True)

simonw avatar May 12 '25 01:05 simonw

This will do for the moment.

simonw avatar May 12 '25 01:05 simonw