Simon Willison
Looks like there's new code for chat in this branch: https://github.com/Blaizzy/mlx-vlm/tree/pc/video - e.g. https://github.com/Blaizzy/mlx-vlm/commit/810fb532c873054bdcb35998719538732a99a5f1
Here's a concrete example of how I'd like to be able to use `mlx-vlm`, taken from my new `llm-mlx` plugin: https://github.com/simonw/llm-mlx/blob/01fa4ed83deab763af2d05ea2594ce857eeae532/llm_mlx.py#L76-L105

More information on that here: https://simonwillison.net/2025/Feb/15/llm-mlx/
The hard part here will be dealing with streaming. Here's what OpenAI does there, from https://platform.openai.com/docs/guides/function-calling?api-mode=responses#streaming

```
{"type":"response.output_item.added","response_id":"resp_1234xyz","output_index":0,"item":{"type":"function_call","id":"fc_1234xyz","call_id":"call_1234xyz","name":"get_weather","arguments":""}}
{"type":"response.function_call_arguments.delta","response_id":"resp_1234xyz","item_id":"fc_1234xyz","output_index":0,"delta":"{\""}
{"type":"response.function_call_arguments.delta","response_id":"resp_1234xyz","item_id":"fc_1234xyz","output_index":0,"delta":"location"}
{"type":"response.function_call_arguments.delta","response_id":"resp_1234xyz","item_id":"fc_1234xyz","output_index":0,"delta":"\":\""}
{"type":"response.function_call_arguments.delta","response_id":"resp_1234xyz","item_id":"fc_1234xyz","output_index":0,"delta":"Paris"}
{"type":"response.function_call_arguments.delta","response_id":"resp_1234xyz","item_id":"fc_1234xyz","output_index":0,"delta":","}
{"type":"response.function_call_arguments.delta","response_id":"resp_1234xyz","item_id":"fc_1234xyz","output_index":0,"delta":" France"}
{"type":"response.function_call_arguments.delta","response_id":"resp_1234xyz","item_id":"fc_1234xyz","output_index":0,"delta":"\"}"}
{"type":"response.function_call_arguments.done","response_id":"resp_1234xyz","item_id":"fc_1234xyz","output_index":0,"arguments":"{\"location\":\"Paris, France\"}"}
{"type":"response.output_item.done","response_id":"resp_1234xyz","output_index":0,"item":{"type":"function_call","id":"fc_1234xyz","call_id":"call_2345abc","name":"get_weather","arguments":"{\"location\":\"Paris, France\"}"}}
...
```
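To convince myself I understand that protocol, here's a rough sketch of the accumulation a client would have to do. Nothing here is real API - it just consumes dicts shaped like the example events above:

```python
import json


def accumulate_tool_calls(events):
    """Collect streamed function call events into completed tool calls.

    `events` is an iterable of dicts shaped like the example events above.
    The arguments string is built up one fragment at a time, so nothing is
    executable until the ...arguments.done event for that call arrives.
    """
    calls = {}  # item id -> {"name": ..., "arguments": ""}
    for event in events:
        if event["type"] == "response.output_item.added":
            item = event["item"]
            if item.get("type") == "function_call":
                calls[item["id"]] = {"name": item["name"], "arguments": ""}
        elif event["type"] == "response.function_call_arguments.delta":
            calls[event["item_id"]]["arguments"] += event["delta"]
        elif event["type"] == "response.function_call_arguments.done":
            # The done event carries the full string - treat it as the
            # authoritative value rather than the concatenated deltas
            calls[event["item_id"]]["arguments"] = event["arguments"]
    return [
        {"name": call["name"], "arguments": json.loads(call["arguments"])}
        for call in calls.values()
    ]
```

The key point being that tool execution can't start until the `done` event for each call has arrived.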
Ugh, I also need to decide if/how I'm going to support multiple parallel tool call requests, which are a thing for at least OpenAI and Gemini:

- https://platform.openai.com/docs/guides/function-calling/parallel-function-calling?api-mode=chat#parallel-function-calling
- https://ai.google.dev/gemini-api/docs/function-calling?example=meeting#parallel_function_calling
I think I'm going to model this so it can ALWAYS represent multiple tool execution requests, then models that only support one tool at a time can populate a list...
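Very roughly, the shape I have in mind (every name here is a placeholder, not a final design):

```python
from dataclasses import dataclass, field


@dataclass
class SketchResponse:
    """Sketch: a response always carries a list of requested tool calls.

    Models that only support one tool call per turn simply populate a
    single-item list, so consuming code never has to special-case the
    parallel vs. single situation.
    """

    text: str = ""
    tool_calls: list[dict] = field(default_factory=list)
```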
Two ways this could work:

- We hold off on executing any tool calls until the request has finished coming in
- We integrate with the streaming mechanism, so a...
Since the `response` object is available inside that `.execute()` method already, the easiest thing to do would be to have a `response.request_tool(...)` method or similar. This can be called multiple...
What to call this? Claude returns messages of type `tool_use`:

```json
{
  "type": "tool_use",
  "id": "toolu_01A09q90qw90lq917835lq9",
  "name": "get_weather",
  "input": {
    "location": "San Francisco, CA",
    "unit": "celsius"
  }
}
```

OpenAI...
Decision: I'm going with `response.add_tool_call(...)` - the feature is called "tools" and having it as `add_tool_call()` reminds us that it can be called more than once. (I considered `response.request_tool_call()` but...
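Roughly what I expect that to look like from inside a plugin's `execute()` method. This is purely a sketch: `_generate()`, `chunk.tool_calls` and the exact `add_tool_call()` signature are all placeholders, not settled API.

```python
import llm


class ToolCapableModel(llm.Model):
    model_id = "tool-capable-model"

    def execute(self, prompt, stream, response, conversation):
        # _generate() stands in for whatever talks to the underlying model
        # and parses tool invocations out of its output
        for chunk in self._generate(prompt):
            if chunk.text:
                yield chunk.text
            for call in chunk.tool_calls:
                # Parallel tool calls just mean calling this more than once
                response.add_tool_call(name=call.name, arguments=call.arguments)
```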
Next question: do I need a `ToolCall` abstraction to paper over differences between different models? I'm going to assume so and start with that; I'll simplify later if it's not...
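Something along these lines could work as a first pass - a sketch only, where the field names and `from_*` constructors are guesses based on the documented Claude and OpenAI formats. The plain dicts in the earlier response sketch would become instances of this:

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToolCall:
    """Provider-neutral representation of a single requested tool call."""

    name: str
    arguments: dict
    call_id: Optional[str] = None

    @classmethod
    def from_anthropic(cls, block: dict) -> "ToolCall":
        # Claude tool_use blocks carry already-parsed arguments in "input"
        return cls(name=block["name"], arguments=block["input"], call_id=block["id"])

    @classmethod
    def from_openai(cls, tool_call: dict) -> "ToolCall":
        # OpenAI chat completions deliver arguments as a JSON string
        return cls(
            name=tool_call["function"]["name"],
            arguments=json.loads(tool_call["function"]["arguments"]),
            call_id=tool_call["id"],
        )
```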