Simon Willison

2,418 comments by Simon Willison

The OpenAI web search stuff needs this too:
- #837

Example from https://platform.openai.com/docs/guides/tools-web-search?api-mode=chat&lang=curl#output-and-citations

```json
[
  {
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "the model response is here...",
      "refusal": null,
      ...
```
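A minimal sketch (not the `llm` library's actual code) of pulling `url_citation` annotations out of a chat completion choice shaped like that example — the `annotations` list lives on the message, alongside the content:

```python
# Sketch: extract url_citation payloads from an OpenAI-style chat
# completion choice. Shapes follow the example response above; this
# is illustrative, not the llm library's real implementation.

def extract_citations(choice):
    """Return the url_citation payloads from a choice's message."""
    message = choice.get("message", {})
    citations = []
    for ann in message.get("annotations") or []:
        if ann.get("type") == "url_citation":
            citations.append(ann["url_citation"])
    return citations


choice = {
    "index": 0,
    "message": {
        "role": "assistant",
        "content": "the model response is here...",
        "refusal": None,
        "annotations": [
            {
                "type": "url_citation",
                "url_citation": {
                    "url": "https://example.com/",
                    "title": "Example",
                    "start_index": 0,
                    "end_index": 10,
                },
            }
        ],
    },
}

for citation in extract_citations(choice):
    print(citation["title"], citation["url"])
```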

OpenAI example (including streaming) here: - https://github.com/simonw/llm/issues/837#issuecomment-2734976520

Here's a challenge: in streaming mode OpenAI only returns the annotations at the very end - but I'll already have printed the text out to the screen by the time...
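One possible shape for handling that, sketched with simplified stand-in chunk dicts (not the real OpenAI streaming payloads): print text the moment it arrives, collect any annotations that show up, and append them as a sources footer once the stream has finished.

```python
# Sketch: text streams to the screen immediately; annotations that
# only arrive in the final chunk get printed as footnotes afterwards.

def collect_stream(chunks):
    """Print text as it streams; return (full_text, annotations)."""
    text_parts, annotations = [], []
    for chunk in chunks:
        if chunk.get("text"):
            print(chunk["text"], end="")  # already on screen by now
            text_parts.append(chunk["text"])
        annotations.extend(chunk.get("annotations") or [])
    return "".join(text_parts), annotations


chunks = [
    {"text": "Hello "},
    {"text": "world."},
    # the annotations only show up at the very end
    {"annotations": [{"title": "Example", "url": "https://example.com/"}]},
]
text, annotations = collect_stream(chunks)
print()
for i, ann in enumerate(annotations, 1):
    print(f"[{i}] {ann['title']}: {ann['url']}")
```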

Let's look at what Anthropic does for streaming citations. Without streaming:

```bash
curl https://api.anthropic.com/v1/messages \
  -H "content-type: application/json" \
  -H "x-api-key: $(llm keys get anthropic)" \
  -H "anthropic-version: 2023-06-01" \
  ...
```
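For contrast with OpenAI's end-of-stream annotations: as I understand Anthropic's docs, streamed citations arrive interleaved as `citations_delta` deltas inside `content_block_delta` events, next to the text they support. A sketch of consuming that shape, with simplified stand-in event dicts rather than real SSE payloads:

```python
# Sketch: accumulate text and citations from simplified
# content_block_delta events (text_delta / citations_delta).

def process_events(events):
    """Return (full_text, citations) from simplified event dicts."""
    text_parts, citations = [], []
    for event in events:
        if event.get("type") != "content_block_delta":
            continue
        delta = event["delta"]
        if delta["type"] == "text_delta":
            text_parts.append(delta["text"])
        elif delta["type"] == "citations_delta":
            citations.append(delta["citation"])
    return "".join(text_parts), citations


events = [
    {"type": "content_block_delta",
     "delta": {"type": "text_delta", "text": "According to the doc, "}},
    {"type": "content_block_delta",
     "delta": {"type": "citations_delta", "citation": {"cited_text": "example"}}},
    {"type": "content_block_delta",
     "delta": {"type": "text_delta", "text": "citations interleave."}},
]
text, citations = process_events(events)
print(text)
```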

I pushed my prototype so far - the one dodgy part of it is that I got Claude to rewrite the `logs_list` command to use `Response.from_row()` in order to test...

Current TODO list:
- Need to think about how to handle that streaming case, both as a Python API and in terms of how plugins should handle it. Currently plugins yield strings...

If we *did* start optionally yielding `Chunk()` from the `execute()` method (and its async variant), we could teach the `Response.chunks()` method to yield chunks as they become available. In terms...
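A hypothetical sketch of that idea — plugins' `execute()` may yield either plain strings (as they do today) or `Chunk` objects carrying extra data, and `Response.chunks()` normalizes both. All names and signatures here are illustrative, not the library's actual API:

```python
# Sketch: Chunk objects with an optional annotation dict, plus a
# Response.chunks() that wraps bare strings so callers always see
# Chunk instances. Illustrative only.
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Chunk:
    text: str
    annotation: Optional[dict] = None  # optional data attached to this chunk


class Response:
    def __init__(self, stream):
        self._stream = stream

    def chunks(self) -> Iterator[Chunk]:
        # Plugins may yield str or Chunk; normalize to Chunk
        for item in self._stream:
            yield item if isinstance(item, Chunk) else Chunk(text=item)

    def text(self) -> str:
        return "".join(chunk.text for chunk in self.chunks())


response = Response(["Hello ", Chunk("world.", annotation={"url": "https://example.com/"})])
print(response.text())
```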

I think I like `chunk.annotation` more than `chunk.data` for the optional `dict` of data attached to a chunk. I'll leave it as `annotation.data` though because `annotation.annotation` is a bit weird.

This feature may be the point at which I need a `llm prompt --json` option which outputs JSON instead of text. This could work using newline-delimited JSON for streaming mode...
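Sketching what that hypothetical `--json` streaming output could look like — one JSON object per chunk, one object per line, so consumers can parse each line as it arrives. The key names here are my own invention:

```python
# Sketch: render stream chunks as newline-delimited JSON, attaching
# annotation data to the chunk it belongs to. Illustrative only.
import json


def to_ndjson(chunks):
    """Render chunk dicts as a list of JSON lines."""
    lines = []
    for chunk in chunks:
        record = {"text": chunk["text"]}
        if chunk.get("annotation"):
            record["annotation"] = chunk["annotation"]
        lines.append(json.dumps(record))
    return lines


chunks = [
    {"text": "Hello "},
    {"text": "world.", "annotation": {"url": "https://example.com/"}},
]
print("\n".join(to_ndjson(chunks)))
```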

Related issue full of confused people: - https://github.com/github/docs/issues/14626#issuecomment-1472892737