New OpenAI GPT-5.1 (and 5.1 Codex) models
- https://platform.openai.com/docs/models/gpt-5.1
- https://platform.openai.com/docs/models/gpt-5.1-chat-latest
- https://platform.openai.com/docs/models/gpt-5.1-codex
- https://platform.openai.com/docs/models/gpt-5.1-codex-mini
```diff
diff --git a/llm/default_plugins/openai_models.py b/llm/default_plugins/openai_models.py
index 94c1ffc..2c10877 100644
--- a/llm/default_plugins/openai_models.py
+++ b/llm/default_plugins/openai_models.py
@@ -182,6 +182,30 @@ def register_models(register):
             supports_tools=True,
         ),
     )
+    # GPT-5.1
+    for model_id in (
+        "gpt-5.1",
+        "gpt-5.1-chat-latest",
+        "gpt-5.1-codex",
+        "gpt-5.1-codex-mini",
+    ):
+        register(
+            Chat(
+                model_id,
+                vision=True,
+                reasoning=True,
+                supports_schema=True,
+                supports_tools=True,
+            ),
+            AsyncChat(
+                model_id,
+                vision=True,
+                reasoning=True,
+                supports_schema=True,
+                supports_tools=True,
+            ),
+        )
+
     # The -instruct completion model
     register(
         Completion("gpt-3.5-turbo-instruct", default_max_tokens=256),
```
Tested each of the new models against the pelican-svg template:

```bash
uv run llm -m gpt-5.1 -t pelican-svg
uv run llm -m gpt-5.1-chat-latest -t pelican-svg
uv run llm -m gpt-5.1-codex -t pelican-svg
uv run llm -m gpt-5.1-codex-mini -t pelican-svg
```
The first two worked: https://gist.github.com/simonw/fce773879bcd66e01df59a5011899532
It looks like the codex models require the Responses API.
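If that's right, a quick way to double-check outside of LLM is to hit the Responses endpoint directly with the official openai Python SDK (a minimal sketch; the inline prompt is just standing in for the pelican-svg template):

```python
# Minimal sketch: confirm gpt-5.1-codex responds via the Responses API.
# Uses the official openai SDK and reads OPENAI_API_KEY from the environment.
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-5.1-codex",
    input="Generate an SVG of a pelican riding a bicycle",
)
print(response.output_text)
```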
Here's the GPT-5.1 pelican:
And for gpt-5.1-chat-latest:
Tried this instead:
```bash
uv run llm -m gpt-5.1 -t pelican-svg -o reasoning_effort high
```
https://gist.github.com/simonw/fd120c35d8d34512c6f15d8d8e5192ec
Just spotted this in https://openai.com/index/gpt-5-1-for-developers/:

> Developers can now use GPT‑5.1 without reasoning by setting reasoning_effort to 'none'.
Need to add that to this enum:
https://github.com/simonw/llm/blob/0526abeeea021b23552bcdafc15c278f2bb2e119/llm/default_plugins/openai_models.py#L456-L460
... or maybe we don't need to change that, since the default for 5.1 is none if you don't specify one.
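If we do add it, the change is presumably just one more allowed value on that option. A rough sketch of the shape, assuming the field is a pydantic `Literal` along the lines of the permalink above (the class name and existing values here are stand-ins, not the exact current code):

```python
# Sketch only: the real definition lives in llm/default_plugins/openai_models.py
# near the permalink above; class name and existing values are assumptions.
from typing import Literal, Optional

from pydantic import BaseModel, Field


class OptionsForReasoning(BaseModel):
    reasoning_effort: Optional[
        Literal["none", "minimal", "low", "medium", "high"]  # "none" is the new GPT-5.1 value
    ] = Field(
        description="Constrains effort on reasoning for reasoning models",
        default=None,
    )
```

That would let `-o reasoning_effort none` pass straight through on the CLI.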
Need to add support for extended prompt cache retention: https://platform.openai.com/docs/guides/prompt-caching#extended-prompt-cache-retention
> To use extended caching with GPT‑5.1, add the parameter "prompt_cache_retention='24h'" on the Responses or Chat Completions API. See the prompt caching docs for more detail.
What's a good next step here? The 5.1 chat (non-codex) models seem to be in good shape. Maybe pushing a release with those would unblock users while work continues on the codex integration? Thanks for all your work on this library.