Make it possible to disable the openai plugin
Based on my quick and dirty experiments on a 2021 MacBook Pro, the openai plugin's imports add hundreds of milliseconds of startup overhead even when a different model (e.g. Gemini) is used. Setting LLM_LOAD_PLUGINS makes no difference here. If I replace DEFAULT_PLUGINS with an empty tuple I can measure the difference:
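For reference, this is roughly the change I tested, as a minimal sketch. I'm assuming DEFAULT_PLUGINS lives in llm/plugins.py and names the bundled openai module; the exact contents may differ between versions:

```python
# llm/plugins.py (sketch, not the verbatim source)

# Before: the bundled openai plugin module is always imported at
# startup. DEFAULT_PLUGINS is handled separately from the setuptools
# entry points that LLM_LOAD_PLUGINS controls, which is why setting
# that variable has no effect on this cost.
DEFAULT_PLUGINS = ("llm.default_plugins.openai_models",)

# After: an empty tuple leaves load_plugins() with nothing to import,
# which is what produced the faster timings below.
DEFAULT_PLUGINS = ()
```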
Typical before:
$ time llm -m gemini-2.0-flash "Repeat this: For initial testing, you can hard code an API key, but this should only be temporary since it is not secure. The rest of this section goes through how to set up your API key locally as an environment variable with different operating systems."
For initial testing, you can hard code an API key, but this should only be temporary since it is not secure. The rest of this section goes through how to set up your API key locally as an environment variable with different operating systems.
real 0m1.185s
user 0m0.367s
sys 0m0.093s
Typical after:
$ time uv run -m llm -m gemini-2.0-flash "Repeat this: For initial testing, you can hard code an API key, but this should only be temporary since it is not secure. The rest of this section goes through how to set up your API key locally as an environment variable with different operating systems."
For initial testing, you can hard code an API key, but this should only be temporary since it is not secure. The rest of this section goes through how to set up your API key locally as an environment variable with different operating systems.
real 0m0.926s
user 0m0.211s
sys 0m0.054s
I found this while looking for ways to close the gap between a direct curl of the API (where I get ~600-700 ms) and calling llm, which takes roughly double that. Based on profiling, the major cost comes simply from imports.
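To reproduce the profiling, Python's built-in -X importtime flag gives a per-module breakdown of startup cost. The sort/head pipeline below is just one way to surface the most expensive imports (-m llm works the same way as in the timing above; importtime writes to stderr, hence the 2>&1):

```
$ python -X importtime -m llm --version 2>&1 | sort -t'|' -k2 -rn | head
```

The entries near the top of that list should make it clear which plugin imports dominate.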
See https://github.com/simonw/llm/issues/848
Duplicate of #335.