Support `ollama`
Please! ;)
Many of the models ollama supports seem to also be supported by Groq, see https://ai.pydantic.dev/api/models/groq/.
Can you explain why you need ollama as well as groq?
Ollama means you don't have to rely on vendors and can keep things local, so it supports use-cases that require more security and locality.
Agree with @abtawfik + ollama is ideal for development. Keeping the costs low, less mocking, cheap testing.
I think this is needed for privacy/data security purposes, not model availability. Lots of corporate environments completely block access to external AI/LLM services due to infosec concerns.
Yes, this is unusable for us without support for totally private & local usage via ollama.
My company cannot use any network call to ai for analysis of code written in house because of security.
👍 @samuelcolvin, as others have stated, Ollama helps you run LLMs locally/privately. Please add support for it.
Definite demand for this. Here is the 2nd highest ranked post in r/LocalLLaMA in the past 30 days; it's exactly what this project is doing ATM:
https://www.reddit.com/r/LocalLLaMA/s/FHbdjdo7J3
Ollama supports the OpenAI SDK fwiw
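For anyone unfamiliar with what that means in practice, here is a minimal sketch using the plain `openai` package, assuming Ollama is running locally on its default port and that the model name is one you have already pulled (both are just examples):

```python
from openai import OpenAI

# Ollama exposes an OpenAI-compatible endpoint under /v1; the API key is
# ignored by Ollama, but the client insists on a non-empty value.
client = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')

response = client.chat.completions.create(
    model='llama3.2',  # any model you've pulled with `ollama pull`
    messages=[{'role': 'user', 'content': 'Say hello in one word.'}],
)
print(response.choices[0].message.content)
```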
Okay, happy to support it, especially if we can reuse some of the openai integration.
PR welcome
It is possible to use ollama right now. Run ollama, point the AsyncOpenAI client at the Ollama URL "http://localhost:11434/v1" (if local), set the model name, and that's all folks:
```python
from pydantic_ai import Agent
from pydantic import BaseModel
from openai import AsyncOpenAI
from pydantic_ai.models.openai import OpenAIModel


class CityLocation(BaseModel):
    city: str
    country: str


client = AsyncOpenAI(
    base_url='http://localhost:11434/v1',
    api_key='your-api-key',
)
model = OpenAIModel('qwen2.5-coder:latest', openai_client=client)
agent = Agent(model, result_type=CityLocation)

result = agent.run_sync('Where the olympics held in 2012?')
print(result.data)
#> city='London' country='United Kingdom'
print(result.cost())
#> Cost(request_tokens=56, response_tokens=8, total_tokens=64, details=None)
```
The formatted response works well with qwen2.5-coder models, even the 7b one. Other models like mistral-nemo fail.
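If a model keeps failing to produce the structured result, one knob worth trying is the agent's `retries` setting, so validation failures get sent back to the model for another attempt. A minimal, hedged sketch; the retry count and model name are arbitrary choices of mine, and `UnexpectedModelBehavior` is my reading of the exception raised once retries are exhausted:

```python
from openai import AsyncOpenAI
from pydantic import BaseModel

from pydantic_ai import Agent
from pydantic_ai.exceptions import UnexpectedModelBehavior
from pydantic_ai.models.openai import OpenAIModel


class CityLocation(BaseModel):
    city: str
    country: str


client = AsyncOpenAI(base_url='http://localhost:11434/v1', api_key='ollama')
model = OpenAIModel('mistral-nemo:latest', openai_client=client)

# `retries` controls how many times a validation failure is sent back to the
# model before giving up; 3 is an arbitrary choice.
agent = Agent(model, result_type=CityLocation, retries=3)

try:
    result = agent.run_sync('Where were the olympics held in 2012?')
    print(result.data)
except UnexpectedModelBehavior as exc:
    # Raised once retries are exhausted without a valid structured result.
    print(f'Model never produced a valid result: {exc}')
```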
I want this!
I'd be happy to give this a crack this weekend unless someone else is already cracking on?
I also agree - Ollama is important to support work with internal data that cannot be sent to external parties.
@benomahony go for it.
I think we need to:
- add an `OllamaModel` which mostly just reuses the existing OpenAI model (rough sketch below)
- add docs on how to use it
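For illustration only, a rough sketch of what such an `OllamaModel` could look like, assuming it just builds an `AsyncOpenAI` client pointed at Ollama and delegates everything else to the existing `OpenAIModel`; the names, defaults and constructor shape here are mine, not a final API:

```python
from openai import AsyncOpenAI

from pydantic_ai.models.openai import OpenAIModel


class OllamaModel(OpenAIModel):
    """Hypothetical thin wrapper: Ollama speaks the OpenAI API, so reuse OpenAIModel."""

    def __init__(
        self,
        model_name: str,
        *,
        base_url: str = 'http://localhost:11434/v1',
        api_key: str = 'ollama',  # Ollama ignores the key, but the client wants one
    ):
        client = AsyncOpenAI(base_url=base_url, api_key=api_key)
        super().__init__(model_name, openai_client=client)


# usage would then mirror any other model, e.g.
# agent = Agent(OllamaModel('qwen2.5-coder:latest'), result_type=CityLocation)
```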
A custom client solves the issue, since Ollama aims to be OpenAI-compatible: https://ollama.com/blog/openai-compatibility
I've tried running PydanticAI with Ollama, and it's awesome. Thanks @JojoJr24, I took your code and tweaked it slightly.
Really looking forward to a nice wrapper for running this without a custom AsyncOpenAI.
https://github.com/user-attachments/assets/34c83d9e-8e4c-48eb-bac1-4ef0ef978567
Code
```python
from pydantic_ai import Agent
from pydantic import BaseModel
from openai import AsyncOpenAI
from pydantic_ai.models.openai import OpenAIModel
from devtools import debug


class CityLocation(BaseModel):
    city: str
    country: str


client = AsyncOpenAI(base_url='http://localhost:11434/v1', api_key='-')
model = OpenAIModel('llama3.2:latest', openai_client=client)
agent = Agent(model, result_type=CityLocation)

result = agent.run_sync('Where the olympics held in 2012?')
debug(result.data)
debug(result.cost())
```
Hey @samuelcolvin @benomahony,
I actually just raised a PR for this here (sorry, I didn't see the message saying you were going to try it!). There are a few questions I have put in the PR description, but otherwise I think it works OK
No drama @cal859! Will take a peek tomorrow but if this is solved then I'll see if I can pick up something else! 🤣
For future folks who land here: an example using the new OllamaModel wrapper with a remote Ollama service.
```python
from pydantic import BaseModel
from pydantic_ai import Agent
from pydantic_ai.models.ollama import OllamaModel

ollama_model = OllamaModel(
    model_name='qwen2.5-coder:7b',
    base_url='http://192.168.1.74:11434/v1',
)


class CityLocation(BaseModel):
    city: str
    country: str


agent = Agent(model=ollama_model, result_type=CityLocation)
result = agent.run_sync('In what city and country were the olympics held in 2012?')
print(result.data)
#> city='London' country='United Kingdom'
print(result.cost())
#> Cost(request_tokens=56, response_tokens=8, total_tokens=64, details=None)
```
Thanks @frodopwns. Was this not obvious from the docs / examples? I can see it is maybe not that obvious from the example I added here.
@samuelcolvin Should I create a small PR adding the example @frodopwns has shared to the docs and/or updating the existing docs to make the example more clear for how to work with remote servers?
Happy to add another example.
@samuelcolvin @frodopwns PR here
@cal859 Thanks for adding the example! I think it was a little confusing for a newcomer to the project to figure out just how to set the base_url. ChatGPT knew how to do it though, so it must be made clear somewhere I didn't find. Lazy Googling on my part most likely!
There is no module named 'pydantic_ai.models.ollama':

```python
from pydantic_ai.models.ollama import OllamaModel
# ModuleNotFoundError: No module named 'pydantic_ai.models.ollama'
```
Also, `base_url` has now been removed from `OpenAIModel`, which means these docs are out of date: https://ai.pydantic.dev/models/#example-local-usage
I'm happy to fix something here with a contribution but would be good to understand how Pydantic want to take this feature/integration forward?
@samuelcolvin 🙇
@benomahony what are you talking about?
https://github.com/pydantic/pydantic-ai/blob/0a5c40d3b5014245c95ea8f043566a0bbe1b48b2/pydantic_ai_slim/pydantic_ai/models/openai.py#L80
I'm talking absolute nonsense apparently! I blame my lsp earlier 🤦
For me, the following worked:

```python
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider
from pydantic_ai import Agent, RunContext
from httpx import AsyncClient

custom_http_client = AsyncClient(timeout=30)
the_model = OpenAIModel(
    'qwen3:latest',
    provider=OpenAIProvider(base_url='http://localhost:11434/v1/', http_client=custom_http_client),
)
agent = Agent(model=the_model, system_prompt=['Reply in one sentence'])

response = agent.run_sync('What is the capital of Japan?')
print(response.data)
```
I hope that helps