
Support `ollama`

Open barseghyanartur opened this issue 1 year ago • 19 comments

Please! ;)

barseghyanartur avatar Dec 02 '24 12:12 barseghyanartur

Many of the models ollama supports seem to also be supported by Groq, see https://ai.pydantic.dev/api/models/groq/.

Can you explain why you need ollama as well as groq?

samuelcolvin avatar Dec 02 '24 19:12 samuelcolvin

Ollama means you don't have to rely on vendors and can keep things local, so it supports use cases that require more security and locality.

abtawfik avatar Dec 02 '24 19:12 abtawfik

Agree with @abtawfik + ollama is ideal for development: it keeps costs low, means less mocking, and makes testing cheap.

barseghyanartur avatar Dec 02 '24 22:12 barseghyanartur

I think this is needed for privacy/data security purposes, not model availability. Lots of corporate environments completely block access to external AI/LLM services due to infosec concerns.

gusutabopb avatar Dec 03 '24 10:12 gusutabopb

Yes, this is unusable for us without support for totally private & local usage via ollama.

My company cannot make any network calls to AI services for analysis of code written in-house, because of security.

Lambda-Logan avatar Dec 03 '24 16:12 Lambda-Logan

👍 @samuelcolvin, as others have stated, Ollama helps you run LLMs locally/privately. Please add support for it.

curiousily avatar Dec 03 '24 16:12 curiousily

There's definite demand for this; here is the 2nd highest-ranked post in r/LocalLLaMA in the past 30 days.

It's exactly what this project is doing ATM

https://www.reddit.com/r/LocalLLaMA/s/FHbdjdo7J3

Lambda-Logan avatar Dec 03 '24 16:12 Lambda-Logan

Ollama supports the OpenAI SDK fwiw
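
For example, pointing the stock OpenAI client at Ollama's OpenAI-compatible endpoint (a minimal sketch, assuming a local Ollama server on the default port; Ollama ignores the api_key but the SDK requires one):

from openai import OpenAI

# The standard OpenAI client, aimed at Ollama's local endpoint.
client = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')

response = client.chat.completions.create(
    model='llama3.2:latest',  # any model already pulled into Ollama
    messages=[{'role': 'user', 'content': 'Where were the olympics held in 2012?'}],
)
print(response.choices[0].message.content)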

arcaputo3 avatar Dec 03 '24 22:12 arcaputo3

Okay, happy to support it, especially if we can reuse some of the openai integration.

PR welcome

samuelcolvin avatar Dec 04 '24 08:12 samuelcolvin

It is possible to use ollama right now.

Run ollama, point an AsyncOpenAI client at the ollama URL "http://localhost:11434/v1" (if local), set the model name, and that's all folks:

from openai import AsyncOpenAI
from pydantic import BaseModel

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel


class CityLocation(BaseModel):
    city: str
    country: str


client = AsyncOpenAI(
    base_url='http://localhost:11434/v1',
    api_key='your-api-key',  # ollama ignores the key, but the client requires one
)

model = OpenAIModel('qwen2.5-coder:latest', openai_client=client)
agent = Agent(model, result_type=CityLocation)

result = agent.run_sync('Where were the olympics held in 2012?')
print(result.data)
#> city='London' country='United Kingdom'
print(result.cost())
#> Cost(request_tokens=56, response_tokens=8, total_tokens=64, details=None)

JojoJr24 avatar Dec 04 '24 16:12 JojoJr24

The formatted response works well with the qwen2.5-coder models, even the 7b one. Other models like mistral-nemo fail.
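
When a model keeps producing invalid structured output, one knob worth trying is the agent's retry budget (a sketch, assuming Agent's retries parameter, which re-prompts the model when validation fails):

from openai import AsyncOpenAI
from pydantic import BaseModel

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel


class CityLocation(BaseModel):
    city: str
    country: str


client = AsyncOpenAI(base_url='http://localhost:11434/v1', api_key='ollama')
model = OpenAIModel('mistral-nemo:latest', openai_client=client)

# retries gives the model extra attempts when its output fails Pydantic validation.
agent = Agent(model, result_type=CityLocation, retries=3)

result = agent.run_sync('Where were the olympics held in 2012?')
print(result.data)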

JojoJr24 avatar Dec 04 '24 17:12 JojoJr24

I want this!

HuronExplodium avatar Dec 05 '24 07:12 HuronExplodium

I'd be happy to give this a crack this weekend unless someone else is already cracking on?

benomahony avatar Dec 05 '24 12:12 benomahony

I also agree - Ollama is important to support work with internal data that cannot be sent to external parties.

EricBLivingston avatar Dec 06 '24 13:12 EricBLivingston

I'd be happy to give this a crack this weekend unless someone else is already cracking on?

@benomahony go for it.

I think we need to:

  • add an OllamaModel which mostly reuses the existing openai model (roughly as sketched below)
  • add docs on how to use this
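
A minimal sketch of what that wrapper could look like (illustrative only, not the final implementation; it just points the existing OpenAI model at Ollama's OpenAI-compatible endpoint):

from openai import AsyncOpenAI

from pydantic_ai.models.openai import OpenAIModel


class OllamaModel(OpenAIModel):
    """Hypothetical thin wrapper: OpenAIModel pointed at a local Ollama server."""

    def __init__(self, model_name: str, base_url: str = 'http://localhost:11434/v1'):
        # Ollama ignores the API key, but the OpenAI client requires one.
        client = AsyncOpenAI(base_url=base_url, api_key='ollama')
        super().__init__(model_name, openai_client=client)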

samuelcolvin avatar Dec 06 '24 16:12 samuelcolvin

Custom client solves the issue since Ollama aims to be OpenAI compat: https://ollama.com/blog/openai-compatibility

pySilver avatar Dec 06 '24 20:12 pySilver

I've tried running PydanticAI with Ollama, and it's awesome. Thanks @JojoJr24, I took your code and tweaked it slightly.

Really looking forward to a nice wrapper for running this without a custom AsyncOpenAI.

https://github.com/user-attachments/assets/34c83d9e-8e4c-48eb-bac1-4ef0ef978567

Code
from pydantic_ai import Agent
from pydantic import BaseModel
from openai import AsyncOpenAI
from pydantic_ai.models.openai import OpenAIModel

from devtools import debug


class CityLocation(BaseModel):
    city: str
    country: str


client = AsyncOpenAI(base_url='http://localhost:11434/v1', api_key='-')
model = OpenAIModel('llama3.2:latest', openai_client=client)
agent = Agent(model, result_type=CityLocation)

result = agent.run_sync('Where were the olympics held in 2012?')
debug(result.data)
debug(result.cost())

samuelcolvin avatar Dec 06 '24 21:12 samuelcolvin

Hey @samuelcolvin @benomahony ,

I actually just raised a PR for this here (sorry, I didn't see the message saying you were going to try it!). There are a few questions I have put in the PR description, but otherwise I think it works OK

cal859 avatar Dec 06 '24 22:12 cal859

No drama @cal859! Will take a peek tomorrow but if this is solved then I'll see if I can pick up something else! 🤣

benomahony avatar Dec 06 '24 22:12 benomahony

For future folks who land here: an example using the new OllamaModel wrapper with a remote ollama service.

from pydantic import BaseModel
from pydantic_ai import Agent
from pydantic_ai.models.ollama import OllamaModel

ollama_model = OllamaModel(
    model_name='qwen2.5-coder:7b',
    base_url='http://192.168.1.74:11434/v1'
)

class CityLocation(BaseModel):
    city: str
    country: str


agent = Agent(model=ollama_model, result_type=CityLocation)

result = agent.run_sync('In what city and country were the olympics held in 2012?')
print(result.data)
#> city='London' country='United Kingdom'
print(result.cost())
#> Cost(request_tokens=56, response_tokens=8, total_tokens=64, details=None)

frodopwns avatar Dec 10 '24 22:12 frodopwns

Thanks @frodopwns. Was this not obvious from the docs / examples? I can see it is maybe not that obvious from the example I added here.

@samuelcolvin Should I create a small PR adding the example @frodopwns has shared to the docs and/or updating the existing docs to make the example more clear for how to work with remote servers?

cal859 avatar Dec 10 '24 22:12 cal859

happy to add another example.

samuelcolvin avatar Dec 10 '24 22:12 samuelcolvin

@samuelcolvin @frodopwns PR here

cal859 avatar Dec 10 '24 22:12 cal859

@cal859 Thanks for adding the example! I think it was a little confusing for a newcomer to the project to figure out just how to set the base_url. ChatGPT knew how to do it though, so it must be made clear somewhere I didn't find. Lazy Googling on my part, most likely!

frodopwns avatar Dec 11 '24 03:12 frodopwns

For future folks who land here: an example using the new OllamaModel wrapper with a remote ollama service.


There is no module named 'pydantic_ai.models.ollama':

from pydantic_ai.models.ollama import OllamaModel
ModuleNotFoundError: No module named 'pydantic_ai.models.ollama'

Badhansen avatar Feb 05 '25 17:02 Badhansen

Also the base_url has now been removed from OpenAIModel which means these docs are out of date: https://ai.pydantic.dev/models/#example-local-usage

I'm happy to fix something here with a contribution but would be good to understand how Pydantic want to take this feature/integration forward?

@samuelcolvin 🙇

benomahony avatar Feb 05 '25 18:02 benomahony

@benomahony what are you talking about?

https://github.com/pydantic/pydantic-ai/blob/0a5c40d3b5014245c95ea8f043566a0bbe1b48b2/pydantic_ai_slim/pydantic_ai/models/openai.py#L80

samuelcolvin avatar Feb 05 '25 21:02 samuelcolvin

I'm talking absolute nonsense apparently! I blame my lsp earlier 🤦

benomahony avatar Feb 05 '25 22:02 benomahony

For me, the following worked:

from httpx import AsyncClient

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider

custom_http_client = AsyncClient(timeout=30)
the_model = OpenAIModel(
    'qwen3:latest',
    provider=OpenAIProvider(
        base_url='http://localhost:11434/v1/',
        http_client=custom_http_client,
    ),
)
agent = Agent(model=the_model, system_prompt=['Reply in one sentence'])
response = agent.run_sync('What is the capital of Japan?')
print(response.data)

I hope that helps

dpappas avatar Jul 24 '25 16:07 dpappas