
[Components] Add Ollama Component

luisquintanilla opened this issue 9 months ago

TLDR: Add Ollama Component to Aspire similar to the OpenAI component.

Configuration

You can run Ollama using containers. The general process is as follows:

  1. Pull Ollama image
  • GPU

    docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
    
  • CPU

    docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
    
  2. Download the model

    docker exec -it ollama ollama run llama2

For more details, see the blog announcement.

In Aspire, this roughly translates into the following code inside the AppHost project:

var ollama =
    builder.AddContainer("ollama", "ollama/ollama")
        .WithVolumeMount("ollama", "/root/.ollama", VolumeMountType.Bind)
        .WithVolumeMount("./ollamaconfig", "/usr/config", VolumeMountType.Bind)
        .WithHttpEndpoint(hostPort: 11434, name: "ollama")
        .WithEntrypoint("/usr/config/entrypoint.sh");

The custom entrypoint accounts for the second step, which downloads the model. In this case, the entrypoint consists of a couple of scripts.

entrypoint.sh

#!/bin/bash

# Kick off the model pull script in the background; it waits for the server to come up
/usr/config/pullmodel.sh &

# Start the Ollama server
/bin/ollama serve

pullmodel.sh

#!/bin/bash

errcode=1
start_time=$SECONDS
end_by=$((start_time + 60))

echo "Starting check for ollama start-up at $start_time, will end at $end_by"

while [[ $SECONDS -lt $end_by && $errcode -ne 0 ]]; do
	ollama list
	errcode=$?
	sleep 1
done

elapsed_time=$((SECONDS - start_time))

echo "Stopped checking for ollama start-up after $elapsed_time seconds (errcode=$errcode, seconds=$SECONDS)"

# Pull the model
ollama pull phi3

Proposal

Simplify configuration of Ollama with a custom resource type / component.

After installing the Aspire.Hosting.Ollama package, a user would configure an Ollama resource as follows:

var ollama = builder.AddOllama();

To configure models, use the following:

var ollama = 
    builder
        .AddOllama()
        .WithModel("phi3")
        .WithModel("llama3");

Consumption

Inside the application that references the Ollama resource, consumption looks similar to the following in the context of Semantic Kernel:

builder.Services.AddOllamaChatCompletion(
    modelId: Environment.GetEnvironmentVariable("OLLAMA_MODELID"),
    baseUri: new Uri(Environment.GetEnvironmentVariable("OLLAMA_ENDPOINT")));
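
For those environment variables to resolve, the AppHost would also need to flow the model name and endpoint into the consuming project. A minimal sketch against the container resource shown earlier (the project name and wiring are illustrative, not a proposed API):

// Hypothetical AppHost wiring: pass the Ollama endpoint and model name
// to the consuming project as environment variables.
builder.AddProject<Projects.MyApp>("myapp")
    .WithEnvironment("OLLAMA_MODELID", "phi3")
    .WithEnvironment("OLLAMA_ENDPOINT", ollama.GetEndpoint("ollama"));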

Deployment

When running locally, you can either use the Ollama background service or run the container.

However, when deploying to production, users have to manually configure the infrastructure and environments for their Ollama container.

Proposal

Similar to the OpenAI component, enable deployment to Azure using:

var ollama = builder.ExecutionContext.IsPublishMode
    ? builder.AddOllama().WithModel("phi3")
    : builder.AddConnectionString("your-existing-resource");

When deploying with tools like azd, you could take advantage of existing patterns for deploying to Azure Container Apps (ACA) and provision a preconfigured service.

In this deployment, you could also give users the option to enable hardware-accelerated inference using GPU instances.

luisquintanilla avatar May 14 '24 21:05 luisquintanilla

Also include the UI as a WithOllamaUI extension, like we do with RedisCommander and friends.
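
A rough sketch of what such an extension might wrap, assuming Open WebUI as the companion container (image, port, and names are illustrative, not a settled API):

// Hypothetical: run Open WebUI alongside Ollama and point it at the Ollama endpoint.
var ollamaUi =
    builder.AddContainer("ollama-ui", "ghcr.io/open-webui/open-webui")
        .WithHttpEndpoint(hostPort: 3000, name: "ollama-ui")
        .WithEnvironment("OLLAMA_BASE_URL", ollama.GetEndpoint("ollama"));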

glennc avatar May 15 '24 04:05 glennc

@SteveSandersonMS hacked together a system that downloads the models by hitting an http endpoint on the resource. That’s a bit cleaner and easier to maintain than the entry point script.
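
For context, a running Ollama instance exposes a REST endpoint for pulling models, so the host could post to it once the container is up. A minimal sketch (port and model name are illustrative):

using System.Net.Http.Json;

// Ask the running Ollama instance to pull a model via its HTTP API.
using var http = new HttpClient { BaseAddress = new Uri("http://localhost:11434") };
using var response = await http.PostAsJsonAsync("/api/pull", new { name = "phi3" });
response.EnsureSuccessStatusCode();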

davidfowl avatar May 15 '24 04:05 davidfowl

For Azure deployments, is it possible to use Bicep to deploy these kinds of models? That'd be a more optimal path for hosting in Azure than deploying the ollama container image.

aaronpowell avatar May 22 '24 06:05 aaronpowell

A crossover you might want to consider is integrating it with Semantic Kernel as a connector as part of the demo for this. It could provide good exposure in both directions. Example 1 or Example 2

cisionmarkwalls avatar Jun 06 '24 20:06 cisionmarkwalls

I also had a need for such an Ollama Aspire component for a recent project. I ended up implementing and releasing it as a standalone package unrelated to my project in case others wanted to make use of it, as I saw there wasn't an existing one.

It has support for downloading a model and displaying the download progress in the State column of the orchestrator table. It does not, however, enable GPU acceleration (yet) or some of the other things mentioned in this thread. It has default configuration to download llama3, and is excluded from the manifest - not sure if that's what people would expect from such a component.

Happy for that to be absorbed into this aspire repo - if that even makes sense. Or not - if there are already plans to implement a better officially supported component here. Regardless, I was thinking it would be best for there to be a consistent component of this nature that all other components needing Ollama would use. This avoids a local dev environment ending up with multiple bespoke Ollama containers, potentially with duplicate downloaded models, each being used by different services.

QuantumNightmare avatar Jun 13 '24 02:06 QuantumNightmare

Happy for that to be absorbed into this aspire repo

@QuantumNightmare Had you started on any draft PR of a resource here? I had started one and was using pull model as well...

var ollama = builder.AddOllama("ollama")
    .AddModel("llama2")
    .AddModel("phi3")
    .WithDataVolume("ollama")
    .WithOpenWebUI();

builder.AddProject<Projects.Ollama_ApiService>("apiservice")
    .WithReference(ollama);

timheuer avatar Jun 25 '24 02:06 timheuer

@timheuer Nope, I have not started anything around adding this to the aspire repo, so feel free to proceed with what you've already started. Let me know if you have any questions about what I'd done in the GitHub repo for the NuGet package linked above.

QuantumNightmare avatar Jun 25 '24 02:06 QuantumNightmare

@davidfowl is this integration going to be community driven? Like what Raygun has done https://github.com/MindscapeHQ/Raygun.Aspire.Hosting.Ollama

LadyNaggaga avatar Sep 24 '24 21:09 LadyNaggaga

@davidfowl is this integration going to be community driven? Like what Raygun has done https://github.com/MindscapeHQ/Raygun.Aspire.Hosting.Ollama

@maddymontaquila and @aaronpowell have been discussing.

timheuer avatar Sep 25 '24 01:09 timheuer