[Components] Add Ollama Component
TLDR: Add Ollama Component to Aspire similar to the OpenAI component.
Configuration
You can run Ollama using containers. The general process is as follows:
- Pull the Ollama image
- Run the container
  - GPU: docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
  - CPU: docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
- Download the model: docker exec -it ollama ollama run llama2
For more details, see the blog announcement.
In Aspire, this roughly translates into the following code inside the AppHost project:
var ollama = builder.AddContainer("ollama", "ollama/ollama")
    .WithVolumeMount("ollama", "/root/.ollama", VolumeMountType.Bind)
    .WithVolumeMount("./ollamaconfig", "/usr/config", VolumeMountType.Bind)
    .WithHttpEndpoint(hostPort: 11434, name: "ollama")
    .WithEntrypoint("/usr/config/entrypoint.sh");
The custom entrypoint accounts for the model download step. In this case, the entrypoint consists of a couple of scripts.
entrypoint.sh
#!/bin/bash
# Pull the model in the background once the server is up
/usr/config/pullmodel.sh &
# Start the Ollama server
/bin/ollama serve
pullmodel.sh
#!/bin/bash
errcode=1
start_time=$SECONDS
end_by=$((start_time + 60))
echo "Starting check for ollama start-up at $start_time, will end at $end_by"
while [[ $SECONDS -lt $end_by && $errcode -ne 0 ]]; do
  ollama list
  errcode=$?
  sleep 1
done
elapsed_time=$((SECONDS - start_time))
echo "Stopped checking for ollama start-up after $elapsed_time seconds (errcode=$errcode, seconds=$SECONDS)"
# Pull the model
ollama pull phi3
Proposal
Simplify configuration of Ollama with a custom resource type / component.
After installing the Aspire.Hosting.Ollama package, a user would configure an Ollama resource as follows:
var ollama = builder.AddOllama();
To configure models, use the following:
var ollama = builder
    .AddOllama()
    .WithModel("phi3")
    .WithModel("llama3");
Consumption
Inside the application that references the Ollama resource, consumption looks similar to the following in the context of Semantic Kernel:
builder.Services.AddOllamaChatCompletion(
    modelId: Environment.GetEnvironmentVariable("OLLAMA_MODELID"),
    baseUri: new Uri(Environment.GetEnvironmentVariable("OLLAMA_ENDPOINT")));
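Those environment variables have to be supplied from the AppHost. Here is a hedged sketch, assuming the AddOllama/WithModel proposal above and the stock WithEnvironment/GetEndpoint hosting extensions (the project name and variable names are illustrative):
var ollama = builder.AddOllama().WithModel("phi3");

builder.AddProject<Projects.MyApp>("myapp")
    .WithEnvironment("OLLAMA_MODELID", "phi3")
    .WithEnvironment("OLLAMA_ENDPOINT", ollama.GetEndpoint("ollama"));
A connection-string based resource could hide this wiring behind WithReference instead.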
Deployment
When running locally, you can either use the Ollama background service or run the container.
However, when deploying to production, users have to manually configure the infrastructure and environments for their Ollama container.
Proposal
Similar to the OpenAI component, enable deployment to Azure using:
var ollama = builder.ExecutionContext.IsPublishMode
    ? builder.AddOllama().WithModel("phi3")
    : builder.AddConnectionString("your-existing-resource");
When deploying with tools like azd, this could take advantage of the existing patterns for deploying to ACA, for example by provisioning a preconfigured service.
In this deployment, you could also give users the option to use hardware acceleration for inference via GPU instances.
Also include the UI as a WithOllamaUI extension, like we do with RedisCommander and friends.
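A hedged sketch of what that could look like, modeled on the Redis Commander pattern; the extension name and the Open WebUI image/port are assumptions, and wiring the UI container to the Ollama endpoint is elided:
public static IResourceBuilder<OllamaResource> WithOllamaUI(
    this IResourceBuilder<OllamaResource> builder)
{
    // Run Open WebUI as a companion container next to the Ollama resource.
    builder.ApplicationBuilder
        .AddContainer($"{builder.Resource.Name}-ui", "ghcr.io/open-webui/open-webui")
        .WithHttpEndpoint(hostPort: 3000, name: "http");

    return builder;
}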
@SteveSandersonMS hacked together a system that downloads the models by hitting an http endpoint on the resource. That’s a bit cleaner and easier to maintain than the entry point script.
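As a rough sketch of that approach (not Steve's actual code), the component could call Ollama's documented POST /api/pull endpoint once the container is reachable; the base address and model name below are illustrative:
using System;
using System.Net.Http;
using System.Net.Http.Json;

// Pull a model over Ollama's HTTP API instead of relying on a custom entrypoint script.
using var http = new HttpClient { BaseAddress = new Uri("http://localhost:11434") };

var response = await http.PostAsJsonAsync("/api/pull", new { name = "phi3" });
response.EnsureSuccessStatusCode();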
For Azure deployments, is it possible to use Bicep to deploy these kinds of models? That'd be a more optimal path for hosting in Azure than deploying the ollama container image
A crossover you might want to consider is integrating it with Semantic Kernel as a connector as part of the demo for this. It could provide good exposure in both directions. Example 1 or Example 2
I also had a need for such an Ollama Aspire component for a recent project. I ended up implementing and releasing it as a standalone package unrelated to my project in case others wanted to make use of it, as I saw there wasn't an existing one.
It has support for downloading a model and displaying the download progress in the State column of the orchestrator table. It does not, however, enable GPU acceleration (yet) or some of the other things mentioned in this thread. It has default configuration to download llama3, and is excluded from the manifest - not sure if that's what people would expect from such a component.
Happy for that to be absorbed into this aspire repo - if that even makes sense. Or not - if there are already plans to implement a better officially supported component here. Regardless, I was thinking it would be best for there to be a consistent component of this nature that all other components that need Ollama would use. This is to avoid a local dev environment ending up with multiple bespoke Ollama containers, potentially with duplicate downloaded models, each being used by different services.
Happy for that to be absorbed into this aspire repo
@QuantumNightmare Had you started on any draft PR of a resource here? I had started one and was using pull model as well...
var ollama = builder.AddOllama("ollama")
.AddModel("llama2")
.AddModel("phi3")
.WithDataVolume("ollama")
.WithOpenWebUI();
builder.AddProject<Projects.Ollama_ApiService>("apiservice")
.WithReference(ollama);
@timheuer Nope, I have not started anything around adding this to the aspire repo, so feel free to proceed with what you've already started. Let me know if you have any questions about what I'd done in the GitHub repo linked from the NuGet package I mentioned above.
@davidfowl is this integration going to be community driven? Like what Raygun has done: https://github.com/MindscapeHQ/Raygun.Aspire.Hosting.Ollama
@maddymontaquila and @aaronpowell have been discussing.