docs: add Ollama Operator to README.md as a community project

Open nekomeowww opened this issue 1 year ago • 1 comment

First of all, a huge thank you for all the wonderful work, to the awesome contributors to both ollama and llama.cpp, and to the researchers whose work made them possible.

You have made it easy for us to deploy and host our own large language models.

Summary

I'm Neko Ayaka (https://github.com/nekomeowww) from China. I'm currently working as a senior full-stack developer at @DaoCloud, diving deep into cloud native tech, AI, and UI/UX design.

Ollama is easy to deploy as a single instance on machines like a MacBook or Mac Studio. I'm really pumped about the design of the ollama CLI: it keeps things simple for users, much like Docker does (it reminds me of the Docker CLI every time I use it). And the project's Modelfile implementation really streamlines the process, again reminiscent of the Dockerfile.

Additionally, I researched users' needs and interests over on the official Ollama Discord server, and found that many users are looking for a way to deploy multiple instances concurrently.

Therefore, inspired by the awesome user experience with ollama, I wanted to bring that same vibe to my own Kubernetes setup at home. That's where the idea for this project started, fueled by conversations with friends who are also into cloud native projects.

The result is an open-source project called Ollama Operator (GitHub: https://github.com/nekomeowww/ollama-operator, documentation site: https://ollama-operator.ayaka.io/pages/en/ ), which I'd like to introduce and add to Ollama's README here.

It's built around the Kubernetes operator pattern: by leveraging ollama pull and ollama serve, and by introducing a Model CRD, it makes it possible to deploy multiple ollama serve instances serving multiple models across the nodes of a cluster, like this:

apiVersion: ollama.ayaka.io/v1
kind: Model
metadata:
  name: phi
spec:
  # Scale the model to 2 replicas
  replicas: 2
  # Use the model image `phi`
  image: phi
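
Once the operator is installed in a cluster, deploying a model is the usual kubectl workflow. A minimal sketch, assuming the manifest above is saved as phi.yaml and that the CRD registers the plural resource name models (both of which follow standard Kubernetes conventions but are assumptions here):

# Create the Model resource defined above
kubectl apply -f phi.yaml

# List deployed models via the CRD's plural resource name (assumed)
kubectl get models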

I've mapped out the specs and CRDs for deploying ollama instances on Kubernetes. There are still a few kinks to iron out, but it's looking good as a proof of concept. I've put together all the docs, the architectural design, and even got it up on a neat documentation site powered by VitePress!

Besides Ollama Operator and the Model CRD that simplifies multi-instance deployment, I've also made a CLI tool called kollama (source: https://github.com/nekomeowww/ollama-operator/tree/main/cmd/kollama , documentation site: https://ollama-operator.ayaka.io/pages/en/references/cli/commands/deploy.html ) that further simplifies interaction with the Model CRD down to a single command, like this:

kollama deploy phi --expose
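
Under the hood this creates the same Model resource as the YAML above; once the service is exposed, the model can be reached through the standard Ollama HTTP API. A minimal sketch, assuming the exposed Service is named ollama-model-phi and listens on Ollama's default port 11434 (the actual name depends on the operator's naming convention):

# Forward the exposed service to localhost (service name is an assumption)
kubectl port-forward svc/ollama-model-phi 11434:11434

# In another terminal, call the standard Ollama generate API
curl http://localhost:11434/api/generate -d '{"model": "phi", "prompt": "Why is the sky blue?"}'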

I have Ollama Operator running on our team's dedicated server, on my own development kind cluster on my MacBook, on the general K8s cluster in my homelab, and on a little testing k3s cluster on my two Raspberry Pis. They have all been running smoothly for the past 10 days, and I consider the project generally available for users to try out and give feedback on, so I can keep improving it continuously.

There is still much work to do and many things worth researching and testing, but I really want to share these concepts and this simplified design for running Ollama on Kubernetes.

Proposal

Add Ollama Operator to Ollama's README.md file as one of the community-driven projects.
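
For example, the README entry could look something like this (the wording is just a suggestion):

- [Ollama Operator](https://github.com/nekomeowww/ollama-operator) (Kubernetes operator for deploying and scaling multiple Ollama model instances across a cluster)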

nekomeowww avatar Apr 21 '24 06:04 nekomeowww

Very cool project. 🚀

secondtruth avatar Apr 21 '24 13:04 secondtruth