
feat: Add MLflow Prompt Registry provider

Open · williamcaban opened this pull request 1 month ago • 5 comments

MLflow Prompt Registry Provider

Summary

This PR adds a new remote MLflow provider for the Prompts API, enabling centralized prompt management and versioning using MLflow's Prompt Registry (MLflow 3.4+).

What's New

Remote Provider: remote::mlflow

A production-ready provider that integrates Llama Stack's Prompts API with MLflow's centralized prompt registry, supporting:

  • Version Control: Immutable prompt versioning with full history
  • Default Version Management: Easy version switching via aliases
  • Auto Variable Extraction: Automatic detection of {{ variable }} placeholders (sketched after this list)
  • Centralized Storage: Team collaboration via shared MLflow server
  • Metadata Preservation: Llama Stack metadata stored as MLflow tags
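
As a rough illustration of the auto variable extraction bullet, placeholder detection can be done with a single regex pass over the template. This is a minimal sketch only; the provider's actual implementation in the PR may differ:

import re

# Matches {{ variable }} placeholders, tolerating optional whitespace
# inside the braces.
PLACEHOLDER_RE = re.compile(r"\{\{\s*(\w+)\s*\}\}")

def extract_variables(template: str) -> list[str]:
    """Return placeholder names in order of first appearance, de-duplicated."""
    seen: dict[str, None] = {}
    for name in PLACEHOLDER_RE.findall(template):
        seen.setdefault(name)
    return list(seen)

assert extract_variables(
    "Summarize the following text in {{ num_sentences }} sentences:\n\n{{ text }}"
) == ["num_sentences", "text"]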

Quick Start

1. Configure Llama Stack

Basic configuration with SQLite (default):

prompts:
  - provider_id: reference-prompts
    provider_type: inline::reference
    config:
      run_config:
        storage:
          stores:
            prompts:
              type: sqlite
              db_path: ./prompts.db

With PostgreSQL:

prompts:
  - provider_id: postgres-prompts
    provider_type: inline::reference
    config:
      run_config:
        storage:
          stores:
            prompts:
              type: postgres
              url: postgresql://user:pass@localhost/llama_stack
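
The remote provider from this PR is wired into the same prompts section. The authoritative schema lives in remote_mlflow.mdx; the following is a hedged sketch only, and the field names are assumptions rather than confirmed by the PR:

prompts:
  - provider_id: mlflow-prompts
    provider_type: remote::mlflow
    config:
      tracking_uri: http://mlflow.example.com:5001  # assumed field name; points at the shared MLflow server
      auth_token: ${env.MLFLOW_TRACKING_TOKEN}      # assumed field name; the PR mentions authentication support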

2. Use the Prompts API

from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:5000")

# Create a prompt
prompt = client.prompts.create(
    prompt="Summarize the following text in {{ num_sentences }} sentences:\n\n{{ text }}",
    variables=["num_sentences", "text"]
)
print(f"Created prompt: {prompt.prompt_id} (v{prompt.version})")

# Retrieve prompt
retrieved = client.prompts.get(prompt_id=prompt.prompt_id)
print(f"Retrieved: {retrieved.prompt}")

# Update prompt (creates version 2)
updated = client.prompts.update(
    prompt_id=prompt.prompt_id,
    prompt="Summarize in exactly {{ num_sentences }} sentences:\n\n{{ text }}",
    version=1,
    set_as_default=True
)
print(f"Updated to version: {updated.version}")

# List all prompts
prompts = client.prompts.list()
print(f"Found {len(prompts.data)} prompts")

# Delete prompt
client.prompts.delete(prompt_id=prompt.prompt_id)
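
Because the prompts land in MLflow's central registry, teammates could also read them back with MLflow's own client. A hedged sketch, assuming MLflow 3.4+'s genai prompt registry API and a made-up registry name ("summarizer"); the provider derives the real name from the Llama Stack prompt_id, so the exact URI may differ:

import mlflow

mlflow.set_tracking_uri("http://mlflow.example.com:5001")  # hypothetical server

# Load version 1 of the (assumed) prompt name straight from the registry.
prompt = mlflow.genai.load_prompt("prompts:/summarizer/1")
print(prompt.template)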

williamcaban avatar Nov 16 '25 19:11 williamcaban

PR is now ready for review and includes the following updates:

  • Moved the previous prompts.py implementation to an inline provider (see inline_reference.mdx for details)
  • Defined a remote MLflow provider with authentication support (see remote_mlflow.mdx for details)
  • Removed any dependencies on prompt caching

williamcaban avatar Nov 24 '25 02:11 williamcaban

@mattf

> why do we need to maintain a mapping from prompt id to mlflow prompt name?

The idea is to distinguish Llama Stack-managed prompts from other prompts that might exist in the same MLflow registry.
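
For illustration, that distinction could rest on a tag convention rather than a separate mapping store. A minimal sketch with a hypothetical tag key (not taken from the PR):

LLAMA_STACK_TAG = "llama_stack.prompt_id"  # hypothetical tag key

def is_llama_stack_prompt(tags: dict[str, str]) -> bool:
    """True if an MLflow registry entry is marked as Llama Stack-managed."""
    return LLAMA_STACK_TAG in tags

# Mixed registry: one provider-managed prompt, one hand-registered prompt.
entries = [
    {"name": "summarizer", "tags": {LLAMA_STACK_TAG: "pmpt_123"}},
    {"name": "team-scratch-prompt", "tags": {}},
]
managed = [e for e in entries if is_llama_stack_prompt(e["tags"])]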

williamcaban avatar Nov 24 '25 23:11 williamcaban

> why do we need to maintain a mapping from prompt id to mlflow prompt name?
>
> The idea is to distinguish Llama Stack-managed prompts from other prompts that might exist in the same MLflow registry.

when would a deployer want to set use_metadata_tags=False? can we always use metadata and skip the id/name translations?

mattf avatar Nov 25 '25 15:11 mattf

> why do we need to maintain a mapping from prompt id to mlflow prompt name?
>
> The idea is to distinguish Llama Stack-managed prompts from other prompts that might exist in the same MLflow registry.
>
> when would a deployer want to set use_metadata_tags=False? can we always use metadata and skip the id/name translations?

We can remove that option because, in practice, setting it to false has significant downsides. Let me remove the option.

williamcaban avatar Nov 26 '25 01:11 williamcaban

> why do we need to maintain a mapping from prompt id to mlflow prompt name?
>
> The idea is to distinguish Llama Stack-managed prompts from other prompts that might exist in the same MLflow registry.
>
> when would a deployer want to set use_metadata_tags=False? can we always use metadata and skip the id/name translations?
>
> We can remove that option because, in practice, setting it to false has significant downsides. Let me remove the option.

thanks. do we still need to have the id mapping?

mattf avatar Nov 26 '25 11:11 mattf

@franciscojavierarceo @mattf I have addressed all the comments. Is there anything else you think should be addressed before approval?

williamcaban avatar Dec 08 '25 16:12 williamcaban

This pull request has merge conflicts that must be resolved before it can be merged. @williamcaban please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

mergify[bot] avatar Dec 14 '25 12:12 mergify[bot]