feat: Add MLflow Prompt Registry provider
Summary
This PR adds a new remote MLflow provider for the Prompts API, enabling centralized prompt management and versioning using MLflow's Prompt Registry (MLflow 3.4+).
What's New
Remote Provider: remote::mlflow
A production-ready provider that integrates Llama Stack's Prompts API with MLflow's centralized prompt registry, supporting:
- Version Control: Immutable prompt versioning with full history
- Default Version Management: Easy version switching via aliases
- Auto Variable Extraction: Automatic detection of `{{ variable }}` placeholders
- Centralized Storage: Team collaboration via a shared MLflow server
- Metadata Preservation: Llama Stack metadata stored as MLflow tags
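The PR does not show the extraction code itself, but the auto variable extraction feature can be illustrated with a minimal sketch. The function name and regex below are illustrative only, not the provider's actual implementation:

```python
import re

# Matches {{ variable }} placeholders, with optional inner whitespace.
VARIABLE_PATTERN = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def extract_variables(template: str) -> list[str]:
    """Return unique placeholder names in order of first appearance."""
    seen: dict[str, None] = {}
    for name in VARIABLE_PATTERN.findall(template):
        seen.setdefault(name)
    return list(seen)


template = "Summarize the following text in {{ num_sentences }} sentences:\n\n{{ text }}"
print(extract_variables(template))  # ['num_sentences', 'text']
```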
Quick Start
1. Configure Llama Stack
Basic configuration with SQLite (default):
```yaml
prompts:
  - provider_id: reference-prompts
    provider_type: inline::reference
    config:
      run_config:
        storage:
          stores:
            prompts:
              type: sqlite
              db_path: ./prompts.db
```
With PostgreSQL:
```yaml
prompts:
  - provider_id: postgres-prompts
    provider_type: inline::reference
    config:
      run_config:
        storage:
          stores:
            prompts:
              type: postgres
              url: postgresql://user:pass@localhost/llama_stack
```
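The snippets above configure the inline reference provider; the new `remote::mlflow` provider would be configured in the same `prompts` section. The field names below are assumptions for illustration (see remote_mlflow.mdx in this PR for the actual schema):

```yaml
prompts:
  - provider_id: mlflow-prompts
    provider_type: remote::mlflow
    config:
      # Assumed field names -- consult remote_mlflow.mdx for the real schema
      tracking_uri: http://mlflow.example.com:5000
      api_token: ${env.MLFLOW_TRACKING_TOKEN}
```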
2. Use the Prompts API
```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:5000")

# Create a prompt
prompt = client.prompts.create(
    prompt="Summarize the following text in {{ num_sentences }} sentences:\n\n{{ text }}",
    variables=["num_sentences", "text"],
)
print(f"Created prompt: {prompt.prompt_id} (v{prompt.version})")

# Retrieve the prompt
retrieved = client.prompts.get(prompt_id=prompt.prompt_id)
print(f"Retrieved: {retrieved.prompt}")

# Update the prompt (creates version 2)
updated = client.prompts.update(
    prompt_id=prompt.prompt_id,
    prompt="Summarize in exactly {{ num_sentences }} sentences:\n\n{{ text }}",
    version=1,
    set_as_default=True,
)
print(f"Updated to version: {updated.version}")

# List all prompts
prompts = client.prompts.list()
print(f"Found {len(prompts.data)} prompts")

# Delete the prompt
client.prompts.delete(prompt_id=prompt.prompt_id)
```
PR is now ready for review and includes the following updates:
- Moved the previous prompts.py to an inline provider (see inline_reference.mdx for details)
- Defined a remote MLflow provider with authentication support (see remote_mlflow.mdx for details)
- Removed all dependencies on prompt caching
@mattf
why do we need to maintain a mapping from prompt id to mlflow prompt name?
The idea is to distinguish Llama Stack-managed prompts from other prompts that might exist in the same MLflow registry.
when would a deployer want to set use_metadata_tags=False? can we always use metadata and skip the id/name translations?
We can remove that option because, in practice, setting it to false has significant downsides. Let me remove the option.
thanks. do we still need to have the id mapping?
@franciscojavierarceo @mattf I have addressed all the comments. Is there anything else you think should be addressed before approval?
This pull request has merge conflicts that must be resolved before it can be merged. @williamcaban please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork