Hardening Stack registry & admin APIs for multi-tenant SaaS
Issue #3809 highlighted the need for runtime provider management and surfaced the two deployment modes we must support:
- GitOps/PaaS: each tenant runs its own stack instance; mutations should be optional/disabled.
- Shared SaaS: multiple tenants share a control plane; mutations must be fully auditable, lock-safe, and isolated.
Detailed breakdown
Registry redesign
- Move the registry to a relational design so that access attributes live in their own column and stricter rules can be enforced on them. Today the KV design mixes everything into a single value blob, which is dangerous.
- Write a migration script so that existing KV-backed registries can be moved to the relational (sqlstore) design.
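A minimal sketch of what the relational layout and the KV migration could look like, using the standard-library `sqlite3` module. The table name, column names, the `"<resource_type>:<identifier>"` key layout, and the `owner` / `access_attributes` fields inside the blob are all assumptions made for illustration; they are not the actual sqlstore schema.

```python
import json
import sqlite3

# Hypothetical schema: names are illustrative, not the real llama-stack layout.
SCHEMA = """
CREATE TABLE IF NOT EXISTS registry (
    resource_type     TEXT NOT NULL,
    identifier        TEXT NOT NULL,
    owner             TEXT,
    access_attributes TEXT NOT NULL DEFAULT '{}',  -- JSON, kept apart from the payload
    payload           TEXT NOT NULL,               -- JSON, the resource definition itself
    PRIMARY KEY (resource_type, identifier)
)
"""


def migrate_kv_to_sql(kv: dict[str, str], conn: sqlite3.Connection) -> int:
    """Copy KV entries into the relational table; returns the number of rows written."""
    conn.execute(SCHEMA)
    migrated = 0
    for key, blob in kv.items():
        # Assumed key layout: "<resource_type>:<identifier>"
        resource_type, _, identifier = key.partition(":")
        value = json.loads(blob)
        # Pull access metadata out of the blob into its own columns.
        owner = value.pop("owner", None)
        access = value.pop("access_attributes", None) or {}
        conn.execute(
            "INSERT OR REPLACE INTO registry VALUES (?, ?, ?, ?, ?)",
            (resource_type, identifier, owner, json.dumps(access), json.dumps(value)),
        )
        migrated += 1
    conn.commit()
    return migrated
```

With access attributes in their own column, an access check can filter rows without parsing the resource payload, which is the main point of the redesign.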
/admin APIs
- Add roles and policies that answer questions such as: Is there an "Administrator" role? Who can create new tenants? Who can change providers? Allow "no one" as a policy, which effectively disables the /admin APIs. These policies are specified in run.yaml (or in another file referenced from there).
- Make the current { register, unregister } operations work via /admin.
- Allow creating new provider instances via /admin/providers/namespace, enabling use cases like "openai with api_key=foo" or "vllm pointing to host X".
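One possible shape for the role policy, including the "no one" case, is sketched below. `AdminPolicy`, the action names, and the role names are hypothetical; how policies are actually loaded from run.yaml is not shown and is left open here.

```python
from dataclasses import dataclass, field


@dataclass
class AdminPolicy:
    # Maps an admin action to the roles allowed to perform it.
    # An empty set means "no one", which disables that action.
    allowed_roles: dict[str, set[str]] = field(default_factory=dict)

    def is_allowed(self, action: str, roles: set[str]) -> bool:
        # Actions missing from the policy are denied by default.
        permitted = self.allowed_roles.get(action, set())
        return bool(permitted & roles)


# Illustrative policy: action and role names are assumptions.
policy = AdminPolicy(
    allowed_roles={
        "providers:create": {"administrator"},
        "tenants:create": set(),  # "no one": tenant creation is disabled
    }
)
```

Denying unknown actions by default keeps a GitOps-style deployment safe: an empty policy turns off every mutation without extra configuration.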
Clarify SQLite database backend limitations re: multi-workers and multi-replicas
- Technically not necessary to be part of the grand SaaS plan
- Make sure our documentation clearly states that SQLite is for prototyping and that Postgres should be used for scaling.
- llama stack run should fail if you specify multiple workers with sqlite
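The startup check could be as small as the sketch below. The function name and the way the backend type and worker count reach it are assumptions; the real CLI wiring is not specified in this issue.

```python
def check_sqlite_workers(backend_type: str, workers: int) -> None:
    """Raise if a SQLite-backed store is combined with more than one worker.

    `backend_type` is a hypothetical string identifying the configured store.
    """
    if "sqlite" in backend_type.lower() and workers > 1:
        raise ValueError(
            f"{backend_type} does not support {workers} workers; "
            "use a single worker or switch to Postgres"
        )
```

Failing at startup, rather than logging a warning, makes the limitation visible before any data is written.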
Create a notion of a tenant. Design that in terms of generic ABAC.
- CRUD for ABAC attributes: creation of { organization, namespaces, projects, etc. }
- Who can create tenants? Roles are assigned by the IdP; the stack enforces them based on the claims in the JWT.
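As a rough illustration of claim-based enforcement, the sketch below matches a resource's attributes against already-verified JWT claims. Token verification is out of scope here, and the claim names (`namespaces`, `projects`) and the matching rule are assumptions, not a settled design.

```python
def is_permitted(claims: dict, resource_attrs: dict[str, str]) -> bool:
    """Return True when the caller holds every attribute the resource requires.

    `claims` is the decoded, signature-verified JWT payload.
    A resource with no attributes is treated as open to any caller.
    """
    for name, required in resource_attrs.items():
        held = claims.get(name, [])
        if isinstance(held, str):
            held = [held]  # tolerate single-valued claims
        if required not in held:
            return False
    return True


# Illustrative claims as an IdP might issue them (names are hypothetical).
claims = {"roles": ["administrator"], "namespaces": ["acme"], "projects": "chatbot"}
```

Because the check is attribute-by-attribute, adding a new tenant dimension (organization, namespace, project) needs no new code path, only a new claim and a new resource attribute.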
Generalize Quota management
- Record resource usage as { resource_type, increment, time, cost_center } after each request.
- Querying usage roll-ups for quota checks can be punted to another system (it may need a specialized database), but simple implementations, e.g. against Redis, can be provided in the Stack.
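The usage record and a simple roll-up might look like the following sketch. The field names come from the bullet above; the `InMemoryUsageStore` class is a hypothetical stand-in for a real backend such as Redis, and the example values are made up.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class UsageRecord:
    resource_type: str  # e.g. "tokens" (illustrative value)
    increment: int
    time: datetime
    cost_center: str


class InMemoryUsageStore:
    """Stand-in for a real backend; keeps records in a list."""

    def __init__(self) -> None:
        self._records: list[UsageRecord] = []

    def record(self, rec: UsageRecord) -> None:
        self._records.append(rec)

    def rollup(self, cost_center: str, since: datetime) -> dict[str, int]:
        """Sum increments per resource_type for one cost center since `since`."""
        totals: dict[str, int] = defaultdict(int)
        for rec in self._records:
            if rec.cost_center == cost_center and rec.time >= since:
                totals[rec.resource_type] += rec.increment
        return dict(totals)


store = InMemoryUsageStore()
start = datetime(2025, 1, 1, tzinfo=timezone.utc)
store.record(UsageRecord("tokens", 100, start, "team-a"))
store.record(UsageRecord("tokens", 50, start, "team-a"))
store.record(UsageRecord("tokens", 7, start, "team-b"))
```

A quota check would then compare `store.rollup("team-a", start)` against the configured limit for that cost center.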
Make Audit logs a first-class thing
- Define schema for the table
- Define which requests should be audited
- Add /admin APIs for pulling them
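To make the three items above concrete, here is one possible audit table with a write path and a read path, again using `sqlite3`. The column set, the `should_audit` rule (all /admin requests plus any mutating HTTP method), and the function names are assumptions for discussion, not a proposed final schema.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema for discussion.
AUDIT_SCHEMA = """
CREATE TABLE IF NOT EXISTS audit_log (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    occurred_at TEXT NOT NULL,   -- ISO 8601, UTC
    principal   TEXT NOT NULL,
    tenant      TEXT NOT NULL,
    method      TEXT NOT NULL,
    path        TEXT NOT NULL,
    status      INTEGER NOT NULL
)
"""

MUTATING_METHODS = {"POST", "PUT", "PATCH", "DELETE"}


def should_audit(method: str, path: str) -> bool:
    # Assumed rule: audit every /admin request and every mutating request.
    return path.startswith("/admin") or method.upper() in MUTATING_METHODS


def write_audit(conn: sqlite3.Connection, principal: str, tenant: str,
                method: str, path: str, status: int) -> None:
    conn.execute(
        "INSERT INTO audit_log (occurred_at, principal, tenant, method, path, status) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), principal, tenant, method, path, status),
    )
    conn.commit()


def fetch_audit(conn: sqlite3.Connection, tenant: str, limit: int = 100) -> list[tuple]:
    """Newest-first audit entries for one tenant, as an /admin endpoint might return them."""
    return conn.execute(
        "SELECT principal, method, path, status FROM audit_log "
        "WHERE tenant = ? ORDER BY id DESC LIMIT ?",
        (tenant, limit),
    ).fetchall()
```

Scoping `fetch_audit` by tenant keeps one tenant's administrators from reading another tenant's entries in the shared SaaS mode.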
Release plan
- 0.4.0 – includes: land the new SQL-backed registry + schema, /admin namespace, design, policies and CRUD for ABAC attributes, migration tooling.
- 0.5.0 – flip the registry to SQL by default, /admin/providers, Quota management and Audit logs.
- Post-0.5.0 – deprecate the KV backend entirely and consider turning the new guardrails on by default for SaaS profiles.
Detailed task breakdowns for each release live in the gist comments (linked above) and will be turned into sub‑issues.
Supersedes https://github.com/llamastack/llama-stack/issues/3809; closing that issue and tracking all follow-on work here.
cc @leseb @cdoern @bbrowning @rhuss @mattf @raghotham