Idea: Introduce MCPGateway CRD for pluggable gateway implementations
Problem Statement
Users can already deploy alternative MCP gateway solutions (like kagenti/mcp-gateway) alongside ToolHive, and Virtual MCP is entirely optional. However, our current CRDs (VirtualMCPServer, MCPGroup) are not as abstract and reusable as they could be for enabling a true plug-and-play ecosystem.
While different gateway implementations can coexist today, they don't share common abstractions. Each gateway solution must define its own resources and discovery mechanisms, missing the opportunity for interoperability that patterns like Kubernetes Gateway API and Ingress provide.
Current State
What works today:
- ✅ Users can deploy any MCP gateway (Virtual MCP, kagenti/mcp-gateway, custom solutions)
- ✅ Virtual MCP is completely optional
- ✅ MCPGroup provides backend grouping and discovery
- ✅ Multiple gateway types can run in the same cluster
What could be better:
- ❌ No standardized gateway abstraction
- ❌ Each implementation defines its own CRD structure
- ❌ Limited interoperability between gateway implementations
- ❌ VirtualMCPServer is tightly coupled to our specific implementation
- ❌ Switching gateway implementations requires resource rewrites
Current Architecture
VirtualMCPServer (our implementation):
- References MCPGroup to discover backend workloads via ListWorkloadsInGroup()
- Handles aggregation, conflict resolution, composite tools
- Manages authentication (incoming/outgoing)
- Deploys as a Deployment with Service
- Implementation-specific configuration mixed with common gateway concerns
MCPGroup (shared resource):
- Simple grouping mechanism with label selectors
- Lists member MCPServer workloads
- Provides backend discovery interface
- Already useful across different gateway implementations
Inspiration from Kubernetes Patterns
Gateway API Pattern
- GatewayClass: Defines configuration template, handled by specific controller
- Gateway: Instance of a class, requests traffic translation
- Routes: Define how traffic maps to services
- Multiple implementations can coexist (Istio, NGINX, Envoy)
Ingress Pattern
- IngressClass: Specifies which controller handles the Ingress
- Ingress: Defines traffic rules
- Controllers claim responsibility via ingressClassName
- Multiple controllers coexist peacefully
kagenti/mcp-gateway
- Envoy-based aggregation with ext_proc filter
- Supports both standalone (file config) and Kubernetes (CRD) modes
- Integrates with Gateway API HTTPRoute
- Emphasizes policy flexibility through Envoy filters
- Uses tool prefixes for conflict resolution
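For readers unfamiliar with prefix-based conflict resolution, the idea can be sketched in a few lines. This is a generic illustration, not kagenti/mcp-gateway's actual code: the prefixTools function, the separator, and the backend and tool names are all made up for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// prefixTools merges the tools of several backends into one list,
// namespacing each tool with its backend name so that identical tool
// names from different backends cannot collide.
func prefixTools(toolsByBackend map[string][]string, sep string) []string {
	var merged []string
	for backend, tools := range toolsByBackend {
		for _, tool := range tools {
			merged = append(merged, backend+sep+tool)
		}
	}
	// Map iteration order is not deterministic, so sort for stable output.
	sort.Strings(merged)
	return merged
}

func main() {
	// Two hypothetical backends that both expose a tool called "search".
	tools := map[string][]string{
		"github": {"search", "create_issue"},
		"jira":   {"search"},
	}
	for _, name := range prefixTools(tools, "_") {
		fmt.Println(name)
	}
}
```

Both "search" tools survive aggregation under distinct names (github_search and jira_search), at the cost of clients seeing renamed tools.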
Potential Vision: MCPGateway + MCPGatewayClass
Similar to Gateway API, we could introduce standardized abstractions:
MCPGatewayClass
Defines the gateway implementation and its configuration contract:
- Controller name (e.g., toolhive.stacklok.dev/virtual-mcp, kagenti.io/envoy-gateway)
- Implementation-specific parameters
- Shared capabilities and defaults
MCPGateway
User-facing resource that:
- References an MCPGatewayClass
- References MCPGroup for backend discovery
- Defines common gateway concerns (service type, listeners)
- Includes implementation-specific configuration
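The class reference is what would let several controllers share a cluster: each controller reconciles only the gateways whose class names it. A rough sketch of that claim check follows, modeled loosely on how Gateway API and Ingress controllers filter resources. The MCPGatewaySpec fields and the claims helper are hypothetical names chosen for this example.

```go
package main

import "fmt"

// MCPGatewaySpec is a hypothetical spec for the proposed user-facing resource.
type MCPGatewaySpec struct {
	GatewayClassName string // name of the MCPGatewayClass to use
	GroupRef         string // name of the MCPGroup used for backend discovery
	ServiceType      string // common concern, e.g. "ClusterIP"
}

// MCPGateway is a simplified stand-in for the namespaced resource.
type MCPGateway struct {
	Name string
	Spec MCPGatewaySpec
}

// claims reports whether the controller identified by controllerName is
// responsible for gw. classControllers maps each MCPGatewayClass name to
// the controller name declared in that class.
func claims(gw MCPGateway, classControllers map[string]string, controllerName string) bool {
	owner, ok := classControllers[gw.Spec.GatewayClassName]
	return ok && owner == controllerName
}

func main() {
	classControllers := map[string]string{
		"virtual-mcp": "toolhive.stacklok.dev/virtual-mcp",
		"envoy":       "kagenti.io/envoy-gateway",
	}
	gw := MCPGateway{
		Name: "team-gateway",
		Spec: MCPGatewaySpec{
			GatewayClassName: "virtual-mcp",
			GroupRef:         "engineering",
			ServiceType:      "ClusterIP",
		},
	}

	// Each controller runs the same check with its own name.
	fmt.Println(claims(gw, classControllers, "toolhive.stacklok.dev/virtual-mcp"))
	fmt.Println(claims(gw, classControllers, "kagenti.io/envoy-gateway"))
}
```

Under this model, switching implementations would mostly mean changing GatewayClassName, which is the migration benefit this idea is after.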
Benefits
- Reusability: Common abstractions reduce duplication across gateway implementations
- Interoperability: Standard backend discovery via MCPGroup works for any gateway
- Flexibility: Users can choose gateway implementations based on their needs
- Ecosystem Growth: Easier for community to build compatible gateway solutions
- Migration: Users can switch gateway implementations with minimal config changes
- Coexistence: Multiple gateway types already work side by side; shared patterns would make that coexistence cleaner
Open Questions
- Is this abstraction worth pursuing? Do we see enough interest in multiple gateway implementations to justify the standardization effort?
- What belongs in the common spec? What configuration should be standardized vs implementation-specific?
- Backend discovery interface: Should MCPGroup remain the primary discovery mechanism? Should gateways support other patterns?
- Migration strategy: How do we handle existing VirtualMCPServer resources?
- Adoption path: Would gateway implementers (including kagenti/mcp-gateway) benefit from and adopt these abstractions?
- Scope of standardization: How much should we standardize vs let implementations differ?
Next Steps
This is an exploratory idea to gather feedback on whether:
- The abstraction provides meaningful value over current flexibility
- The Gateway API pattern fits MCP gateway use cases
- There's community/user interest in standardized gateway abstractions
- The complexity is worth the improved interoperability
We'd love input from users who might want to:
- Use alternative gateway implementations
- Build custom MCP gateways
- Switch between gateway implementations more easily
- Integrate with existing service mesh or API gateway infrastructure
Related Resources
- Current VirtualMCPServer: cmd/thv-operator/api/v1alpha1/virtualmcpserver_types.go
- MCPGroup: cmd/thv-operator/api/v1alpha1/mcpgroup_types.go
- Backend discovery: pkg/vmcp/aggregator/discoverer.go
- Virtual MCP proposal: docs/proposals/THV-2106-virtual-mcp-server.md
- Kubernetes Gateway API: https://gateway-api.sigs.k8s.io/
- kagenti/mcp-gateway: https://github.com/kagenti/mcp-gateway