[Epic] AI Service Discovery and Gateway Proxy

Open jonpspri opened this issue 1 month ago • 0 comments

🧭 Type of Feature

Please select the most appropriate category:

  • [x] Enhancement to existing functionality
  • [x] New feature or capability
  • [ ] New MCP-compliant server
  • [ ] New component or integration
  • [ ] Developer tooling or test improvement
  • [ ] Packaging, automation and deployment (ex: pypi, docker, quay.io, kubernetes, terraform)
  • [ ] Other (please describe below)

🧭 Epic

Title: AI Service Discovery and Gateway Proxy

Goal: Enable consumer AI agents to discover available services through natural language search, select specific services, and interact with them via the gateway acting as an intelligent proxy—all without requiring user or administrator intervention. The gateway becomes its own MCP client, performing RPC calls on behalf of the AI agent.

Why now:

  • AI agents need autonomous service discovery and invocation capabilities
  • Manual configuration and intervention create friction in AI workflows
  • Semantic search enables intelligent service matching rather than exact-match lookups
  • Gateway-as-proxy pattern enables dynamic service composition
  • Critical for production AI applications that need to discover and consume services at runtime

🙋♂️ User Story 1: AI Agent Service Discovery

As a: Consumer AI agent (Claude, GPT, Gemini, autonomous agent)
I want: To search for available services using natural language queries
So that: I can discover relevant services without knowing exact service names or requiring manual configuration

✅ Acceptance Criteria

Scenario: AI discovers services via semantic search
  Given the gateway hosts multiple service providers with documented capabilities
  When the AI agent searches for "convert image to PDF"
  Then it receives a ranked list of relevant services
  And each result includes service description, capabilities, and invocation patterns
  And the AI can select the most appropriate service for its task

Scenario: AI refines search results
  Given the AI received multiple matching services
  When the AI narrows search with "secure image to PDF with watermark"
  Then the results are filtered to services supporting those specific features
  And services are ranked by relevance using vector similarity
  And the AI gets documentation explaining each service's parameters
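
To make "ranked by relevance using vector similarity" concrete, here is a minimal sketch of cosine-similarity ranking. The three-dimensional vectors and the `weather_lookup` service are made up for illustration; a real deployment would presumably use embeddings from the configured model and a vector database rather than an in-memory dict.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_services(query_vec, service_vecs, max_results=10):
    """Return (service_id, score) pairs, highest similarity first."""
    scored = [(sid, cosine_similarity(query_vec, vec)) for sid, vec in service_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:max_results]


# Toy embeddings (hypothetical values, not produced by any real model).
SERVICE_VECTORS = {
    "pdf_converter": [0.9, 0.8, 0.1],
    "simple_pdf_tool": [0.9, 0.1, 0.0],
    "weather_lookup": [0.0, 0.1, 0.9],
}

ranked = rank_services([1.0, 0.7, 0.0], SERVICE_VECTORS)
for service_id, score in ranked:
    print(f"{service_id}: {score:.3f}")
```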

🙋♂️ User Story 2: Gateway Proxy for Service Invocation

As a: Consumer AI agent
I want: The gateway to act as my proxy for invoking backend services
So that: I can interact with services without managing individual connections, authentication, or protocol details

✅ Acceptance Criteria

Scenario: AI invokes service through gateway proxy
  Given the AI has discovered a service "pdf_converter"
  When the AI requests "proxy call to pdf_converter with input_url and watermark_text parameters"
  Then the gateway authenticates on behalf of the AI
  And the gateway translates the request to the service's native protocol
  And the gateway forwards the request to the backend service
  And the gateway returns the service response to the AI

Scenario: Gateway handles authentication transparently
  Given a backend service requires API key authentication
  When the AI invokes the service through the gateway proxy
  Then the gateway injects the appropriate credentials
  And the AI does not need to manage or store service credentials
  And the gateway enforces authorization policies (RBAC, rate limits)
  And the gateway logs the transaction for audit purposes
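
A rough sketch of the credential-injection step in the scenario above. The `X-API-Key` header name, the in-memory `SECRET_STORE`, and the function name are all assumptions for illustration; the issue does not specify how credentials are stored or which header a backend expects.

```python
# Hypothetical stand-in for secure credential storage (a real gateway would use a vault).
SECRET_STORE = {"pdf_converter": "example-api-key"}

# Headers the agent is not allowed to forward to the backend (assumed policy).
STRIPPED_HEADERS = {"authorization", "x-api-key"}


def inject_credentials(service_id, agent_headers, secret_store=SECRET_STORE):
    """Build backend request headers: drop agent-supplied credentials, add the gateway's."""
    headers = {
        name: value
        for name, value in agent_headers.items()
        if name.lower() not in STRIPPED_HEADERS
    }
    api_key = secret_store.get(service_id)
    if api_key is not None:
        headers["X-API-Key"] = api_key
    return headers


backend_headers = inject_credentials(
    "pdf_converter",
    {"Authorization": "Bearer agent-token", "Accept": "application/json"},
)
print(backend_headers)
```

Stripping the agent's own `Authorization` header keeps agent-facing and backend-facing credentials separate, which is one way to satisfy "the AI does not need to manage or store service credentials".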

🙋♂️ User Story 3: Administrator Service Curation

As a: Gateway administrator
I want: To curate and document available services with semantic metadata
So that: AI agents can discover and understand services through natural language queries

✅ Acceptance Criteria

Scenario: Administrator configures semantic service metadata
  Given I have registered an MCP server as a service provider
  When I add semantic metadata (description, capabilities, use cases, tags)
  Then the service becomes discoverable via vector search
  And the metadata is indexed for similarity matching
  And AI agents receive rich context about service capabilities

Scenario: Administrator manages service documentation
  Given multiple services provide similar capabilities
  When I update service descriptions to highlight differentiators
  Then AI search results reflect the updated information
  And AI agents receive guidance on when to use each service
  And administrators can monitor which services are discovered most often

🙋♂️ User Story 4: Security and Authorization

As a: Security administrator
I want: The gateway to enforce authentication and authorization for proxy requests
So that: AI agents can only access services they're permitted to use, with full audit trails

✅ Acceptance Criteria

Scenario: Gateway enforces API key authorization
  Given a backend service requires an API key
  When an AI agent attempts to invoke the service via proxy
  Then the gateway validates the AI's authorization to use that service
  And the gateway injects the appropriate API key from secure storage
  And the request is logged with AI identity, service, and parameters

Scenario: Gateway supports OAuth bearer token flow
  Given a backend service uses OAuth 2.0 bearer tokens
  When an AI agent invokes the service
  Then the gateway obtains a valid bearer token (cached or refreshed)
  And the gateway includes the token in the proxied request
  And the gateway handles token refresh transparently
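
One possible shape for the "cached or refreshed" token behaviour, as a sketch only. `TokenCache`, the `fetch_token` callable, and the 30-second refresh margin are hypothetical; the actual token request against an OAuth 2.0 token endpoint is left behind the callable.

```python
import time
from dataclasses import dataclass


@dataclass
class CachedToken:
    access_token: str
    expires_at: float  # absolute time, same clock as TokenCache


class TokenCache:
    """Return a cached bearer token, fetching a new one shortly before expiry."""

    def __init__(self, fetch_token, clock=time.time, refresh_margin=30.0):
        self._fetch_token = fetch_token  # callable(service_id) -> (token, expires_in_seconds)
        self._clock = clock
        self._refresh_margin = refresh_margin
        self._tokens = {}

    def get(self, service_id):
        now = self._clock()
        cached = self._tokens.get(service_id)
        if cached is None or cached.expires_at - self._refresh_margin <= now:
            token, expires_in = self._fetch_token(service_id)
            cached = CachedToken(token, now + expires_in)
            self._tokens[service_id] = cached
        return cached.access_token


# Demo with a fake token endpoint and a controllable clock.
calls = []


def fake_fetch(service_id):
    calls.append(service_id)
    return f"token-{len(calls)}", 3600


now = [1000.0]
cache = TokenCache(fake_fetch, clock=lambda: now[0])
first = cache.get("pdf_converter")   # fetched
second = cache.get("pdf_converter")  # served from cache
now[0] += 3600                       # move past expiry
third = cache.get("pdf_converter")   # refreshed
```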

Scenario: Gateway prevents unauthorized access
  Given an AI agent is not authorized for a specific service
  When the AI attempts to invoke that service
  Then the gateway rejects the request with a 403 Forbidden
  And the rejection is logged for security audit
  And the AI receives a clear error message explaining the denial
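
The denial path could look roughly like the following. The `PERMISSIONS` table, the logger name, and the error body shape are illustrative assumptions; in practice this check would presumably delegate to the existing RBAC system.

```python
import logging

audit_log = logging.getLogger("gateway.audit")  # hypothetical logger name

# Hypothetical permission table: agent identity -> services it may invoke.
PERMISSIONS = {"agent-a": {"pdf_converter"}}


def authorize(agent_id, service_id, permissions=PERMISSIONS):
    """Return (status, error_body); error_body is None when access is allowed."""
    if service_id in permissions.get(agent_id, set()):
        return 200, None
    # Record the rejection for the security audit trail.
    audit_log.warning("denied agent=%s service=%s", agent_id, service_id)
    return 403, {
        "error": "forbidden",
        "message": f"Agent '{agent_id}' is not authorized to use service '{service_id}'.",
    }


allowed_status, allowed_error = authorize("agent-a", "pdf_converter")
denied_status, denied_error = authorize("agent-b", "pdf_converter")
```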

📐 Design Sketch

sequenceDiagram
    participant AI as Consumer AI
    participant GW as MCP Gateway<br/>(Discovery Service)
    participant VS as Vector Search<br/>(Semantic Index)
    participant Proxy as Gateway Proxy<br/>(MCP Client)
    participant Svc as Service Provider<br/>(Backend MCP Server)

    Note over AI,GW: Discovery Phase
    AI->>GW: Search: "convert documents to PDF"
    GW->>VS: Vector similarity search
    VS-->>GW: Ranked service matches
    GW-->>AI: Service list with metadata

    Note over AI,Proxy: Selection Phase
    AI->>GW: Select service "pdf_converter"
    GW-->>AI: Service capabilities & parameters

    Note over AI,Svc: Invocation Phase (Proxy Pattern)
    AI->>Proxy: Proxy call: pdf_converter(input_url, options)
    Proxy->>Proxy: Validate authorization<br/>Retrieve credentials
    Proxy->>Svc: RPC call with authenticated request
    Svc-->>Proxy: PDF result
    Proxy->>Proxy: Log transaction
    Proxy-->>AI: PDF result

    Note over GW,Svc: Gateway acts as MCP Client

📐 Discovery API Examples

Search Request

POST /services/search
{
  "query": "convert images to PDF with watermarks",
  "max_results": 10,
  "filters": {
    "capabilities": ["pdf_generation", "image_processing"]
  }
}

Search Response

{
  "results": [
    {
      "service_id": "pdf_converter",
      "name": "Advanced PDF Converter",
      "description": "Convert images and documents to PDF with watermarking, encryption, and compression",
      "relevance_score": 0.92,
      "capabilities": ["pdf_generation", "image_processing", "watermarking", "encryption"],
      "provider": "mcp://pdf-service.example.com",
      "parameters": {
        "input_url": "URL to source image/document",
        "watermark_text": "Optional watermark text",
        "compression_level": "Optional: low, medium, high"
      },
      "authentication": "api_key"
    },
    {
      "service_id": "simple_pdf_tool",
      "name": "Simple PDF Generator",
      "description": "Basic image to PDF conversion",
      "relevance_score": 0.78,
      "capabilities": ["pdf_generation", "image_processing"],
      "provider": "mcp://simple-pdf.example.com",
      "parameters": {
        "image_url": "URL to image file"
      },
      "authentication": "none"
    }
  ],
  "query_interpretation": "Services that can convert images to PDF format and support watermarking features"
}
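
On the consumer side, an agent might build the search body and pick a result along these lines. This is a sketch against the example payloads above; `build_search_request`, `pick_service`, and the `min_score` threshold are hypothetical helpers, not part of any proposed API.

```python
def build_search_request(query, max_results=10, capabilities=None):
    """Build the JSON body for POST /services/search as shown in the example."""
    body = {"query": query, "max_results": max_results}
    if capabilities:
        body["filters"] = {"capabilities": list(capabilities)}
    return body


def pick_service(search_response, required_capabilities=(), min_score=0.5):
    """Pick the highest-scoring result that has every required capability, or None."""
    required = set(required_capabilities)
    candidates = [
        result
        for result in search_response.get("results", [])
        if result["relevance_score"] >= min_score
        and required <= set(result["capabilities"])
    ]
    return max(candidates, key=lambda r: r["relevance_score"], default=None)


# Trimmed copy of the example search response.
response = {
    "results": [
        {
            "service_id": "pdf_converter",
            "relevance_score": 0.92,
            "capabilities": ["pdf_generation", "image_processing", "watermarking", "encryption"],
        },
        {
            "service_id": "simple_pdf_tool",
            "relevance_score": 0.78,
            "capabilities": ["pdf_generation", "image_processing"],
        },
    ]
}

request_body = build_search_request(
    "convert images to PDF with watermarks",
    capabilities=["pdf_generation", "image_processing"],
)
chosen = pick_service(response, required_capabilities=["watermarking"])
```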

Proxy Invocation Request

POST /services/proxy
{
  "service_id": "pdf_converter",
  "method": "tools/call",
  "params": {
    "name": "convert_to_pdf",
    "arguments": {
      "input_url": "https://example.com/document.png",
      "watermark_text": "CONFIDENTIAL",
      "compression_level": "high"
    }
  }
}
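
Since MCP messages are JSON-RPC 2.0, the proxy body above already carries the `method` and `params` of the backend call. A minimal sketch of the wrapping step follows; the function name and the per-process request counter are assumptions.

```python
import itertools

_next_id = itertools.count(1)  # simple per-process JSON-RPC request id source


def to_mcp_request(proxy_body):
    """Split a /services/proxy body into a routing key and a JSON-RPC 2.0 message."""
    service_id = proxy_body["service_id"]
    message = {
        "jsonrpc": "2.0",
        "id": next(_next_id),
        "method": proxy_body["method"],
        "params": proxy_body.get("params", {}),
    }
    return service_id, message


service_id, message = to_mcp_request(
    {
        "service_id": "pdf_converter",
        "method": "tools/call",
        "params": {
            "name": "convert_to_pdf",
            "arguments": {
                "input_url": "https://example.com/document.png",
                "watermark_text": "CONFIDENTIAL",
                "compression_level": "high",
            },
        },
    }
)
```

The `service_id` stays on the gateway side for routing and credential lookup; only `message` would be sent to the backend MCP server.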

Proxy Response

{
  "result": {
    "pdf_url": "https://gateway.example.com/results/document-abc123.pdf",
    "file_size": 245678,
    "pages": 1,
    "processing_time_ms": 1250
  },
  "metadata": {
    "service_id": "pdf_converter",
    "provider": "mcp://pdf-service.example.com",
    "authenticated_as": "gateway_service_account",
    "transaction_id": "txn_789xyz"
  }
}

🔗 MCP Standards Check

  • [x] Change adheres to current MCP specifications
  • [ ] No breaking changes to existing MCP-compliant integrations
  • [ ] If deviations exist, please describe them below:

Standards Alignment:

  • Uses standard MCP protocol for backend service communication
  • Gateway acts as both MCP server (for AI agents) and MCP client (for service providers)
  • Discovery service is an extension layer on top of standard MCP primitives
  • Proxy pattern maintains full MCP protocol compliance for backend calls

Potential Concerns:

  • Semantic search is not standardized in MCP protocol (implementation-specific)
  • Proxy invocation may require extensions to standard MCP client behavior
  • Authentication credential management needs careful design for security

🔄 Alternatives Considered

  1. Direct AI-to-Service Connections

    • Rejected: Requires AI to manage multiple connections, credentials, and protocols
    • Forces AI to handle service-specific authentication mechanisms
    • No centralized audit trail or rate limiting
    • AI must know exact service endpoints in advance
  2. Manual Service Registration by AI

    • Rejected: Requires pre-configuration and manual endpoint management
    • Doesn't support dynamic discovery based on capabilities
    • AI must maintain service catalog state
    • No semantic search capabilities
  3. GraphQL Federation Pattern

    • Considered: Could provide unified schema across services
    • Limitation: Requires all services to expose GraphQL schemas
    • Complexity: Adds additional protocol translation layer
    • Decision: MCP protocol already provides necessary abstractions
  4. Exact-Match Service Directory (No Semantic Search)

    • Rejected: AI must know exact service names
    • No fuzzy matching or capability-based discovery
    • Brittle - requires precise query syntax
    • Doesn't leverage AI's natural language understanding
  5. AI Direct RPC (No Gateway Proxy)

    • Rejected: AI must manage per-service authentication
    • No centralized authorization or rate limiting
    • Difficult to audit multi-service workflows
    • AI must handle service-specific error handling

📓 Additional Context

Implementation Components:

  1. Semantic Search Service

    • Vector embedding generation for service descriptions
    • Similarity search using vector database (pgvector, Qdrant, Weaviate)
    • LLM-powered query understanding and result curation
    • Anti-gaming measures (rate limiting, quality scoring)
  2. Gateway Proxy Layer

    • MCP client implementation within gateway
    • Connection pooling to backend services
    • Credential injection from secure vault
    • Request/response translation if needed
  3. Service Metadata Schema

    • Rich semantic descriptions (capabilities, use cases, examples)
    • Parameter schemas and validation rules
    • Authentication requirements
    • Rate limits and quotas
  4. Security Infrastructure

    • API key management and rotation
    • OAuth 2.0 token acquisition and caching
    • RBAC policies for service access
    • Audit logging for compliance
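
As a sketch of the Service Metadata Schema component listed above, the record below groups the semantic fields and derives the text that would be embedded for search. The class name, field set, and the `" | "` join are illustrative assumptions rather than a proposed schema.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceMetadata:
    service_id: str
    description: str
    capabilities: list = field(default_factory=list)
    use_cases: list = field(default_factory=list)
    tags: list = field(default_factory=list)
    authentication: str = "none"  # e.g. "none", "api_key", "oauth2" (assumed values)

    def embedding_text(self):
        """Concatenate the semantic fields into one string for embedding."""
        parts = [self.description, *self.capabilities, *self.use_cases, *self.tags]
        return " | ".join(part for part in parts if part)


meta = ServiceMetadata(
    service_id="pdf_converter",
    description="Convert images to PDF",
    capabilities=["pdf_generation"],
    use_cases=["invoices"],
    tags=["pdf"],
    authentication="api_key",
)
print(meta.embedding_text())
```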

Search Optimization Concerns:

The Epic identifies that service providers will likely attempt to game the semantic search system (SEO-like behavior). Mitigation strategies:

  • Quality Scoring: Implement reputation and quality metrics
  • Manual Curation: Allow administrators to pin/boost trusted services
  • Usage Analytics: Rank services by successful completion rate
  • Abuse Detection: Flag services with suspicious metadata patterns
  • Rate Limiting: Prevent metadata spam and manipulation
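
One way the mitigations above could combine into a single ranking score is sketched here: semantic relevance is blended with an observed success rate, and administrators can pin trusted services. The weights and the linear formula are placeholder assumptions, not a tuned or proposed design.

```python
def adjusted_score(relevance, success_rate, pinned=False, quality_weight=0.3, pin_boost=0.1):
    """Blend semantic relevance with usage quality; add a fixed boost for pinned services."""
    blended = (1 - quality_weight) * relevance + quality_weight * success_rate
    return blended + (pin_boost if pinned else 0.0)


# A keyword-stuffed listing: high raw relevance, poor completion rate.
stuffed = adjusted_score(relevance=0.95, success_rate=0.20)
# An honest listing: slightly lower relevance, high completion rate.
honest = adjusted_score(relevance=0.85, success_rate=0.95)
# The same honest listing after an administrator pins it.
pinned = adjusted_score(relevance=0.85, success_rate=0.95, pinned=True)

print(f"stuffed={stuffed:.3f} honest={honest:.3f} pinned={pinned:.3f}")
```

With these example numbers, the honest listing outranks the keyword-stuffed one even though its raw relevance is lower.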

Configuration Variables (new):

# Discovery Service
MCPGATEWAY_DISCOVERY_ENABLED=true
MCPGATEWAY_DISCOVERY_VECTOR_DB_URL=postgresql://localhost/vectordb
MCPGATEWAY_DISCOVERY_EMBEDDING_MODEL=text-embedding-3-small
MCPGATEWAY_DISCOVERY_SEARCH_RESULTS_LIMIT=20

# Proxy Service
MCPGATEWAY_PROXY_ENABLED=true
MCPGATEWAY_PROXY_MAX_CONNECTIONS_PER_SERVICE=10
MCPGATEWAY_PROXY_REQUEST_TIMEOUT=30
MCPGATEWAY_PROXY_CREDENTIAL_VAULT_URL=vault://localhost:8200

# Security
MCPGATEWAY_PROXY_AUTHORIZATION_REQUIRED=true
MCPGATEWAY_PROXY_AUDIT_LOG_ENABLED=true
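
For illustration, the proxy and security variables above could be read into a typed settings object as follows. The fallback defaults, the accepted truthy strings, and the `ProxySettings` class are assumptions; the issue lists example values only and does not state units for the timeout.

```python
import os
from dataclasses import dataclass

TRUTHY = {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class ProxySettings:
    enabled: bool
    max_connections_per_service: int
    request_timeout: int
    authorization_required: bool
    audit_log_enabled: bool


def _flag(environ, name, default):
    """Parse a boolean environment variable, falling back to default when unset."""
    raw = environ.get(name)
    return default if raw is None else raw.strip().lower() in TRUTHY


def load_proxy_settings(environ=None):
    """Read MCPGATEWAY_PROXY_* variables from environ (defaults to os.environ)."""
    environ = os.environ if environ is None else environ
    return ProxySettings(
        enabled=_flag(environ, "MCPGATEWAY_PROXY_ENABLED", False),
        max_connections_per_service=int(
            environ.get("MCPGATEWAY_PROXY_MAX_CONNECTIONS_PER_SERVICE", "10")
        ),
        request_timeout=int(environ.get("MCPGATEWAY_PROXY_REQUEST_TIMEOUT", "30")),
        authorization_required=_flag(environ, "MCPGATEWAY_PROXY_AUTHORIZATION_REQUIRED", True),
        audit_log_enabled=_flag(environ, "MCPGATEWAY_PROXY_AUDIT_LOG_ENABLED", True),
    )


settings = load_proxy_settings(
    {"MCPGATEWAY_PROXY_ENABLED": "true", "MCPGATEWAY_PROXY_REQUEST_TIMEOUT": "45"}
)
print(settings)
```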

Documentation Requirements:

The Epic emphasizes that "documentation of the MCP Proxy Service (and discover service) will need to be carefully crafted to explain to the requesting AI just what's going on."

Recommended documentation:

  • /services/search endpoint behavior and query syntax
  • /services/proxy invocation patterns and parameters
  • Authentication flow explanations for AI agents
  • Service metadata schema and best practices
  • Example workflows for common discovery→invocation patterns
  • Error handling and retry guidance

Related Existing Features:

  • Gateway federation (ADR-008) - provides multi-gateway coordination
  • Virtual servers - composition of multiple MCP servers
  • Tool registry - existing tool discovery mechanism
  • RBAC system - authorization framework to extend

Related Issues:

  • SEP-1649 implementation (#1304) - provides .well-known/mcp.json discovery
  • OAuth support - needed for bearer token proxy authentication
  • Plugin framework - could host semantic search as a plugin

Testing Strategy:

  • Unit tests: Vector search ranking, credential injection
  • Integration tests: End-to-end discovery→proxy→invocation flow
  • Security tests: Unauthorized access prevention, credential isolation
  • Performance tests: Search latency, concurrent proxy requests
  • Gaming tests: Attempt to manipulate search rankings via metadata spam

Phased Implementation:

Phase 1: Basic Proxy

  • Gateway-as-MCP-client functionality
  • Manual service selection (no search)
  • API key authentication only

Phase 2: Semantic Search

  • Vector embedding generation
  • Similarity search implementation
  • Service metadata management UI

Phase 3: Advanced Authentication

  • OAuth bearer token support
  • Credential vault integration
  • Token caching and refresh

Phase 4: Security Hardening

  • RBAC integration for service access
  • Comprehensive audit logging
  • Anti-gaming measures for search

jonpspri, Oct 20 '25 10:10