Investigate MCP (Model Context Protocol) and how it can relate to Kuadrant policies

Open david-martin opened this issue 11 months ago • 3 comments

MCP is a protocol that enables integration between LLM applications and external data sources and tools. For example, searching the web to get additional context for an LLM chat response.

Let's investigate where in the flow of requests to an AI Gateway MCP can come into play, and how Kuadrant policies could play a part there. Based on our initial understanding, this may be in an 'east/west' capacity, or via egress gateways as requests are made, using MCP, to applications and services outside the model server.

david-martin avatar Apr 10 '25 14:04 david-martin

What is MCP?

MCP (Model Context Protocol) is an open protocol that standardises how applications provide context to LLMs.

MCP Architecture

At its foundation, MCP is a client-server architecture consisting of:

MCP Host: LLM applications initiating connections

MCP Clients: Live inside the host application and maintain 1:1 connections with servers

MCP Servers: Provide context, tools and prompts to clients

Core components:

Protocol Layer: Handles message framing, request/response linking, and high-level communication patterns.

Transport Layer: Handles communication between clients and servers

Multiple supported transport mechanisms

Stdio transport: Enables communication through standard input and output streams. Used for:

  • Building command-line tools
  • Implementing local integrations
  • Needing simple process communication
  • Working with shell scripts
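A minimal sketch of what stdio transport looks like in practice, assuming newline-delimited JSON-RPC messages on the child's stdin and stdout. The `ECHO_SERVER` script is a hypothetical stand-in for a real MCP server, and `stdio_round_trip` is an illustrative helper, not part of any MCP SDK.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for an MCP server: for every request line it reads,
# it writes back one JSON-RPC response line that echoes the method name.
ECHO_SERVER = r"""
import json, sys
for line in sys.stdin:
    req = json.loads(line)
    resp = {"jsonrpc": "2.0", "id": req["id"], "result": {"echo": req["method"]}}
    sys.stdout.write(json.dumps(resp) + "\n")
    sys.stdout.flush()
"""


def stdio_round_trip(message: dict) -> dict:
    """Send one JSON-RPC message to a child process and read one reply."""
    with subprocess.Popen(
        [sys.executable, "-c", ECHO_SERVER],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    ) as proc:
        # One message per line: write the request, then read the reply line.
        proc.stdin.write(json.dumps(message) + "\n")
        proc.stdin.flush()
        reply = proc.stdout.readline()
        # Closing stdin ends the child's read loop so it can exit cleanly.
        proc.stdin.close()
    return json.loads(reply)


if __name__ == "__main__":
    print(stdio_round_trip({"jsonrpc": "2.0", "id": 1, "method": "ping"}))
```

Because the client has to spawn the server as a local child process, this transport has no network hop where a gateway policy could be attached.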

HTTP with SSE transport (Server-Sent Events): SSE enables server-to-client streaming. For client-to-server communication, HTTP POST requests are used. (Between starting this POC and finishing it, this method became obsolete; Streamable HTTP is the new method.)

Streamable HTTP:

  • The server operates as an independent process that can handle multiple client connections
  • Uses HTTP POST and GET requests
  • Optionally leverages Server-Sent Events (SSE)
  • Server MUST provide a single HTTP endpoint path aka MCP endpoint
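As a rough illustration of the points above, the sketch below builds (but does not send) the kind of POST a client would make to the single MCP endpoint. The URL `https://mcp.example.com/mcp` and the helper name `build_mcp_post` are assumptions for illustration. The `Accept` header lists both JSON and SSE because the server may answer with either a plain JSON body or an event stream.

```python
import json
import urllib.request


def build_mcp_post(endpoint: str, message: dict) -> urllib.request.Request:
    """Build a POST carrying one JSON-RPC message to the MCP endpoint."""
    body = json.dumps(message).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # The client signals it can handle a JSON body or an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
    )


# Hypothetical endpoint; nothing is sent over the network here.
req = build_mcp_post(
    "https://mcp.example.com/mcp",
    {"jsonrpc": "2.0", "id": 1, "method": "ping"},
)
print(req.get_method(), req.full_url)
```

Since every request goes through one HTTP path, a gateway in front of the server sees ordinary HTTP traffic that route-level policies can target.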

Custom transports: You can implement custom transports for specific needs, as long as they conform to the transport interface. Use cases include:

  • Custom network protocols
  • Specialised communication channels
  • Integration with existing systems
  • Performance optimisation

MCP server capabilities:

Resources: Expose data and content from an MCP server to LLMs, e.g.:

  • File contents
  • Database records
  • API responses
  • Live system data
  • Screenshots and images
  • Log files

Tools: Allow LLMs to perform actions through an MCP server. Tools expose executable functionality to clients.

Sampling: Enables servers to request LLM completions through the client, allowing sophisticated agentic behaviours.

Messages are formatted as JSON-RPC.
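To make the message format concrete, here is a small sketch of a JSON-RPC 2.0 request that calls a tool, and the matching response. The tool name `web_search`, its arguments, and the result payload are made up for illustration; the helper functions are not part of any MCP SDK.

```python
import json


def tool_call_request(request_id: int, tool: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request asking the server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


def result_response(request: dict, result: dict) -> dict:
    """Build the response; the id links it back to the originating request."""
    return {"jsonrpc": "2.0", "id": request["id"], "result": result}


# Hypothetical tool name, arguments and result, for illustration only.
request = tool_call_request(1, "web_search", {"query": "kuadrant"})
response = result_response(request, {"content": [{"type": "text", "text": "..."}]})
print(json.dumps(request, indent=2))
print(json.dumps(response, indent=2))
```

The shared `id` is what the protocol layer uses for request/response linking.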

How does MCP work in the AI gateway architecture?

In an AI Gateway, MCP could be useful as the standard protocol for the area of “advanced inference-specific capabilities”. Specifically, MCP could work for LLMs implementing RAG or other supplementary information retrieval via tools or resources, e.g. an external DB.

Can Kuadrant be used with MCP?

The MCP documentation mentions in multiple places that security considerations are needed when using MCP, in areas like authentication/authorisation (e.g. DNS rebinding attack prevention) and rate limiting: https://modelcontextprotocol.io/docs/concepts/transports#security-considerations. These areas apply specifically to the Streamable HTTP transport, or to a custom transport depending on its type, e.g. gRPC.
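A rough sketch of the two checks mentioned above, written as plain functions rather than as Kuadrant policies. Everything here is an assumption for illustration: the allowed origin, the helper names, and the choice to let requests without an `Origin` header through. It only shows what information in an MCP HTTP request a policy could act on, not how Kuadrant would express it.

```python
import json

# Hypothetical allow-list; a real deployment would configure its own origins.
ALLOWED_ORIGINS = {"https://chat.example.com"}


def origin_allowed(headers: dict) -> bool:
    """Reject browser requests from unexpected origins (DNS rebinding defence).

    Assumes header names are already normalised to this casing, and that
    non-browser clients, which usually send no Origin header, are allowed.
    """
    origin = headers.get("Origin")
    return origin is None or origin in ALLOWED_ORIGINS


def rate_limit_key(user: str, body: bytes) -> str:
    """Derive a counter key so limits can be applied per user and per tool."""
    message = json.loads(body)
    if not isinstance(message, dict):
        # Anything that is not a single JSON-RPC object shares one bucket.
        return f"{user}:other"
    method = message.get("method", "")
    if method == "tools/call":
        tool = message.get("params", {}).get("name", "")
        return f"{user}:tools/call:{tool}"
    return f"{user}:{method}"


body = b'{"jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": {"name": "web_search"}}'
print(origin_allowed({"Origin": "https://evil.example"}))
print(rate_limit_key("alice", body))
```

The point of the key function is that the interesting detail (which tool is being called) lives in the JSON-RPC body, not in the URL path, so path-based limits alone cannot distinguish between tools.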

POC findings:

When trying to see whether it's possible for Kuadrant to work with MCP in the areas mentioned above, a few issues arose, mainly because the project is in its infancy and changing quickly. These include:

  1. MCP is currently mainly run locally; this includes MCP Hosts, Clients and Servers. Remote servers are a new idea with very little, if any, documentation. I was able to get an MCP server running on a K8s cluster, but clients couldn't connect.
  2. As local is currently the main method, STDIO is the main transport that most clients, like Claude Desktop, accept. The number of clients that accepted SSE at the time was very low, with even fewer accepting Streamable HTTP.

Clients that accept SSE (with some caveats) that I tried connecting to my server: VS Code (only tooling is accepted) and a test client called apify. Claude Desktop was also tried, but it only accepts STDIO, not SSE (this is not very well documented). None of these were actually able to connect, for reasons I couldn't figure out; error logs from these clients are non-existent.

Things that were tried:

  • HTTP unsecured endpoint
  • HTTPS secure endpoint, self-signed
  • HTTPS secure endpoint, publicly trusted
  • K8s kind cluster
  • Openshift cluster

R-Lawton avatar May 02 '25 11:05 R-Lawton

@R-Lawton thanks for looking into this. Though you hit issues getting an SSE MCP server running (which was unfortunate), I don't see any obvious technical reason why having them run behind a gateway shouldn't work. I do believe there is a story to be told around the value of Kuadrant policies when running MCP servers that are accessed over the network (not local stdio, which seems to be very common right now).

You may be interested in exploring Solo's Agent Gateway. My understanding is that a 'Target' abstracts an MCP server that could actually be using stdio. Then a 'Listener' is defined in the Gateway as listening on some port and 'targets' 1 or more 'Targets', acting as a bridge from the network to stdio.

Though it's not something we're exploring right now, I'd be interested in your take on this approach.

david-martin avatar May 20 '25 13:05 david-martin

I think there are probably some other investigations here we want to pursue when we have time:

  • Agentic use-cases and crossover with existing policies, or new ones. Helping transform or expose older legacy APIs and services into MCP servers, that sort of thing
  • Policies that intersect MCP re: egress, layering on auth for easy connections etc. Egress is a domain we don't really touch on yet with our connectivity story. Auth for MCP is a general pain point as I understand it, and one where policies may help rather than hinder.

@maleck13 likely has some thoughts here as well

jasonmadigan avatar May 22 '25 14:05 jasonmadigan

I think you have it covered. I am interested to learn more about an agent gateway and how we might be able to help there particularly.

  1. What is the difference between an agentic gateway such as https://github.com/agentgateway/agentgateway?tab=readme-ov-file and an MCP Proxy/Gateway such as https://github.com/TBXark/mcp-proxy (based on https://github.com/mark3labs/mcp-go/blob/main/README.md)?
  2. Could some of this functionality be added via WASM, or does it need its own service?
  3. Are there any decisions and configuration that could be valuable to apply to an MCP proxy via some form of policy?

maleck13 avatar May 26 '25 06:05 maleck13

With the new Streamable HTTP from MCP, the POC was successful.

Demo: https://github.com/user-attachments/assets/7c27b52a-27a4-4483-83be-7da1dbfbeb7e

Repo: https://github.com/Kuadrant/kuadrant-mcp-poc

R-Lawton avatar Jun 10 '25 14:06 R-Lawton