Proposal: Envoy Support for Model Context Protocol (MCP)

Open botengyao opened this issue 8 months ago • 24 comments

Objective

We’d like to initiate an issue and discussion around supporting the Model Context Protocol (MCP) in Envoy as a gateway.

What is MCP?

MCP is an open protocol, usable in stateless or stateful modes, that allows GenAI applications to retrieve and exchange context (e.g. source code, files, documents) with LLMs using JSON-RPC semantics. A significant MCP update last month introduced OAuth 2.1-based authorization, the Streamable HTTP transport, JSON-RPC batching, and tool annotations. This is a major update that makes it practical to run MCP as a remote server.

Details

MCP can use transports such as stdio or Streamable HTTP. With the Streamable HTTP update, bidirectional JSON-RPC messages can be exchanged over HTTP POST and GET, and SSE is wrapped into Streamable HTTP. The following diagrams show the HTTP transport and capability negotiation in MCP:


Transport

Capability Negotiation

This proposal explores how Envoy can serve as a gateway between MCP clients and servers — helping route, process, and secure MCP messages in a scalable and extensible way.

Design Proposal

With MCP gaining traction as a standard way for AI tools to interact with contextual data, we believe Envoy can play an important role in enabling infrastructure-level routing, load balancing, and observability for these interactions.

This issue proposes a set of functions that enable Envoy to act as a gateway between MCP clients and servers, covering the following use cases, listed in order of complexity and implementation:

Proposed Functionality

  1. MCP session-aware load balancing based on the MCP endpoint (HTTP request URI).

    • This work is already in progress in Envoy https://github.com/envoyproxy/envoy/pull/39004, thanks to @wbpcode
  2. Parsing of MCP protocol to make Envoy aware of MCP request properties such as method/id, call arguments, or return values.

  3. Authentication of MCP requests using OAuth2, JWT, or API keys based on MCP request properties.

  4. Authorization of MCP requests and messages using RBAC (for example authorizing specific MCP methods based on the caller identity). This authorization will apply to both client and server requests.

  5. Transcoding JSON-RPC messages to existing API surfaces, for example gRPC and OpenAPI.

  6. Rate limiting of MCP requests.

  7. Customizable business logic for MCP messages, similar to HTTP filters, including remote callouts for MCP messages.

  8. Load balancing and fanning-out of individual MCP messages (i.e. based on method) from a single HTTP stream.

  9. Gateway initialized SSE stream support with session resumption and JSON-RPC batch support.
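
As a rough illustration of items 2 and 4, the Python sketch below parses a single JSON-RPC message, extracts the properties a gateway could act on (method, id, tool name), and checks the method against a per-role allow list. The `McpRequestInfo` type, the function names, and the role table are hypothetical and exist only for this example; a real Envoy implementation would be a C++ filter with its own configuration API.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class McpRequestInfo:
    """Properties a gateway might extract from one JSON-RPC message."""
    method: Optional[str]      # e.g. "tools/call"; None for responses
    request_id: Any            # str, int, or None for notifications
    tool_name: Optional[str]   # params.name when method is "tools/call"


def parse_mcp_message(body: bytes) -> McpRequestInfo:
    """Parse a single (non-batch) JSON-RPC 2.0 message."""
    msg = json.loads(body)
    if not isinstance(msg, dict) or msg.get("jsonrpc") != "2.0":
        raise ValueError("not a single JSON-RPC 2.0 message")
    method = msg.get("method")
    params = msg.get("params")
    tool = None
    if method == "tools/call" and isinstance(params, dict):
        tool = params.get("name")
    return McpRequestInfo(method, msg.get("id"), tool)


# Hypothetical policy table: caller role -> allowed MCP methods.
ALLOWED_METHODS = {
    "reader": {"initialize", "tools/list", "resources/read"},
    "operator": {"initialize", "tools/list", "tools/call", "resources/read"},
}


def is_allowed(role: str, info: McpRequestInfo) -> bool:
    """RBAC-style check: unknown roles and unlisted methods are denied."""
    return info.method in ALLOWED_METHODS.get(role, set())
```

With this table, a `tools/call` request would pass for the `operator` role and be denied for `reader`; deny-by-default keeps unknown roles and methods out.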

Note

MCP is closely related to the A2A protocol, which is proposed for agent-to-agent communication. Both protocols use JSON-RPC, streaming semantics, and stateful sessions. While this proposal covers MCP, it does not preclude extending the same functions to the A2A protocol. Some of the functions are agnostic of the underlying protocol; business logic specific to A2A can be implemented in its own extension that shares common implementation, such as the JSON-RPC parser and framing, with MCP.

Acknowledgments

This proposal was framed collaboratively with @htuch, @yanavlasov, and @botengyao. This issue is intended to surface the proposal for public discussion, gather feedback, and coordinate OSS collaboration.

We welcome thoughts, feedback, and ideas from the community as we continue to iterate on this direction.

botengyao avatar Apr 18 '25 17:04 botengyao

Seems this proposal contains lots of features. What will be the initial target, or is there a roadmap? For now, I am interested in 1 and 5. I think 5 should be a traditional HTTP filter which responds to the tools/resources/prompts list requests and then converts the related JSON-RPC request, based on its content, to a traditional HTTP call.

wbpcode avatar Apr 20 '25 08:04 wbpcode

+1. Interested in supporting this effort. Is 2 related to JSON-RPC parsing and routing primitives based on JSON-RPC?

ramaraochavali avatar Apr 20 '25 10:04 ramaraochavali

For now, I am interested in 1 and 5. I think 5 should be a traditional HTTP filter which responds to the tools/resources/prompts list requests and then converts the related JSON-RPC request, based on its content, to a traditional HTTP call.

Thanks @wbpcode. Right, 5 with transcoder support for REST/gRPC backends could be a great use case, and it will build on 2, the MCP protocol parser.

One traditional filter is reasonable if we don't consider JSON-RPC batching, which was added as an MCP RFC ~3 weeks ago. I don't see much usage of JSON-RPC batching as of now, but as an incremental effort, we still want to keep the door open for 1:N and N:1 fan-out support in Envoy. There are several options, though; for example, a terminal filter could achieve this.

There are some other use cases that we need to consider like SSE with the JSON-RPC batch, which adds more complexity.

Re road map

I believe we have the general idea, and want to discuss more usage patterns in Envoy with the community.

@kyessenov mentioned a very useful case in the maintainer channel, which I think is a great case study for how MCP is supported in Envoy. Let's say we want the LLM to manage each Envoy instance’s counter status and configuration dump status, adjust log levels, and read aggregated metrics and logs. The goal is to use these capabilities to help debug issues and gain operational insights. How would we design this service considering the scalability?

We have two fundamental approaches: (1) add native MCP support directly to Envoy's admin APIs, or (2) use an MCP adapter/transcoder to connect to Envoy's existing REST admin APIs. One architecture that leverages Envoy's strengths while providing elegant scalability for large fleets can be the following:

graph TD
    LLM[LLM Client] -- MCP --> LB[Envoy MCP API Gateway]
    LB -- MCP /cluster-a/* --> MCPS1[MCP Envoy with transcoder - cluster-a]
    LB -- MCP /cluster-b/* --> MCPS2[MCP Envoy with transcoder - cluster-b]
    LB -- MCP /global/* --> MCPS3[MCP Envoy with transcoder - global-cluster]
    
    MCPS1 -- REST API --> ENV_A1[Envoy A1 Admin API]
    MCPS1 -- REST API --> ENV_A2[Envoy A2 Admin API]
    
    MCPS2 -- REST API --> ENV_B1[Envoy B1 Admin API]
    MCPS2 -- REST API --> ENV_B2[Envoy B2 Admin API]
    
    MCPS3 -- REST API --> ENV_ALL[Global Envoy Metric Service APIs]
    MCPS3 -- Query/Store --> DB

What I find elegant about this example is that it's "Envoy all the way down" - using Envoy's own capabilities to solve the MCP integration challenge. The front-end gateway routes MCP requests based on JSON-RPC methods, resource URIs, or tool group usage; the second layer contains the specialized transcoders that serve different clusters by mapping between the MCP schema and REST APIs; and we maintain a clean separation between individual and aggregated metrics.

This includes 1, 2, 5, and potentially 8 and SSE stream support.
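
The front-end routing step in the diagram above can be sketched in a few lines of Python. The path prefixes come from the diagram; the upstream names and the `select_upstream` function are hypothetical, and routing on JSON-RPC method or tool group would additionally need the parsed message body rather than just the request path.

```python
from typing import Optional

# Path prefixes taken from the diagram; upstream names are hypothetical.
ROUTES = [
    ("/cluster-a/", "mcp_transcoder_cluster_a"),
    ("/cluster-b/", "mcp_transcoder_cluster_b"),
    ("/global/", "mcp_transcoder_global"),
]


def select_upstream(path: str) -> Optional[str]:
    """Return the second-layer transcoder for a request path, or None."""
    for prefix, upstream in ROUTES:
        if path.startswith(prefix):
            return upstream
    return None  # no matching route; a gateway would likely answer 404
```

For example, a request to `/cluster-a/mcp` would be sent to the cluster-a transcoder, which then talks to the individual Envoy admin APIs.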

botengyao avatar Apr 21 '25 08:04 botengyao

I think Envoy's debug use case would be a good one to solve.

ramaraochavali avatar Apr 21 '25 10:04 ramaraochavali

I think Envoy's debug use case would be a good one to solve.

Yes, this can be a good case study, and it covers both pure MCP server backends and REST backends.

To support a pure MCP backend, there could be some complexity around the initializeRequest and initializeResponse if Envoy needs to act as an MCP client to the server.

We also want to gather feedback for which use cases are the most popular and close to the real world scenario.

botengyao avatar Apr 22 '25 18:04 botengyao

I am interested in this too. I'd welcome a roadmap and would like to contribute, since I have also been working on MCP recently.

duxin40 avatar Apr 24 '25 16:04 duxin40

👀 Interested too!

StarryVae avatar Apr 30 '25 06:04 StarryVae

This is great! Higress has already developed an MCP Server Wasm plugin based on Go 1.24, which is functionally consistent with the basic direction described here. This is the documentation for the plugin: https://higress.cn/en/ai/mcp-server

We have also developed a tool that can automatically convert OpenAPI to the configuration of this Wasm plugin: https://github.com/higress-group/openapi-to-mcpserver

Based on this mechanism, we have built the Higress MCP marketplace: https://mcp.higress.ai/

However, the Wasm ABI used by the plugin and the implementation of proxy-wasm-cpp-host are slightly different from those currently used by the Envoy community, so it cannot be used directly in the official Envoy distribution yet. If the community is interested in this solution, we can do some work next to make this plugin usable in the official Envoy distribution as well.

We have aligned the changes on the proxy-wasm-cpp-host side in this PR: https://github.com/proxy-wasm/proxy-wasm-cpp-host/pull/433

johnlanni avatar May 09 '25 08:05 johnlanni

Great! I have a small question: is it possible for a Wasm Go plugin to start an SSE server if the plugin is used to convert MCP to REST?

StarryVae avatar May 09 '25 08:05 StarryVae

@StarryVae This plugin only implements MCP to REST based on the Streamable HTTP protocol. If you need to be compatible with the first version of the MCP protocol, which is based solely on POST + SSE, you will need to address the state synchronization issue between the two requests (currently, this is achieved in Higress through another filter that connects to Redis). However, I personally believe that the POST + SSE MCP protocol will gradually be replaced by the Streamable HTTP protocol. Here is our analysis and comparison:

https://higress.ai/en/blog/mcp-protocol-why-is-streamable-http-the-best-choice

johnlanni avatar May 09 '25 08:05 johnlanni

Oh, I see. So your plugin only implements MCP to REST based on the POST part of the Streamable HTTP protocol, and a GET request to open an SSE stream is not supported yet? @johnlanni

StarryVae avatar May 09 '25 09:05 StarryVae

@StarryVae Yes, we only implemented the stateless part of the Streamable HTTP protocol, because most MCP to REST scenarios are stateless.

johnlanni avatar May 09 '25 09:05 johnlanni

@johnlanni, thank you for sharing! The official Envoy repository doesn’t maintain the Wasm plugin—it only supports built-in C++ filters. A good way to contribute is through the /contrib directory, and I’d be happy to review any PRs you open.

I think the Streamable HTTP protocol is still in its early stages and comes with several challenges, such as state management, the auth architecture, and multi-phase initialization, and the community is still influencing the protocol.

That said, one of the core components is a JSON-RPC parser. Most MCP servers currently support only single-call JSON-RPC, but there’s a new RFC requiring batch support: https://github.com/modelcontextprotocol/modelcontextprotocol/pull/228

A good starting point would be building a parser that maps each JSON-RPC request 1:1 to an OpenAPI call. At the same time, I’d like to keep the door open for batch support — I'm considering a terminal filter that performs fan-out and reuses the existing router.

botengyao avatar May 13 '25 14:05 botengyao

@botengyao Thank you, we will contribute the Wasm code of the Higress MCP server to /contrib next. We are also very willing to participate in the community development of a built-in MCP to OpenAPI filter.

johnlanni avatar May 14 '25 07:05 johnlanni

@botengyao What use cases do you have in mind for "Transcoding JSON-RPC messages to existing API surfaces, for example gRPC and OpenAPI"? Is it for exposing existing APIs via MCP?

ramaraochavali avatar Jun 05 '25 08:06 ramaraochavali

@ramaraochavali, it will let Envoy behave like a JSON-RPC to OpenAPI converter in front of the actual backend. For example, an MCP JSON-RPC tools/call request for get_bucket with parameters will be converted to a REST call to /get_bucket/your_bucket on the backend. Basically, it is MCP -> Envoy -> REST/gRPC to backends.
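
A minimal Python sketch of that 1:1 conversion is shown below, assuming a static table that maps each tool name to an HTTP method and path template. The `TOOL_ROUTES` table, the `bucket` argument name, and the `transcode_tool_call` function are hypothetical; the thread does not specify how the mapping would be configured.

```python
from typing import Tuple
from urllib.parse import quote

# Hypothetical mapping: MCP tool name -> (HTTP method, path template).
TOOL_ROUTES = {
    "get_bucket": ("GET", "/get_bucket/{bucket}"),
}


def transcode_tool_call(msg: dict) -> Tuple[str, str]:
    """Map one JSON-RPC tools/call message to an (HTTP method, path) pair."""
    if msg.get("method") != "tools/call":
        raise ValueError("only tools/call is transcoded in this sketch")
    params = msg.get("params") or {}
    name = params.get("name")
    if name not in TOOL_ROUTES:
        raise KeyError(f"no REST mapping for tool {name!r}")
    http_method, template = TOOL_ROUTES[name]
    # Percent-encode argument values so they are safe as path segments.
    args = {k: quote(str(v), safe="")
            for k, v in (params.get("arguments") or {}).items()}
    return http_method, template.format(**args)
```

A response transcoder would do the reverse, wrapping the REST response body in a JSON-RPC result that carries the original request id.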

There is the schema/draft/schema.json.

@johnlanni @ramaraochavali do you want to kick off the built-in MCP to OpenAPI filter work? We can start from the 1:1 mapping.

botengyao avatar Jun 06 '25 19:06 botengyao

@botengyao I am interested in participating in this work. I plan to complete the native filter for MCP to REST in August. Additionally, a colleague on our team is developing the native filter for MCP to gRPC, and we plan to contribute it to the community. If the timeline is acceptable, you can assign it to me.

johnlanni avatar Jun 09 '25 01:06 johnlanni

I can help with review of the API and code. I'm happy to see this could become a core part of Envoy. cc @botengyao cc @johnlanni

wbpcode avatar Jun 09 '25 02:06 wbpcode

Sounds good, thanks @johnlanni, please assign these PRs to me and I will review them. Excited this is happening in Envoy! Also thanks @wbpcode for the offer!

I'm happy to see this could become a core part of Envoy.

I really think Envoy needs to support this kind of traffic as MCP/A2A evolve, rather than us creating a new proxy ;-)

botengyao avatar Jun 09 '25 04:06 botengyao

It seems these features could be implemented as a normal HTTP filter extension, e.g. the Envoy Golang filter: https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/golang_filter

AFAIK, some people build their own MCP gateway based on the Envoy Golang filter, and it can be simple. Maybe we can support MCP based on the Envoy Golang filter?

doujiang24 avatar Jun 14 '25 15:06 doujiang24

I have been subscribed and watching this issue because I too am wondering if it would really be accepted to go straight for native support in Envoy right away. My assumption about how things would go: MCP support would be done via dynamic/external extension first. Native support would come at a time later if the protocol proves itself, proliferates and becomes ubiquitous.

To that end, I wonder if the https://github.com/envoyproxy org would be willing to host a repository for an experimental MCP extension? If so, it would help to concentrate the energy of all of us who are interested (otherwise, I imagine we'll have several disparate implementations all over the place) so we can work together in one place. But perhaps the org would prefer not to do that? I would be interested to know what people think 🤔

shaneutt avatar Jun 15 '25 16:06 shaneutt

What do you mean by an MCP extension? A native C++ extension? A module extension based on dynamic modules? A Wasm extension? Or, as @doujiang24 said, a Golang extension?

If it's a native C++ extension, then I think the contrib directory of the Envoy repo would be a good home. If it's a dynamic extension (dynamic modules/Wasm/Golang/etc.), I can discuss this with the maintainers.

wbpcode avatar Jun 16 '25 08:06 wbpcode

Personally, I'm open to suggestions. I'm more concerned with establishing a focused collaboration than with any specific implementation details at the outset, particularly in case the maintainers prefer that this be done as a dynamic extension.

shaneutt avatar Jun 16 '25 11:06 shaneutt

I really don't want to have to adopt another type of proxy. Envoy is in the best position to deliver value for MCP controls

joberdick avatar Jun 17 '25 14:06 joberdick

This is something we are actively working on with Pomerium (which is based on Envoy). There are a few components that make this challenging to merge back upstream, but we would like to if we can. We would also welcome feedback on the discussion topic on the Model Context Protocol spec itself from those interested in shaping it.

Related

  • https://github.com/modelcontextprotocol/modelcontextprotocol/discussions/804
  • https://github.com/pomerium/pomerium/issues/5672

desimone avatar Jun 27 '25 17:06 desimone

There is already a JSON-RPC issue, https://github.com/envoyproxy/envoy/issues/14621 . Would implementing this proposal also satisfy that request?

esnible avatar Jun 30 '25 13:06 esnible

To that end, I wonder if the https://github.com/envoyproxy org would be willing to host a repository for an experimental MCP extension? If so, it would help to concentrate the energy of all of us who are interested (otherwise, I imagine we'll have several disparate implementations all over the place) so we can work together in one place. But perhaps the org would prefer not to do that? I would be interested to know what people think 🤔

What do you mean by an MCP extension? A native C++ extension? A module extension based on dynamic modules? A Wasm extension? Or, as @doujiang24 said, a Golang extension?

If it's a native C++ extension, then I think the contrib directory of the Envoy repo would be a good home. If it's a dynamic extension (dynamic modules/Wasm/Golang/etc.), I can discuss this with the maintainers.

We will make the new repo https://github.com/envoyproxy/modules the central place to host various dynamic extensions from the community. It's empty now, but we will fill it with CI, tools, and initial examples in the next few weeks. Hope it makes sense for all the developers of Envoy. :)

wbpcode avatar Jul 04 '25 12:07 wbpcode

Thanks @wbpcode 🥳

For this new repository, what is the process for adding a new module? Will there be a proposal process to add new modules so that it doesn't become a rush to push code?

shaneutt avatar Jul 07 '25 13:07 shaneutt

We are still building the base and will send a notification once it's ready to accept contributions. :)

I think it will ultimately have much looser rules than the main Envoy repo, to encourage more contributions.

We will eventually document the process in a contributing md file.

wbpcode avatar Jul 07 '25 15:07 wbpcode

Thank you all. I'm back from a short leave, and yes, we can use the new repo to contribute some extensions.

Just a quick update from MCP:

As of version 2025-06-18, MCP no longer allows clients to send JSON-RPC batch requests (i.e., arrays of multiple calls in one go). The protocol explicitly prohibits batching, and servers following this version will reject such requests, which further simplifies the Envoy work.
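
One way a gateway could enforce this is to check whether the top-level JSON value is an array before doing any further processing, as in the Python sketch below. The `reject_if_batch` name is hypothetical, and answering with the JSON-RPC "Invalid Request" error (code -32600) is an assumption of this sketch rather than behavior stated in the thread.

```python
import json
from typing import Optional


def reject_if_batch(body: bytes) -> Optional[dict]:
    """Return a JSON-RPC error object for batch payloads, else None."""
    payload = json.loads(body)
    if isinstance(payload, list):
        # A top-level array is a JSON-RPC batch, which this sketch refuses.
        return {
            "jsonrpc": "2.0",
            "id": None,
            "error": {
                "code": -32600,  # JSON-RPC 2.0 "Invalid Request"
                "message": "Invalid Request: batching is not supported",
            },
        }
    return None  # single message; continue with normal processing
```

With batching out of the picture, each HTTP request body carries at most one JSON-RPC message, so the 1:1 mapping discussed earlier no longer needs a fan-out path.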

botengyao avatar Jul 08 '25 17:07 botengyao