registry icon indicating copy to clipboard operation
registry copied to clipboard

Preventing tool poisoning: save signatures of possible tool calls

Open tadasant opened this issue 6 months ago • 15 comments

One potential benefit of a centralized registry is that we could have server.json submitters list out all the possible tools their server may ever invoke, fingerprint them, and store those fingerpoints for MCP client consumption.

A third party vendor could scan and approve these fingerprints as devoid of security risks, like tool poisoning attacks.

MCP clients could then use the fingerprints to avoid tool poisoning attacks that get surfaced due to hidden dynamic tool calls or supply chain attacks.

tadasant avatar May 27 '25 22:05 tadasant

I had a request on this for VS Code. I didn't dive deep into it but there are a couple scenarios that complicate this.

  1. Servers that provide a dynamic tools
  2. Tools that localized (https://github.com/modelcontextprotocol/modelcontextprotocol/pull/115)
  3. Tools that may change depending on the model (https://github.com/modelcontextprotocol/modelcontextprotocol/issues/469)

Just letting clients do this may make more sense because the client will know at least (2) and (3) or know when they change. But the ecosystem tooling here is still quite immature.

connor4312 avatar May 27 '25 22:05 connor4312

If we focus on the notion of possible tools (i.e. any tool that could ever surface, beyond just at-initialization), those three problems go away, right?

tadasant avatar May 27 '25 22:05 tadasant

Yes. It's a bit of a pain because, given both (2) and (3) happen it's a NxM multiplier to the number of tools, but it's possible.

connor4312 avatar May 27 '25 22:05 connor4312

(though practically only a relatively small number of servers will get localized into a large number of languages)

connor4312 avatar May 27 '25 22:05 connor4312

I think tool representation overall, either in the registry spec or MCP metadata, would be useful. Currently clients need to start servers in order to know what tools, prompts, etc. are available. Because MCP servers can now do nice authentication, this can lead to a bunch of login prompts for the user when they first start interacting with their language model. If servers optionally (there might always be servers with nondeterministic tools) published their starting set of tools/prompts, then we could avoid this.

connor4312 avatar May 30 '25 18:05 connor4312

I think tool representation overall, either in the registry spec or MCP metadata, would be useful. Currently clients need to start servers in order to know what tools, prompts, etc. are available.

Agreed, in particular for tool representation in MCP metadata. Unrelated to tool poisoning, but it might allow clients to statically index tools for client-side tool search.

jonathanhefner avatar May 30 '25 20:05 jonathanhefner

Hi everyone, I'm glad that @tadasant raised this.

We are facing a challenge that would also be solved by having easy access to the tools provided by each MCP server. In our case, we want to avoid forcing a user to authenticate to a service before we can even confirm it has the tools to solve their problem (via tools/list).

Could the server schema within the registry API be augmented to include tools? This would allow a server operator to choose to list their tool schemas publicly as part of the discovery process.

There are several benefits to this, from improved user experience for cases like ours, to more accurate MCP server selection in case there is more than one for a particular service, not to mention improved transparency within the ecosystem.

A nuanced “possible tools” angle makes sense considering dynamic cases as @connor4312 pointed out. However, while we want to support all cases, we do want to optimize for the common case. And the majority of MCP servers does have a static list of tools.

Happy to contribute further details on our use case or to take part in the discussion on the details.

goncalossilva avatar Jul 25 '25 21:07 goncalossilva

Thanks @goncalossilva ! I think it'd be reasonable for someone to take on adding a field for statically defined tools.

From a guidance perspective, we could say SHOULD enumerate all possible tools.

Steps to contribute:

  1. Update https://github.com/modelcontextprotocol/registry/blob/4f5e85b51073bc878949850a44bd66d42e2d59fa/docs/server-json/schema.json
  2. Update https://github.com/modelcontextprotocol/registry/blob/main/docs/server-json/registry-schema.json
  3. Update https://github.com/modelcontextprotocol/registry/blob/main/docs/server-registry-api/openapi.yaml
  4. Include examples in the adjacent files, update any code this impacts

For the first update, would expect to see a thoughtful canvassing of any other precedent out there for this feature. For example, Anthropic's DXT format is probably good precedent for this / maybe we should align with their shape for tools. And the team there would probably have an opinion on this as well cc @felixrieseberg

tadasant avatar Jul 26 '25 14:07 tadasant

Would we rely on the author of the MCP server to populate the list of tools? If so should there be a way to prove/validate these upon publishing? What I mean is this raises a concern that bad actors can populate this list intentionally in a wrongful way which may cause malicious results further down. I'm supportive of having the tool's list in advance, just raising that it would be nice if there's a way to guarantee its correctness upon publishing and/or consuming.

rdimitrov avatar Jul 29 '25 09:07 rdimitrov

I don't think we can pursue running servers and introspecting them - there is too much of a rabbit hole there. What do you do for remote servers? What do you do for auth gates? What do you do for dynamic lists that won't be present on initial startup? Not to mention the quagmire that is trying to get them to run at all in a unified way across all the possible package registries.

We should think through what exposure there might be and give guidance on "how consumers should use this data" accordingly. At the end of the day, I only expect consumers to use this data as a kind of search/filtering/documentation mechanism; it shouldn't be "run" in any way, so not even prompt injection should be a concern here (perhaps we should give guidance to not pull this data into inference for any reason).

@goncalossilva can you share more about the nature of your intended use case to help guide this level of trust factor?

tadasant avatar Jul 29 '25 15:07 tadasant

Steps to contribute

Thanks for these! I'm happy to push these forward soon.

can you share more about the nature of your intended use case to help guide this level of trust factor?

We're exploring using LLMs to convert prompts into workflows with minimal manual work from the user. The upcoming MCP registry is a great start to help us discover available MCP servers we might want to use, but that's generally not enough. We also need to know which capabilities are available in each MCP server.

We could do this back and forth of first identifying relevant MCP servers, then handling authentication, and only then listing tools and preparing the workflow. And for now, we plan to do exactly that. But we're concerned about the use case where it's only after all of that, including effort from the user themself, that we realize we don't have the right tools to carry out the job. Determining that before the user puts in any effort would be much better UX.

Does this help?


For what it's worth, I agree that having to prove the list of tools is complex and I wouldn't consider it for a first version. Trust is paramount, but this documentation would be authored by the MCP server authors themselves. Potentially even automated via the SDK, reducing the risk of even unintentional errors.

goncalossilva avatar Jul 30 '25 00:07 goncalossilva

Does this help?

That makes sense! And aligned with what I was expecting / I stand by my earlier thoughts.

Potentially even automated via the SDK, reducing the risk of even unintentional errors.

Curious what exactly you mean by this?


Thank you for jumping in on this!

tadasant avatar Jul 30 '25 14:07 tadasant

Potentially even automated via the SDK, reducing the risk of even unintentional errors.

Curious what exactly you mean by this?

This is something I still have to explore, but seeing how users of MCP's SDKs annotate their tools (e.g., @mcp.prompt(title="Do something") in Python) it may be possible to automate creating the “manifest” of tools for the registry, so there is less manual work involved, and consequently, it becomes less error-prone.

Thank you for jumping in on this!

My pleasure! Let's see where this goes. I'm off next week but planning to take some of the steps above after I'm back.

goncalossilva avatar Jul 31 '25 13:07 goncalossilva

cc @joan-anthropic thought you might find this of interest as something DXT would want to see land

tadasant avatar Aug 18 '25 13:08 tadasant

This is an API I'm working on to have a VS Code extension version of the server manifest with static Tools declarations, to serve as possible inspiration: https://github.com/microsoft/vscode/issues/272000

connor4312 avatar Oct 18 '25 17:10 connor4312