Support an array of commands (multiple commands) to start an MCP server instead of just one
Is your feature request related to a problem? Please describe.
For some ecosystems, multiple commands are needed to install/update and then run an MCP server. For example, before the upcoming .NET 10 release, it is not possible to run a .NET tool in a single-shot manner.
We are working on dnx (analogous to npx), but the single-shot requirement essentially forced us to add a dnx experience. This is probably a net good, but I am thinking about other ecosystems or complex MCP setups that would need multiple commands to get running.
Describe the solution you'd like
The current schema is pretty specific, not allowing the main command to be set explicitly, except via the runtime_hint (AFAIK).
What this means is private MCP registries adhering to the same protocol don't have a spot to put their language-specific runner.
Current: https://github.com/modelcontextprotocol/registry/blob/6b22cf09b376ed94c772c69951a2d125e495b26b/docs/openapi.yaml#L182-L200
The OpenAPI YAML could express an array of commands instead of just one. A new optional command_name property could be added (instead of the command being implied by the registry type).
For .NET (before dnx exists), the multiple commands to launch a .NET MCP server would be:
```
dotnet tool install --global ErikEJ.DacFX.TSQLAnalyzer.Cli --version 1.0.25
tsqlanalyze -mcp
```
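To make that concrete, here is a rough sketch of what an array-of-commands package entry could look like; the `commands` and `command_name` properties (and the `nuget` registry name) are hypothetical and not part of the current schema:

```typescript
// Hypothetical sketch only: "commands", "command_name", and the "nuget"
// registry name are illustrative, not part of the current OpenAPI spec.
interface CommandEntry {
  command_name: string; // the executable to run, stated explicitly rather than implied by registry type
  args: string[];
}

interface MultiCommandPackage {
  registry_name: string;
  name: string;
  version: string;
  commands: CommandEntry[]; // executed sequentially; the last one starts the MCP server
}

// The .NET example above, expressed as data the client can display and run.
const example: MultiCommandPackage = {
  registry_name: "nuget",
  name: "ErikEJ.DacFX.TSQLAnalyzer.Cli",
  version: "1.0.25",
  commands: [
    {
      command_name: "dotnet",
      args: ["tool", "install", "--global", "ErikEJ.DacFX.TSQLAnalyzer.Cli", "--version", "1.0.25"],
    },
    { command_name: "tsqlanalyze", args: ["-mcp"] },
  ],
};
```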
This is not to say the main, public MCP registry will not validate or restrict what commands could be run, but it does allow more flexibility for newer language ecosystems or private deployments of an MCP registry.
Multiple commands unlock these kinds of scenarios, generally:
- Interactive auth for the first command, starting the MCP server as the second
- Install, then run (like pre .NET 10).
- Configure a private package source first (npm config set registry), then npx
Said another way, this removes the single-shot requirement for a language/runtime ecosystem to onboard to MCP server hosting.
Describe alternatives you've considered
- Run the setup steps out of band (manually). This is a burden on the end user and makes a single-click "install" experience impossible.
- Supporting shell scripting as the entry point (which can internally do multiple commands)
  - This feels too flexible and hidden -- where to host the shell script?
  - It is likely OS specific
Additional context
I have opened a similar issue on VS Code (https://github.com/microsoft/vscode/issues/249370), but I now understand this is a broader discussion than just VS Code. Thanks @connor4312 for chatting with me a bit already.
We actually explicitly removed the "command_name" in an earlier revision to the spec. See the discussion here: https://github.com/modelcontextprotocol/registry/pull/3#discussion_r2074192996 It's up to the client to use the correct runner logic (e.g. uvx or npx) for any given package. For dotnet support, that logic would just involve them also making sure the install step is run at some point before executing the package.
That said I do think there is need for a shell type fallback so that new package manager types added to the registry can work with all clients without requiring them to explicitly add support for it, and I think you raise a good point that allowing multiple commands in that fallback would be useful.
Also @tadasant I think we should document the algorithm to get from a registry entry to a command line, since it's not immediately obvious from the (kind of large) API shape and we've had multiple issues around it 😛 I'll make a PR for that and also try and figure out what a 'shell' type might look like.
The shell type makes sense. If you allow an array of commands, potentially each of shell type, you have a clearly visible shell script, in effect. IMO it would not be great to support running a bash/Windows batch/PowerShell/etc. script since there's the hosting problem (mentioned in your linked thread) and the commands are kind of hidden away, making an opportunity for an MCP registry-specific artifact to hide nasty stuff. The nasty stuff angle is of course ignoring the nasty stuff in the package itself, but it's probably good to limit the places scary things can be.
You could imagine a VS Code install experience where you install from the MCP registry with an array of shell commands and you can view them right there, before running them (much like commands suggested by agent mode).
I think we can reuse the runtime_hint in the shell mode to hint that this should be bash vs powershell vs... well, probably only those two realistically, since we can be sure one of them is available on most systems. Commands that are unadorned, unescaped, and independent of any particular shell can just omit the hint.
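As a rough sketch of that idea (the field names here are illustrative, nothing is settled), an array of shell-type entries reusing runtime_hint might look like:

```typescript
// Illustrative only: a hypothetical "shell" fallback where each entry is one
// plain command line, optionally hinted to a specific shell via runtime_hint.
interface ShellCommandHint {
  runtime_hint?: "bash" | "powershell"; // omit for shell-independent, unescaped commands
  command: string; // a single visible command line the client can show before running
}

// The install-then-run case from above as visible, sequential commands.
const shellCommands: ShellCommandHint[] = [
  { command: "dotnet tool install --global ErikEJ.DacFX.TSQLAnalyzer.Cli --version 1.0.25" },
  { command: "tsqlanalyze -mcp" },
];
```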
Away at a conference today, but just quickly before you start investing in designing for shell, @connor4312 -- I do have strong concerns about this, which @joelverhagen is raising too:
IMO it would not be great to support running a bash/Windows batch/PowerShell/etc. script since there's the hosting problem (mentioned in your linked thread) and the commands are kind of hidden away, making an opportunity for an MCP registry-specific artifact to hide nasty stuff. The nasty stuff angle is of course ignoring the nasty stuff in the package itself, but it's probably good to limit the places scary things can be.
Part of the whole point of delegating out to other registries in this architecture at all is to avoid being responsible for hosting and managing raw code and all the concerns that come with that. Introducing shell basically pokes a major hole in that benefit, and would start to beg the question why bother with all the package options at all (why not just a single option, shell?).
I want to think a little more on alternative solutions here. Definitely hear the original pitch that one-shot commands feel limiting.
Also @tadasant I think we should document the algorithm to get from a registry entry to a command line, since it's not immediately obvious from the (kind of large) API shape and we've had multiple issues around it 😛 I'll make a PR for that and also try and figure out what a 'shell' type might look like.
Agree we need some sort of FAQ or WIP design overview to help communicate the current thinking better. Happy to collab on this.
If for whatever reason multiple commands are needed to get an MCP server running, the current schema pushes the complexity of that into the client. Suppose there is a nuget type. We could use the existing model for expressing some part of the arguments needed, or a concatenation, and VS Code (or some other arbitrary consumer) could make sense of it (special-cased on the nuget type) and produce the needed commands.
I don't think that is any better because then the actual commands to run are invisible, hidden in the client implementation and perhaps duplicated across multiple implementations.
I am not sure what alternatives there are besides a) supporting multiple explicit commands or "command types" for a single MCP server or b) special-casing on package type to expand the needed list of commands at runtime. The option c) of allowing a .ps1 or .sh script to be shipped seems to have more drawbacks than a) or b).
I agree with @tadasant's comment from https://github.com/modelcontextprotocol/registry/pull/3#discussion_r2074533001 as someone building a Client, fwiw. Running any shell commands on our users' machines is always going to carry some amount of risk, but to me, the risk of running arbitrary bash/powershell/etc. code is far scarier.
Package registries aren't a silver bullet of security by any means (see the various npm supply chain attacks), but they are a line of defense that Clients would inherit by way of this spec only supporting well-known registries and tools built specifically for those registries.
The risk of what's in a server is there no matter what, but the risk of systems being exploited "directly" only exists if the spec allows arbitrary code.
And without the ability to run arbitrary code, I'm not sure multiple commands makes sense?
And without the ability to run arbitrary code, I'm not sure multiple commands makes sense?
For .NET at least, it makes sense to have multiple commands. To run a .NET CLI tool, you need a two-step thing:
First, run dotnet tool install --global <package ID> --version <package version>
Second, run <command name>
(the first command puts a new executable in an existing PATH dir, the second runs that newly installed executable from the PATH)
(this will be fixed when we ship dnx but I'm speaking about currently shipped capabilities)
I think this feature could generalize to side-car processes, auth setup, a feed-configuration step, and ecosystems without a single-shot tool -- all invoked via CLI commands.
Because we don't have this today, some teams have bundled .NET tools into npm packages, which, in the end, I think is no better for the broader ecosystem (compiled C# in an npm package is, I think, harder to audit and review than compiled C# in a .NET package, since registries can rationalize and validate their own code types better).
IMO the scary part about a shell script is that it's hidden away from view and can have dynamic logic (if/loop, env introspection -- it's basically arbitrary code). An array of commands to execute sequentially could be validated at MCP registry upload time (for example only allowing specific executable names) and displayed to the user before they are executed.
The placeholders/input variables are well defined in the OpenAPI spec, as opposed to a shell script being able to read any env vars.
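For example (purely a sketch of the idea, not proposed spec behavior, with invented names), an upload-time check could reject anything that isn't an allow-listed executable with declared placeholders:

```typescript
// Purely a sketch: an upload-time validation pass that only accepts
// allow-listed executables and declared input placeholders -- no shell
// syntax, no arbitrary environment access. All names here are invented.
const ALLOWED_EXECUTABLES = new Set(["dotnet", "npx", "uvx", "docker"]); // example allow-list

function validateCommands(commands: string[][], declaredInputs: Set<string>): string[] {
  const errors: string[] = [];
  for (const argv of commands) {
    const [exe, ...args] = argv;
    if (!ALLOWED_EXECUTABLES.has(exe)) {
      errors.push(`executable not allowed: ${exe}`);
    }
    for (const arg of args) {
      // Placeholders such as {my_input} must be declared in the package metadata.
      for (const match of arg.matchAll(/\{([a-z_]+)\}/g)) {
        if (!declaredInputs.has(match[1])) {
          errors.push(`undeclared placeholder: ${match[1]}`);
        }
      }
    }
  }
  return errors;
}
```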
I want to think a little more on alternative solutions here.
The main 'risk' in not having a shell is that client support may vary widely. But if we provide some de-facto standard utility library to handle these things that most clients can use, maybe that would alleviate the concern? It could then have the logic smarts for dotnet to do the two-step install and run logic. And when there's a new package type added to the registry's enum, the library gets updated so clients simply need to pull in changes to support new managers.
I'm dubious about whether this is functionality that should be in the SDKs since it's technically separate from the whole actual implementation of clients/servers. But it could be.
I would be willing to help resource such a thing, since I'm going to need to implement all that for vscode anyway 😛
I am not sure what alternatives there are besides a) supporting multiple explicit commands or "command types" for a single MCP server or b) special-casing on package type to expand the needed list of commands at runtime. The option c) of allowing a .ps1 or .sh script to be shipped seems to have more drawbacks than a) or b).
I think this is a good summary of our options. I'm leaning either (a) or (b).
And without the ability to run arbitrary code, I'm not sure multiple commands makes sense?
The .NET example is fairly reasonable IMO:
```
dotnet tool install --global ErikEJ.DacFX.TSQLAnalyzer.Cli --version 1.0.25
tsqlanalyze -mcp
```
It's nice that we have things like npx and uvx, but even those haven't been around forever, so I think it's reasonable to think through how to accommodate this case.
I'm dubious about whether this is functionality that should be in the SDKs since it's technically separate from the whole actual implementation of clients/servers. But it could be.
If I'm following correctly, this is aligned with what @joelverhagen called (b) above. While I am a little nervous about pushing this down to SDK implementations - we're introducing a piece of work that multiplies across every SDK x every ecosystem that needs special treatment - I do think a "registry SDK" concept (whether or not it is baked into the official server/client SDK) is inevitable.
And going the path of (a), I think we end up with quite a loaded shape. I tried to flesh it out a bit, and I think the simplest iteration would look something like:
```yaml
Package:
  type: object
  ...
  properties:
    ...
    command_hints:
      type: array
      description: Command line invocations needed to install and run this package.
      items:
        $ref: '#/components/schemas/CommandHint'
...
CommandHint:
  type: object
  properties:
    runtime_hint:
      type: string
      description: A hint to help clients determine the appropriate runtime for the package. This field should be provided when `runtime_arguments` are present.
      examples: [npx, uvx]
    runtime_arguments:
      type: array
      description: A list of arguments to be passed to the package's runtime command (such as docker or npx). The `runtime_hint` field should be provided when `runtime_arguments` are present.
      items:
        $ref: '#/components/schemas/Argument'
    pass_in_package_name:
      type: boolean
    package_arguments:
      type: array
      description: A list of arguments to be passed to the package's binary.
      items:
        $ref: '#/components/schemas/Argument'
```
And now we're just getting into very bespoke notions of runtime, commands, etc -- and all just so we can codify something like dotnet tool install --global ErikEJ.DacFX.TSQLAnalyzer.Cli --version 1.0.25, which I assume is basically consistent (dotnet tool install --global {{name}} --version {{version}}) across all .NET packages.
All to say, it seems much simpler to just kick this complexity to the client (in the spirit of MCP's focus on simplifying servers vs. pushing complexity to clients) and plan for an associated SDK. I don't think we necessarily need to move SDK work in-scope for an MVP launch on account of this need, as the happy path of using uvx, npx, or dnx should still be fairly straightforward for clients to implement without an SDK initially.
So all to say, given the two-step sequence:

```
dotnet tool install --global ErikEJ.DacFX.TSQLAnalyzer.Cli --version 1.0.25
tsqlanalyze -mcp
```

I would model it like so:
```json
{
  "name": "io.github.joelverhagen/TSQLAnalyzer",
  "description": "MCP server for TSQL",
  "version_detail": {
    "version": "0.0.1",
    "release_date": "2023-06-15T10:30:00Z",
    "is_latest": true
  },
  "packages": [
    {
      "registry_name": "dotnet",
      "name": "ErikEJ.DacFX.TSQLAnalyzer.Cli",
      "version": "1.0.25",
      "runtime_hint": "tsqlanalyze",
      "runtime_arguments": [
        {
          "type": "named",
          "description": "Start MCP server",
          "name": "-mcp",
          "is_required": true,
          "is_repeated": false
        }
      ]
    }
  ]
}
```
And it's on the client (and eventually SDK) to massage that into the two-step command.
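As a rough illustration (client-side logic, not anything prescribed by the spec; the function and type names are invented), that massaging for a dotnet-type package might look something like:

```typescript
// Illustrative client-side expansion of a "dotnet" registry entry into the
// two-step command sequence. Field names follow the JSON example above; the
// expansion logic itself is hypothetical.
interface RegistryArgument {
  type: string;
  name?: string;
  value?: string;
}

interface RegistryPackage {
  registry_name: string;
  name: string;
  version: string;
  runtime_hint?: string;
  runtime_arguments?: RegistryArgument[];
}

function expandDotnetPackage(pkg: RegistryPackage): string[][] {
  // Step 1: install the tool (the part the client special-cases for the dotnet type).
  const install = ["dotnet", "tool", "install", "--global", pkg.name, "--version", pkg.version];
  // Step 2: run the executable named by runtime_hint, passing its declared arguments.
  const run = [
    pkg.runtime_hint ?? pkg.name,
    ...(pkg.runtime_arguments ?? []).map((a) => a.name ?? a.value ?? ""),
  ];
  return [install, run];
}

// For the example entry above, this yields:
// [["dotnet", "tool", "install", "--global", "ErikEJ.DacFX.TSQLAnalyzer.Cli", "--version", "1.0.25"],
//  ["tsqlanalyze", "-mcp"]]
```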
Looking back at the other arguments for multi-step commands:
- Interactive auth for the first command, starting the MCP server as the second
- Install, then run (like pre .NET 10).
- configure a private package source first (npm config set registry), then npx
They are interesting in theory, but I think we should stay focused on the ways we're seeing people use these actively in practice. For example, there is a lot of auth-related work going on, and encouraging mechanisms like this would just expose workarounds to that parallel track; configuring a package source feels like a workaround to the registry API's functionality itself. So I'm not totally convinced there are strong use cases besides "install, then run" in play right now.
The only thing this makes me nervous about is it means we need to keep runtime_hint a free-form string. But I think that's OK, we'll just need to sanitize it thoughtfully to keep that from being an attack vector.
What do you guys think @connor4312 @joelverhagen @mnoble ? I'm fairly strongly against the (1) shell approach, slightly opposed to the (2) complex multiple-command shape, and leaning towards this (3) client/SDK approach. But curious to hear if you can think of tweaks we can make to the shape to make the (3) SDK approach simpler and/or if you have a more elegant way to execute on (2) than my try above.
But if no objections, by default I think we plan for the (3) approach as described here.
I think your thoughts are very reasonable for initial release. Private deployments perhaps can overload runtime_hint with shell or whatever and stuff what they need in runtime_arguments. I think it's still an interesting idea for the other multi-step command cases I shared but they are theoretical.
Feel free to close until we have a more pressing need, or maybe remove the tag and collect upvotes and other more concrete scenarios?
FWIW, NuGet will be unblocked very soon. Single-shot tool execution was merged today and will be available in .NET SDK Preview 6.
I had pretty much the same line of reasoning. I agree with you.
The only thing this makes me nervous about is it means we need to keep runtime_hint a free-form string. But I think that's OK, we'll just need to sanitize it thoughtfully to keep that from being an attack vector.
So the risk with the hint is clients just putting it into a command line without looking at it:
- There is the desire to avoid having moderation on the registry, so we may want to sanitize it.
- If we're to the step where the command is getting executed, it's about to go out and run some package provided by that developer anyway
but pursuant to (2), it does bypass any protections the client might have by e.g. using a trusted registry mirror. My initial thought was that we can ban slashes and stuff from the runtime hint to avoid tricky business, but there are valid tools that could be used in a malicious way. E.g. I could have a registry package that generates some command like scp ~/.ssh/id_rsa [email protected] (would probably need to be a little creative to get the package name interpolation right, but the point stands.)
I think in the end it just comes down to having good, clear consent flows in the client which show both the package and the command to be executed.
@connor4312 is VS Code going to show the npx/uvx/dnx command before executing it? Or I guess it's viewable in the client mcp.json file after installation?
It seems like if our data model is limited in what commands can be expressed (like we have in the repo today), then reviewing the command is less necessary. But I wonder what the VS Code decision on this matter is. [edit: I ask because I am guessing VS Code will be influential in MCP registry consumption norms due to how popular/visible it is, and will likely influence other users of this registry]
is VS Code going to show the npx/uvx/dnx command before executing it?
Yes, we'll do that. Especially for Docker commands, which will generally be isolated but still access system resources, I'll want to show the user the command so they don't assume it's sandboxed when it might actually access more than they expect.