agent-protocol
Add an Endpoint to get Agent Info
Is your feature request related to a problem? Please describe. I've run into a case where I need to display different agent details as needed based on the agent.
Things such as:
- Name
- Version
Describe the solution you'd like
I would like a GET endpoint added to the protocol to request information about the agent.
Describe alternatives you've considered Requiring this information to be queryable outside the agent protocol is possible, but roundabout.
Another thing we discussed was Author, Help Text, and a Continue Message: yes or continue or something like that.
I believe we've settled on the following format for a route existing at /ap/v1/agent/info:
{
"name": "My Agent",
"description": "General purpose agent.",
"version": "1.0.0",
"protocol": "1.0",
"github": "https://github.com/myagent/myagent",
"url": "https://myagent.com",
"docs": "https://myagent.com/docs",
"issues": "https://github.com/myagent/myagent/issues"
}
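As a rough illustration of how a client might consume this route, here is a minimal Python sketch using only the standard library. The helper names (agent_info_url, parse_agent_info, fetch_agent_info) and the 5-second timeout are assumptions made for the example; only the route itself comes from this thread.

```python
import json
import urllib.request

INFO_PATH = "/ap/v1/agent/info"  # route named in this thread


def agent_info_url(base_url: str) -> str:
    """Join an agent's base URL with the info route (hypothetical helper)."""
    return base_url.rstrip("/") + INFO_PATH


def parse_agent_info(body: str) -> dict:
    """Decode the JSON body and check that it is a JSON object."""
    info = json.loads(body)
    if not isinstance(info, dict):
        raise ValueError("agent info must be a JSON object")
    return info


def fetch_agent_info(base_url: str, timeout: float = 5.0) -> dict:
    """GET the info route and return the decoded object."""
    with urllib.request.urlopen(agent_info_url(base_url), timeout=timeout) as resp:
        return parse_agent_info(resp.read().decode("utf-8"))
```

A client could then call fetch_agent_info("https://myagent.com") and read fields such as name or version from the returned dictionary.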
In addition to this schema, we discussed having a config_options array of objects with the properties type, default, description, and options (optional), where type refers to the type of input and options is an array of values of that type showing what is available. This config_options array would serve as a reference that clients can check the config options of their tasks and steps against.
We should also have the "schema_plugin" array that is the plugin system
Here is the AgentInfo object I had been thinking of using for the spec:
AgentInfo:
  type: object
  properties:
    name:
      description: Name of the agent.
      type: string
      example: My Agent
    description:
      description: Description of the agent.
      type: string
      example: My agent is the best agent.
    version:
      description: Version of the agent.
      type: string
      example: 1.0.0
    protocol_version:
      description: Version of the agent protocol.
      type: string
      example: 1
    github:
      description: GitHub repository of the agent.
      type: string
      example: 'https://github.com/AI-Engineers-Foundation/agent-protocol'
    url:
      description: URL of the agent.
      type: string
      example: 'https://my-agent.com'
    docs:
      description: Link to the documentation of the agent.
      type: string
      example: 'https://my-agent.com/docs'
    issues:
      description: Link to the issues of the agent.
      type: string
      example: 'https://github.com/AI-Engineers-Foundation/agent-protocol/issues'
    config_options:
      description: List of configuration options for the agent's tasks and steps. The config is a user-defined set of key/value pairs where the values are standard but the keys are not.
      type: object
      example: |-
        {
          "debug": {
            "type": "boolean",
            "default": false,
            "description": "Whether to run the agent in debug mode."
          },
          "model": {
            "type": "string",
            "default": "gpt-4",
            "description": "The model in which the agent's tasks should run."
          }
        }
      additionalProperties:
        type: object
        properties:
          type:
            description: The type of the value.
            type: string
            enum:
              - string
              - integer
              - float
              - boolean
              - list
              - dict
          default:
            description: The default value of the config option.
            type: string
          description:
            description: 'A description of the value with type, default value, and description.'
            type: string
          options:
            description: A list of options for the config option.
            type: array
            items:
              oneOf:
                - type: string
                - type: integer
                - type: number
                - type: boolean
                - type: object
                - type: array
                  items:
                    oneOf:
                      - type: string
                      - type: integer
                      - type: number
                      - type: boolean
                      - type: object
        required:
          - type
          - default
          - description
        example: |-
          {
            "type": "string",
            "default": "gpt-4",
            "description": "Model for the agent's steps to use.",
            "options": ["gpt-4", "gpt-3.5-turbo", "gpt-3.5-turbo-16k"]
          }
        description: 'A description of the value with type, default value, and description.'
  required:
    - name
    - version
    - protocol_version
    - config_options
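To make the config_options rules in this schema concrete, here is a small Python sketch of a checker. It is only an illustration under assumptions: the function name validate_config_options and the wording of the error messages are invented for the example, and it checks only the required keys, the type enum, and that options is an array.

```python
# Values allowed for "type", taken from the enum in the schema above.
ALLOWED_TYPES = ("string", "integer", "float", "boolean", "list", "dict")
REQUIRED_KEYS = ("type", "default", "description")


def validate_config_options(config_options: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for key, option in config_options.items():
        if not isinstance(option, dict):
            problems.append(f"{key}: must be an object")
            continue
        # Every option must carry type, default, and description.
        for required in REQUIRED_KEYS:
            if required not in option:
                problems.append(f"{key}: missing '{required}'")
        if "type" in option and option["type"] not in ALLOWED_TYPES:
            problems.append(f"{key}: unknown type {option['type']!r}")
        # "options" is optional, but when present it must be an array.
        if "options" in option and not isinstance(option["options"], list):
            problems.append(f"{key}: 'options' must be an array")
    return problems
```

Returning a list of problems rather than raising on the first one lets a client report everything wrong with an agent's config_options at once.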
I like this proposal and agree with the need to standardize how some subset of Agent metadata is provided. However, I could see a situation where some of the proposed metadata (i.e. agent name, github, issues, url, docs) is considered proprietary or sensitive and gated off to authenticated users. In that case, the Agent implementer would have to either 1) not conform to the spec by providing less than the full Agent info, or 2) make the full metadata unavailable to anyone who has not authenticated first.
I think there is a happy middle ground where things like version and protocol_version might be supplied (leaving the Agent implementers to associate the rest of the metadata with the specific version released/deployed). I think something like config_options is especially sensitive if this endpoint is always open, and I am not sure I understand the need for a duplicate url property when this endpoint is being served by a deployed Agent.
I also have a concern about the default assumption that github will always be the version control platform/repository of an Agent that conforms to the Agent Protocol, or that issues will be publicly available. I think it's quite likely there will be paid vendor Agents hosted that should ideally conform to the Agent Protocol spec to prevent vendor lock-in but who will not want to conform if the spec is too prescriptive about implementation details that may not make sense for them.
Perhaps it would be worth discussing limiting the protocol's top-level info to the items below?
- agent version (required)
- protocol version (required)
- name (optional)
- description (optional)
- additional properties (optional)
I derived those thinking about a universally applicable protocol relevant to all of the below, and believe that a protocol that is follow-able will be one that supports all of these:
- open source hosted Agents
- paid/vendor-hosted Agents
- unauthenticated requesters of Agents
- authenticated requesters of Agents
- kubernetes/distributed systems-level requesters of Agents
- github/gitlab/bitbucket/etc repository-using builders of Agents
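The reduced set of fields suggested above could be modeled roughly as follows. This is a sketch, not a proposed spec: the class name MinimalAgentInfo is hypothetical, and flattening the additional properties into the top level of the response (rather than nesting them) is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class MinimalAgentInfo:
    # Required in the reduced proposal.
    version: str
    protocol_version: str
    # Optional: a private or vendor-hosted agent may leave these unset.
    name: Optional[str] = None
    description: Optional[str] = None
    # Anything else (docs, issues, repository links, ...) is up to the implementer.
    additional_properties: Dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> Dict[str, Any]:
        """Build the response body, omitting optional fields that are unset."""
        # Additional properties go in first so they cannot override named fields.
        payload = dict(self.additional_properties)
        payload["version"] = self.version
        payload["protocol_version"] = self.protocol_version
        if self.name is not None:
            payload["name"] = self.name
        if self.description is not None:
            payload["description"] = self.description
        return payload
```

With this shape, a gated vendor agent can return just the two version fields, while an open source agent can add name, description, and any links it wants to publish.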
The proposed endpoint /ap/v1/agent/info with the following body makes sense:
{
"name": "My Agent",
"description": "General purpose agent.",
"version": "1.0.0",
"protocol": "1.0",
"git": "https://myselfhostedgit.com/myagent/myagent",
"url": "https://myagent.com",
"docs": "https://myagent.com/docs",
"issues": "https://github.com/myagent/myagent/issues"
}
+1 on not assuming that github is the only place where the code can exist.
There is also a proposal to introduce the plugin system under the info endpoint. However, I would argue that the protocol does not need to be aware of plugins. That way the protocol does not depend on any plugin; only plugins depend on the protocol.
In any case, no plugin scenarios have been brought up in the plugin issue thread except for auth.
Auth
I would propose to support auth in the core agent protocol.
Let’s consider some common authorization methods:
- None - the client does not pass anything in the Authorization header.
- Basic - the client sends HTTP requests with an Authorization header that contains the word Basic followed by a space and a base64-encoded string username:password.
- Bearer Token (or Token Authentication) - the client sends an HTTP request with an Authorization header containing the word Bearer followed by a space and a token.
- JWT - the client sends an HTTP request with an Authorization header containing the word Bearer followed by a space and a JWT.
- OAuth 2.0 - the client sends an HTTP request with an Authorization header containing the word Bearer followed by a space and an OAuth token.
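The header formats listed above can be sketched in Python as follows. The function name build_auth_headers and its keyword parameters are hypothetical, and the method names ("basic", "bearer_token", "jwt", "oauth2") are assumed to match the option values proposed elsewhere in this thread.

```python
import base64
from typing import Dict, Optional

# JWT and OAuth 2.0 tokens travel the same way as a plain bearer token.
BEARER_METHODS = ("bearer_token", "jwt", "oauth2")


def build_auth_headers(
    method: Optional[str],
    *,
    username: str = "",
    password: str = "",
    token: str = "",
) -> Dict[str, str]:
    """Return the headers a client would add for the given auth method."""
    if method is None:
        # No auth: the Authorization header is simply left out.
        return {}
    if method == "basic":
        raw = f"{username}:{password}".encode("utf-8")
        return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
    if method in BEARER_METHODS:
        if not token:
            raise ValueError("a token is required for bearer-style auth")
        return {"Authorization": "Bearer " + token}
    raise ValueError(f"unsupported auth method: {method!r}")
```

Note that three of the five methods produce the same header shape, which is part of why a single optional Authorization header covers them all.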
In practice there are other ways to send the authorization token. The two alternatives to a header that I see are a query parameter (?auth_token=mock_token) and the payload of a POST request. Because the Agent Protocol already has both GET and POST requests, either alternative would introduce unnecessary complexity, so I conclude that the most logical path forward is to send the authorization in the header.
By including the Authorization header in the Agent Protocol we could support all (?) of the common authorization methods mentioned above. In addition, since this is an optional field, it can already be added on the agent implementation side without breaking anything.
What is still needed is a way for the agent to say that authorization is required. I would propose using config_options for this:
{
"config_options": {
"auth": {
"type": "string",
"default": null,
"description": "The authentication method to use.",
"options": [
null,
"basic",
"bearer_token",
"oauth2",
"jwt"
]
}
}
}
For example if the agent is using JWT then the /info endpoint would return:
{
"version": "1.0.0",
"protocol": "1.0",
"config_options": {
"auth": "jwt"
}
}
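On the client side, reading the advertised method from such a response might look like the sketch below. It assumes the short form shown in the example just above (config_options.auth holding a single string); the function name auth_method_from_info is hypothetical.

```python
from typing import Optional

# Non-null option values from the proposed "auth" config option.
KNOWN_AUTH_METHODS = ("basic", "bearer_token", "oauth2", "jwt")


def auth_method_from_info(info: dict) -> Optional[str]:
    """Return the auth method an agent advertises, or None if it asks for none."""
    config = info.get("config_options") or {}
    method = config.get("auth")
    if method is None:
        # A missing or null "auth" entry is treated as "no auth required".
        return None
    if method not in KNOWN_AUTH_METHODS:
        raise ValueError(f"agent advertises an unknown auth method: {method!r}")
    return method
```

Treating a missing config_options as "no auth" keeps the sketch compatible with agents that return only version and protocol.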
Plugin
What we at https://agentwallet.ai/ are building is essentially a payment plugin. The most convenient way to pay an Agent is through our platform, with no need for any protocol or agent code changes: the payment plugin would sit in front of the agent protocol. This is my understanding of how plugins should work: “adds extra functionality to the existing software without modifying the existing software”
Proposed solution to include Auth into Agent Protocol: https://github.com/AI-Engineer-Foundation/agent-protocol/pull/80
Really like this perspective.
Discussed with the community:
- we are giving a "Green Light" on this
- It would help to include some documentation on how to acquire these tokens for the various auth types: jwt, oauth2, etc. (this could be included in the description for auth)
@hackgoofer Thanks for the update!
- Do I understand correctly that the green light is for the Auth approach, not the /info endpoint? (Would agree as these are 2 separate things now)
- Do you have some ideas on how to provide information on auth? I think it probably makes sense to go over the different scenarios.
What I imagine:
- completely private - I don't want anyone to know how to access the auth credentials. The agent is running inside kubernetes or my gated system -> No documentation on how to get tokens
- paid - you can sign up to my agent at my company website and you get the API key etc.. -> Documentation on how to get tokens is in my developer docs.
- automatic token generation - I want users/clients to register, which can be done automatically, so that I can allow only one task per access token to run at a time and protect myself from DDoS (or something similar) -> I think this is the scenario where the protocol needs to document how to get tokens
- no auth - no docs
Maybe the community can share some ideas in here how they would like to use the Auth?
My understanding is the three enclosed topics have all been green lit. I'm working with @jzanecook on the info and config object details and RFCs.
@KasparPeterson would you mind writing up an RFC for auth?
You could consider having info as a .well-known/ (maybe .well-known/agent?)
I was involved in a project that had payment pointers, which seem comparable to agent info endpoints
https://paymentpointers.org/
The payment pointer/wallet endpoint also specified an authServer (but the only auth supported was GNAP) as well as a resourceServer where the existing agent protocol endpoints could live.
https://openpayments.guide/apis/wallet-address-server/operations/get-wallet-address/