
Chat agent API

Open roblourens opened this issue 2 years ago • 53 comments

  • Proposal dts: https://github.com/microsoft/vscode/blob/main/src/vscode-dts/vscode.proposed.chatAgents2.d.ts
  • Sample: https://github.com/microsoft/vscode-extension-samples/tree/main/chat-agent-sample

Extension authors can subscribe to this issue to get updates about the proposed chat agent API. There may still be breaking changes coming, and I will post here whenever a breaking change is made.

I'm also interested in feedback about how you might use this API.

Current TODOs

  • [x] Need to enable agents to identify sessions uniquely and store state, including across a window reload
  • [x] Should be able to use shouldRepopulate for agents
    • Maybe this should be the default behavior for agents
  • [x] ChatAgentReplyFollowup should parameterize the agent and slash command names, don't require the agent to include them in the prompt as text
    • Maybe this goes for variables too, but variables like #file are complex
  • [x] https://github.com/microsoft/vscode/issues/204539
  • [x] https://github.com/microsoft/vscode/issues/203822
  • [x] Change the history context shape to be a flat list of messages, instead of request/response pairs
  • [ ] Understand how duplicate agent IDs should be handled
  • [ ] Enable agents to resolve a variable value on-demand, instead of resolving all variables before invoking the agent

  • [ ] Add a persist flag to ChatAgentProgressMessage for it to stay visible with a ✔️ even when more content has been added.
    • This would be useful when you want the progress message to remain and show what happened, e.g. when representing a function call. I've seen multiple extension authors add their own version of this just with text.
  • [ ] https://github.com/microsoft/vscode/issues/197081

  • [x] ChatAgentCommandFollowup should become a progress message instead of a followup
  • [x] In history, agents have to parse agent names and slash commands from the raw ChatMessage. They shouldn't have to understand the input syntax; the type in history should be related to the request and progress/result types that we already have.
  • [x] Add generic type parameter for the type of the result, which can include custom properties
  • [x] 'variables' proposal is referenced from chatAgents2, need to merge it in
  • [x] https://github.com/microsoft/vscode/issues/200598

roblourens avatar Dec 03 '23 22:12 roblourens

November additions

  • Send a ChatAgentProgressMessage to show progress steps on your response. @workspace uses this. https://github.com/microsoft/vscode/blob/e4cb2cf8f5917c07cda2be4c9422878f09d9397b/src/vscode-dts/vscode.proposed.chatAgents2.d.ts#L261-L266
  • Implement a ChatAgentCompletionItemProvider for variables to show up that are only for this agent. Other chat variables are global. https://github.com/microsoft/vscode/blob/e4cb2cf8f5917c07cda2be4c9422878f09d9397b/src/vscode-dts/vscode.proposed.chatAgents2Additions.d.ts#L58-L60

roblourens avatar Dec 03 '23 22:12 roblourens

any comment on this? https://archive.ph/N5iVq#selection-1895.0-1897.140 Why does GitHub Copilot think I need to talk to a lawyer to know if it's OK to use GitHub Copilot to develop AI?

bionicles avatar Dec 07 '23 21:12 bionicles

Has any research been done into the agent protocol for this?

ntindle avatar Dec 09 '23 01:12 ntindle

Do you have any example use cases of ChatAgentCompletionItemProvider?

InTheCloudDan avatar Dec 10 '23 18:12 InTheCloudDan

  • Do you have more information about the options to access.makeRequest?
  • Do you have more information about ChatAgentUsedContext and MappedEditsProvider?

pelikhan avatar Dec 11 '23 19:12 pelikhan

@bionicles I wouldn't rely on Copilot Chat for legal advice; talking to a lawyer may be a good idea.

@ntindle I think that's a different use of the term "agent" than how we're using it here, where it's just a UX paradigm for an extension that can execute a chat query. That might continue to be a point of confusion as people work more on AI agents that plan and use tools to accomplish a task in multiple steps.

@InTheCloudDan Not a good example, sorry, I'll try to add one to the API sample this month. The API might need a little more polish.

@pelikhan The current concept is that those options could be used differently by different chat providers. The copilot chat provider takes a numeric option n.

Re: MappedEditsProvider: I think this is still somewhat experimental. There is one implementation in the Copilot Chat extension, which just tries to fix the indentation of pasted code blocks based on the indentation in ChatAgentUsedContext.

roblourens avatar Dec 13 '23 00:12 roblourens

How would you feel about calling it one of the following instead: sidekick, partner, assistant, or tool?

I can see the name confusion continuing.

ntindle avatar Dec 13 '23 13:12 ntindle

@roblourens duh

bionicles avatar Dec 19 '23 19:12 bionicles

Breaking change in the next Insiders:

Renamed ChatVariableContext#message to prompt

https://github.com/microsoft/vscode/blob/b8baf8e153c5e1194f0d7b8593950e429829789e/src/vscode-dts/vscode.proposed.chatAgents2.d.ts#L420-L425

roblourens avatar Jan 12 '24 01:01 roblourens

You can now return a ChatAgentProgressMessage in the middle of a response; previously it only worked at the top. Once there is some content to render after the progress message, it will be removed.

ChatAgentTask will be removed soon, in favor of just sending a progress message followed by the content.

roblourens avatar Jan 15 '24 16:01 roblourens

Possibly breaking change: the shape of the history context has changed from ChatMessage objects to objects that have the same request/response format as the rest of the chat agent API.

https://github.com/microsoft/vscode/pull/202541/files#diff-d005ac8b5c4b2d6ad73fbab2d3cf97636a96cf7bc11ad8339ef4d0d2dd1d15dcR18

roblourens avatar Jan 18 '24 12:01 roblourens

Breaking change: Renaming all "slash command" terms to "sub command": https://github.com/microsoft/vscode/pull/202729#event-11525797438

roblourens avatar Jan 18 '24 14:01 roblourens

Possibly breaking change: the shape of the history context has changed from ChatMessage objects to objects that have the same request/response format as the rest of the chat agent API.

https://github.com/microsoft/vscode/pull/202541/files#diff-d005ac8b5c4b2d6ad73fbab2d3cf97636a96cf7bc11ad8339ef4d0d2dd1d15dcR18

@roblourens is the reason that ChatAgentHistoryEntry doesn't look more like the typings related to follow ups, like this:

export interface ChatAgentHistoryEntry<TResult extends ChatAgentResult2> {
	request: ChatAgentRequest;
	response: ChatAgentContentProgress[];
	result: TResult;
}

Is it because the history can contain entries from more than just your agent, and thus the generic typing wouldn't be correct? Perhaps a typing like this could make sense:

export interface ChatAgentHistoryEntry<TResult extends ChatAgentResult2> {
	request: ChatAgentRequest;
	response: ChatAgentContentProgress[];
	result: ChatAgentResult2 | TResult;
}

Or, if the above suggestion isn't acceptable (I can imagine other problems with it), could the JSDoc on result make it clear whether any properties added on a returned ChatAgentResult2 will still be present in history, and whether it will be the exact same instance (similar to the JSDoc on ChatAgentResult2Feedback.TResult)?

MRayermannMSFT avatar Jan 18 '24 18:01 MRayermannMSFT

Yeah, but it's even a bit more complicated and maybe needs more thinking.

Before the window reload, your agent will get the same result instances that it returned. But after a window reload, when you're looking at persisted conversation history from a previous window instance, those result instances don't exist anymore, so you will only get the basic result object.

I'm trying to decide whether that's too confusing, and so we should never return the same result instance. If you can't count on it, I'm not sure it helps anyway.

roblourens avatar Jan 18 '24 19:01 roblourens

As I mentioned before, ChatAgentTask is deleted, and the progress message should be able to cover the same use case. https://github.com/microsoft/vscode/pull/202777

roblourens avatar Jan 18 '24 22:01 roblourens

I'm trying to decide whether that's too confusing, and so we should never return the same result instance. If you can't count on it, I'm not sure it helps anyway.

Random alternatives that pop into my mind

  • persist/restore (per extension) the JSON representation of things (don't be instance-true but data-true)
  • be instance-true but for active sessions and have some flag that tells extensions that "this is restored" (meaning not instance-true)
  • combine both bullets from above?

jrieken avatar Jan 19 '24 07:01 jrieken

Following a very interesting conversation with @isidorn I would like to share some thoughts I had and hopefully discuss them further.

I am finding the type ChatAccess confusing. Compared to other frameworks (LlamaIndex, SemanticKernel) this looks more like a context / service collection.

I can imagine this becoming even more powerful and enabling access to other agents, messaging APIs, and other models (like embedding generators).

I think this code could be closer to what developers write in other platforms:

let context = await vscode.chat.requestContext(id);
let response = await context.llm.makeRequest(...);

This could also be the place to plug in APIs that give access to chat history, previous messages, and any other information relevant to the current chat turn.

I think having access to the conversation is crucial, as different agents could be invoked during an engagement with the chat component, and those messages could be useful in creating the arguments for the makeRequest method.

Another thing I would like to suggest is to use a type instead of a string for the request content. The OpenAI SDK (and other implementations) are preparing for more complex conversations with multimodal systems. In that case, messages have a content property that allows multiple parts with different MIME types.

colombod avatar Jan 23 '24 11:01 colombod

Some random comments:

  • the sub commands are automatically surfaced at the "global" level, which could lead to sub command pollution. Having a flag to exclude a sub command from the global context would be nice.
  • RAG is an essential feature missing from the agent API. It's likely that any agent relying on an LLM will eventually start rolling its own RAG implementation. It should not be part of the agent API per se, but it would be great to also provide access to whatever RAG the built-in Copilot has access to (e.g. given a user search string, get a list of content references).
  • ChatMessageRole has "function"; however, it's unclear how to implement OpenAI functions/tools using the current chat provider interfaces, so it's unclear what this role means. Also, in the OpenAI implementation, there is a "ToolCall" vs. "ToolResponse" message.
  • the used content reference messages are not part of the history

pelikhan avatar Jan 31 '24 16:01 pelikhan

  • the logic that automatically adds missing backticks for code regions seems to be defeated when the opening code region uses more than 3 backticks (this only happens while streaming).

pelikhan avatar Jan 31 '24 16:01 pelikhan

the sub commands are automatically surfaced at the "global" level, which could lead to sub command pollution. Having a flag to exclude a sub command from the global context would be nice.

You can find a subCommand with / globally but this just inserts @agent /command. It should work with multiple subCommands on different agents that have the same name.

RAG is an essential feature missing from the agent API.

We have this in mind; we may try to somehow expose some of what @workspace can do, for example via a variable, or access to embeddings. But no details yet.

the chatmessagerole has "function" however it's unclear how to implement OpenAI function/tools using the current chatprovider interfaces so it's unclear what this role means. Also, in the OpenAI impl, there is a "ToolCall" vs "ToolResponse" message

We are talking to a service that doesn't support all of this yet, so only basic chat requests are supported right now, unless you talk to OpenAI yourself.

the used content reference messages are not part of the history

I thought it seemed that only real content messages should be part of history. Do you want the used content? Would you do something with those file entries when they appear in history?

the logic that automatically adds missing backticks for code regions seems to be defeated when the opening code region uses more than 3 backticks (this only happens while streaming).

Fixed, thanks

roblourens avatar Jan 31 '24 19:01 roblourens

You can find a subCommand with / globally but this just inserts @agent /command. It should work with multiple subCommands on different agents that have the same name.

It is not so much a problem of name clashes as a problem of crowding the dropdown with a gazillion commands.

I thought it seemed that only real content messages should be part of history. Do you want the used content? Would you do something with those file entries when they appear in history?

If I know what content was used in previous queries, I can also add them to my context when executing the next LLM query.

pelikhan avatar Jan 31 '24 19:01 pelikhan

👍 Embeddings would also be useful for our extension. I tried seeing if there were other options, even as far as running a local embedding server, but that adds significant complexity.

InTheCloudDan avatar Feb 01 '24 20:02 InTheCloudDan

fyi - the progress object that's passed into ChatAgentHandler will change. Instead of calling progress.report(typeA) and progress.report(typeB), there is a method per display type, like stream.markdown(value) and stream.reference(value2), etc. The new typings are here:

https://github.com/microsoft/vscode/blob/3843819e9576681b548f3c656979d76d36f2ee75/src/vscode-dts/vscode.proposed.chatAgents2.d.ts#L282

The report method and the corresponding types will be removed soon

jrieken avatar Feb 05 '24 16:02 jrieken

fyi - the progress-object that's passed into ChatAgentHandler will change.

is it already in the nightly build?

pelikhan avatar Feb 05 '24 16:02 pelikhan

not yet, it will be in tomorrow's Insiders

jrieken avatar Feb 05 '24 16:02 jrieken

"Command followups" are gone, and have been replaced by a command method on the response stream, which renders the same kind of button (but now inline, and doesn't have to be at the end of the content)

https://github.com/microsoft/vscode/pull/204512/files#diff-d005ac8b5c4b2d6ad73fbab2d3cf97636a96cf7bc11ad8339ef4d0d2dd1d15dc

roblourens avatar Feb 06 '24 19:02 roblourens

fyi - we are going to rename ChatAccess and friends from the chatRequestAccess proposal to "LanguageModel", e.g. requestLanguageModelAccess, LanguageModelAccess, etc. The old name was chosen because we expect a language model with chat characteristics, but we understand the confusing overlap with VS Code's chat feature and chat agents. This rename also makes it clearer that language model access can be used independently of chat agents.

jrieken avatar Feb 07 '24 10:02 jrieken

The command followups change above didn't actually land in Insiders; trying again today.

roblourens avatar Feb 07 '24 13:02 roblourens

fyi - ChatMessageRole#Function will be removed with https://github.com/microsoft/vscode/pull/204637

jrieken avatar Feb 07 '24 16:02 jrieken

Is there an example of providing a registerVariable resolver that functions similarly to #file? When I test my implementation, the resolver does not get called until after the request is initially submitted, whereas with #file it pops up the QuickPick window immediately and then updates the variables to include :<selectedFile>

InTheCloudDan avatar Feb 07 '24 17:02 InTheCloudDan