Iterative planner

Open kaza opened this issue 2 years ago • 7 comments

Motivation and Context

This change implements the Agent pattern. For now it implements the MRKL system described in https://arxiv.org/pdf/2205.00445.pdf

The major difference from the existing planners is that it calls the LLM after each step to decide what to do next. This way it behaves better on complex tasks where the required steps cannot all be predicted up front.
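
The loop described above can be sketched roughly as follows. This is a minimal illustration in Python, not the C# code in this PR: `run_iterative_agent`, `scripted_llm`, the `Action:` / `Final Answer:` text format and the `multiply` tool are all hypothetical names chosen for the example, and the model is replaced by a scripted stub so the sketch runs offline.

```python
from typing import Callable, Dict, List


def run_iterative_agent(
    question: str,
    llm: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> str:
    """Ask the model for one step at a time until it gives a final answer."""
    scratchpad: List[str] = []  # earlier steps and their observations
    for _ in range(max_steps):
        # The model sees the question plus everything that happened so far.
        prompt = "Question: " + question + "\n" + "\n".join(scratchpad)
        reply = llm(prompt).strip()
        if reply.startswith("Final Answer:"):
            return reply[len("Final Answer:"):].strip()
        # Hypothetical step format: "Action: <tool name> | <tool input>"
        if not reply.startswith("Action:"):
            scratchpad.append("Observation: could not parse step: " + reply)
            continue
        name, _, tool_input = reply[len("Action:"):].partition("|")
        tool = tools.get(name.strip())
        if tool is None:
            observation = "unknown tool " + name.strip()
        else:
            observation = tool(tool_input.strip())
        scratchpad.append(reply)
        scratchpad.append("Observation: " + observation)
    return "No answer within the step limit"


def scripted_llm(prompt: str) -> str:
    """Stand-in for a real model: picks the next step from what it has seen."""
    if "Observation: 42" in prompt:
        return "Final Answer: 42"
    return "Action: multiply | 6 7"


def multiply(text: str) -> str:
    """Toy tool: multiplies two whitespace-separated integers."""
    a, b = text.split()
    return str(int(a) * int(b))


answer = run_iterative_agent("What is 6 times 7?", scripted_llm, {"multiply": multiply})
print(answer)
```

The point of the design is that the next step is chosen only after the previous observation is known, at the cost of one model call per step; `max_steps` bounds that cost.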

I had a conversation with @alexchaomander on Discord.

Over the next few days I would like to:

  1. add some integration tests (so that we can change things and know it still works)
  2. currently it's only tested with ITextCompletion; I want to add classes for IChatCompletion
  3. add more agent types https://python.langchain.com/en/latest/modules/agents/agents/agent_types.html
  4. create something like baby-agi https://github.com/yoheinakajima/babyagi

My questions are:

  • In which namespace should it reside?
  • What should the name of the project be? I believe the pattern is generally called "Agent", so I would rename it to Agent if you agree.
  • I have created LanguageCalculatorSkill, which has been really useful. It can take a math problem and calculate it using NCalc. It got several thumbs up on Discord, so I was thinking of adding this skill as well. Tell me whether this is worthwhile and, if so, where I should put it. Should I maybe replace NCalc with something else?
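
To illustrate the calculator idea, here is a rough sketch of the evaluation step only. It is written in Python and uses the standard `ast` module as a stand-in for NCalc; it is not the LanguageCalculatorSkill code. It assumes the model has already turned the math problem into a plain arithmetic expression, and `evaluate_expression` is a hypothetical name.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
_UNARY = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def evaluate_expression(expression: str):
    """Evaluate a plain arithmetic expression without calling eval()."""

    def walk(node):
        # Numeric literals only (bool is excluded although it subclasses int).
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression")

    return walk(ast.parse(expression, mode="eval").body)


print(evaluate_expression("(2 + 3) * 4"))
```

Restricting the evaluator to a small operator whitelist matters here because the expression comes from model output, which should be treated as untrusted input.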

Description

An attempt to implement MRKL systems as described in https://arxiv.org/pdf/2205.00445.pdf, strongly inspired by https://github.com/hwchase17/langchain/tree/master/langchain/agents/mrkl

Contribution Checklist

  • [X] The code builds clean without any errors or warnings
  • [X] The PR follows SK Contribution Guidelines (https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md)
  • [X] The code follows the .NET coding conventions (https://learn.microsoft.com/dotnet/csharp/fundamentals/coding-style/coding-conventions) verified with dotnet format
  • [X] All unit tests pass, and I have added new tests where possible. I definitely want to add a few integration tests
  • [x] I didn't break anyone :smile:

kaza avatar May 15 '23 10:05 kaza

This is great, will be glad to help with a closer review when appropriate. Some thoughts below on your questions:

In which namespace should it reside?

For IterativePlanner, I'd say namespace Microsoft.SemanticKernel.Planning. The rest, Microsoft.SemanticKernel.Planning.Iterative. Folks might have opinions on naming options (Iterative|Progressive|Hybrid|MRKL|etc. -- see what GPT thinks :D) but otherwise makes sense to me.

What should the name of the project be? I believe the pattern is generally called "Agent", so I would rename it to Agent

I'd throw a vote out for MRKL -- I'll let others chime in here, too. GPT suggested Hybrid.

I have created LanguageCalculatorSkill, which has been really useful. It can take a math problem and calculate it using NCalc. It got several thumbs up on Discord, so I was thinking of adding this skill as well. Tell me whether this is worthwhile and, if so, where I should put it. Should I maybe replace NCalc with something else?

I'd add that as part of a separate PR in its own project in dotnet\src\Skills\ -- maybe Skills.MathSkill.

lemillermicrosoft avatar May 15 '23 19:05 lemillermicrosoft

@lemillermicrosoft I was thinking of making the jump to Agents as the root namespace, Microsoft.SemanticKernel.Agents, and then providing the implementations for the Iterative|Progressive|Hybrid|MRKL agents inside it.

What is the reason for keeping each planner in a separate project? I would put them all together in a single project/library named SemanticKernel.Agents, but that is your call.

Also, any suggestions on what I should do with LanguageCalculatorSkill? I am OK with dropping it (keeping it in my own projects only), but it's really nice and it helps :). For that I just need a destination and a place for the integration tests (but that will probably follow).

kaza avatar May 16 '23 06:05 kaza

Today I tried to make the same prompt / semantic function work with both the chat and text completion APIs, but making it work with one kept breaking the other. Maybe it is simply beyond my competence (if anyone has a good idea how to do it, I would be glad to incorporate it).

I also believe that if we try to make it generic enough to work on both kinds of model, we will lose accuracy, and it seems to me that this is not a good trade-off (if you disagree, please let me know).

I believe that reliability and accuracy are more important than having only one class. Users can choose the model and the matching agent and configure everything at the beginning, so there is no need to be that generic.

I will continue with the implementation of separate classes (and add a Chat suffix to the chat-optimized agent). If anyone disagrees, please let me know.

kaza avatar May 16 '23 11:05 kaza

Hi Community,

I believe I have a good first candidate, so please let me know what else I need to adapt or fix. I added two integration tests for each agent just to make sure that the mechanics are working properly, and I tested the things that were usually breaking. If we agree on the direction and requirements, I will be glad to provide more tests. Also let me know if you would like the tests done differently.

I was looking for next steps and inspiration here: https://python.langchain.com/en/latest/modules/agents/agents/agent_types.html. I looked into the docstore, but I was not able to find a good Wikipedia implementation for .NET Core. If you have any suggestions on that, let me know.

Also, if you have any suggestions on which agent/planner should be implemented next, I can give it a shot over the next week.

Let the review and/or merge begin.

kaza avatar May 20 '23 08:05 kaza

Awesome work! The team will do a more in-depth review this week.

alexchaomander avatar May 21 '23 16:05 alexchaomander

I will pull these changes and gather feedback/thoughts in the next week or so, so we can help move this forward.

lemillermicrosoft avatar May 23 '23 19:05 lemillermicrosoft

Let me know if there is anything I can do up front to help you out.

kaza avatar May 24 '23 07:05 kaza

I think a key requirement of being a Planner is that a Plan object is created, either semantically or otherwise. I definitely think this iterative/MRKL approach is worth incorporating. For now, it might be best to file an issue to focus on scoping things (separating out LanguageCalculatorSkill, for example; I don't think that's required for the planning functionality). For now, I'm going to mark things as draft for clarity on the team.

Other notes:

  • Shouldn't need to separate out into chat and text variants, but maybe I'm missing something here.

lemillermicrosoft avatar May 31 '23 19:05 lemillermicrosoft

For the sake of visibility, some proposed changes can be seen in the following commit (and in others on the branch that contains it).

https://github.com/lemillermicrosoft/semantic-kernel/commit/88a7379b36621d681107ac6cfe13f24e7b471d1f

lemillermicrosoft avatar May 31 '23 22:05 lemillermicrosoft

Hey @kaza -- I've taken a stab at breaking up the various changes into smaller PRs. Let me know whether you'd prefer to keep going with this PR or not. There could definitely be room for a few similar types of patterns here. I hope this helps. #1472 has all the linked PRs. cc @alexchaomander

lemillermicrosoft avatar Jun 14 '23 01:06 lemillermicrosoft

Closing for now. Please don't hesitate to re-open as appropriate @kaza

lemillermicrosoft avatar Jun 14 '23 21:06 lemillermicrosoft