Microagents and Delegation

Open rbren opened this issue 1 year ago • 5 comments

The goal of this PR is to add support for creating small, simple agents using nothing but markdown, yaml, and some templating.

I don't know that this will be a fruitful path, but I think it's one worth exploring.

Broadly, the idea is that we can leverage the OpenDevin community to create 100s of purpose-built agents for narrow tasks, like:

  • writing a Postgres database migration
  • writing a Cypress test
  • writing Python tests
  • modifying a readme
  • finding the right format for a function call
  • debugging an error message
  • etc etc

If the user wants to accomplish one of these small tasks, they could simply pick the right agent. E.g. I want to write a Cypress test, so I pick the Cypress agent.

But more powerfully, we could have a MetaAgent (or several competing MetaAgents) which, given a task, can select one or more microagents to accomplish it.

In service of the latter case, I've created an AgentDelegateAction, which allows an agent to farm out a task to a subagent. I've also created the Manager agent, which does nothing but Delegate. You can see it delegate to the MathAgent with this command:

poetry run python opendevin/main.py -t "What is the area of a circle with radius 7.324 inches?" -c Manager

I think there's a lot of room for improvement on delegation. Right now:

  • An AgentController can start a delegate AgentController
    • this has its own state (task, history, etc)
  • The AgentController pushes all step calls to the delegate
  • When the delegate sends a FinishAction, the main controller takes back over
  • Hypothetically, you could have an unlimited number of sub-delegates
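The flow in the list above can be sketched roughly as follows. This is a toy model, not OpenDevin's code: `Controller`, `DelegateAction`, the two policy functions, and `run` are hypothetical stand-ins for AgentController, AgentDelegateAction, and the agents.

```python
from dataclasses import dataclass


@dataclass
class FinishAction:
    """Signals that an agent has completed its task."""
    output: str


@dataclass
class DelegateAction:
    """Asks the controller to hand a task to a named subagent (hypothetical shape)."""
    agent: str
    task: str


class Controller:
    """Toy stand-in for AgentController; names and fields are assumptions."""

    def __init__(self, policy, task, registry):
        self.policy = policy      # callable(controller) -> action
        self.task = task
        self.registry = registry  # agent name -> policy
        self.history = []         # each controller keeps its own state
        self.delegate = None

    def step(self):
        # While a delegate exists, every step call is pushed down to it.
        if self.delegate is not None:
            result = self.delegate.step()
            if isinstance(result, FinishAction):
                # The delegate finished: record its output and take back over.
                self.history.append(("delegate_result", result.output))
                self.delegate = None
            return None
        action = self.policy(self)
        self.history.append(("action", action))
        if isinstance(action, DelegateAction):
            # Delegates can delegate in turn, so nesting depth is unbounded.
            self.delegate = Controller(
                self.registry[action.agent], action.task, self.registry
            )
            return None
        return action


def math_policy(ctrl):
    # A trivial subagent that finishes on its first step.
    return FinishAction(output=f"answered: {ctrl.task}")


def manager_policy(ctrl):
    # Delegate first, then finish with whatever the delegate reported.
    results = [value for kind, value in ctrl.history if kind == "delegate_result"]
    if results:
        return FinishAction(output=results[-1])
    return DelegateAction(agent="MathAgent", task=ctrl.task)


def run(controller, max_steps=10):
    # Drive the top-level controller until it emits a FinishAction.
    for _ in range(max_steps):
        action = controller.step()
        if isinstance(action, FinishAction):
            return action.output
    raise RuntimeError("no FinishAction within max_steps")
```

Because the parent only ever forwards `step()` to a single delegate, this design is strictly sequential; running delegates concurrently would need a different structure.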

In the future, it'd be very cool if you could run multiple concurrent delegates, so the agent could accomplish different tasks in parallel.

rbren avatar Apr 19 '24 19:04 rbren

should we create a repo to store these official agents? it's okay to place them inside the same repo for now, as built-in agents. it will make sense to let the community develop agents if we can define the specifications. wdyt?

iFurySt avatar Apr 20 '24 06:04 iFurySt

My two cents:

should we create a repo to store these official agents?

I used to maintain an open-source org that followed a similar structure: a main repo where the core code sits, and another repo containing all the lightweight "plugins" (similar to the micro-agents here). In my experience, when you have more than one repo, the activity and momentum spread over multiple places and eventually one or more repos (usually the non-main ones) stop being maintained. This is not uncommon in community-driven open-source projects. Projects funded by companies are a different story, though.

Another less convincing but interesting counter-argument: if we ever decide to donate this project to the Apache Software Foundation, every repo has to sit under the Apache umbrella on its own. It would be hard to find those non-main repos under the Apache org.

That being said, I do think I get your motivation, especially this:

it will make sense to let the community develop agents if we can define the specifications

Sounds very exciting to me. It's non-trivial, but it would be great if micro-agents were pluggable. Maybe a command-line argument "--custom_agents_path" that lets OpenDevin search for and "install" them.
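A rough sketch of what that search step might look like, assuming each agent lives in its own directory containing an `agent.yaml` and a `prompt.md`. Both file names and the `discover_agents` helper are assumptions made for illustration; only the flag name comes from the suggestion above, and no such flag is known to exist in OpenDevin.

```python
import argparse
from pathlib import Path


def discover_agents(root):
    """Return {agent_name: directory} for each subdirectory holding both files."""
    found = {}
    for child in sorted(Path(root).iterdir()):
        has_spec = (child / "agent.yaml").is_file()    # assumed file name
        has_prompt = (child / "prompt.md").is_file()   # assumed file name
        if child.is_dir() and has_spec and has_prompt:
            found[child.name] = child
    return found


def parse_args(argv=None):
    # Hypothetical flag, named after the suggestion in this comment.
    parser = argparse.ArgumentParser()
    parser.add_argument("--custom_agents_path", default=None)
    return parser.parse_args(argv)
```

Directories missing either file are skipped silently here; a real loader would probably want to report them instead.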

li-boxuan avatar Apr 20 '24 06:04 li-boxuan

Yeah, I think you're both onto something! If this grows past a certain point, we could probably decentralize it, like npm: anyone can publish an agent just by making a public git repo.

rbren avatar Apr 20 '24 13:04 rbren

This narrow approach is what works reliably in my experience with current LLMs and agents. There's also potential to externalize agents and reduce core bloat. I like it.

kloudsamurai avatar Apr 22 '24 23:04 kloudsamurai

Thanks @li-boxuan! Appreciate your attention to detail on the prompts!

rbren avatar Apr 23 '24 12:04 rbren

In my understanding, the current microagents structure is: coder & math_agent = Software Developers/Engineers, manager = Development Manager & Project Manager, postgres_agent = Product Manager, verifier = QA Engineers & IT Support.

Adding a few more agents would make it more robust, especially a diplomat agent whose job is to mediate between the user and the manager more effectively.

Below are some samples of these proposed agents:

  • Diplomat Agent Task Prompt: Task You are a Diplomat Agent. Your key responsibility is to interact with the user to gather all essential details and clarify the requirements for the project:

{{ state.plan.main_goal }}

You must:

  • Engage with the user to understand the specifics of their request and any unique requirements they have.
  • Clarify any ambiguities or incomplete information concerning the project goal.
  • Ask targeted questions that help refine the project scope and objectives.
  • Ensure that the information is precise and complete enough for other agents to execute their tasks effectively without further clarification.
  • Summarize and clarify the information received to ensure that all technical requirements are well understood.
  • Present the refined project goals and requirements to the Manager Agent, ensuring they have all necessary details to delegate tasks effectively.

  • Software Architects Agent Task Prompt: Task You are a Software Architect. Your main goal is to design a robust software architecture for a new project that involves integrating multiple systems:

{{ state.plan.main_goal }}

You must:

  • Analyze the requirements and propose a high-level architecture that addresses scalability, reliability, and integration.
  • Select appropriate technologies and frameworks to implement the proposed architecture.
  • Create architectural diagrams and documentation to guide the development team.
  • Review and adjust the architecture based on feedback from the user.

  • UI/UX Designers Agent Task Prompt: Task You are a UI/UX Designer tasked with improving the user experience of our product:

{{ state.plan.main_goal }}

You must:

  • Conduct user research to understand the needs and behaviors of our users.
  • Develop wireframes and prototypes that reflect the findings of your research.
  • Test your designs with real users or through usability testing platforms.
  • Iterate on your designs based on user feedback and usability testing results.

  • DevOps Engineers Agent Task Prompt: Task You are a DevOps Engineer. Your mission is to streamline the software deployment process for an ongoing project:

{{ state.plan.main_goal }}

You must:

  • Evaluate the current CI/CD pipeline and identify areas for improvement.
  • Implement automation tools to speed up the deployment process while ensuring reliability.
  • Monitor and optimize the infrastructure for performance and cost-effectiveness.
  • Ensure that security best practices are integrated into the deployment processes.

  • Security Specialists Agent Task Prompt: Task You are a Security Specialist. You are responsible for ensuring the security of the software:

{{ state.plan.main_goal }}

You must:

  • Conduct a security audit of the existing systems and identify vulnerabilities.
  • Propose and implement security measures to address identified risks.
  • Develop a comprehensive security protocol and ensure compliance with industry standards.
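The sample prompts above all embed the same `{{ state.plan.main_goal }}` placeholder. As a minimal illustration of how such a placeholder could be filled in, here is a standard-library-only sketch. The `render` helper and the `SimpleNamespace` state object are assumptions made for this example, standing in for whatever templating engine and state class the project actually uses.

```python
import re
from types import SimpleNamespace

# Matches {{ dotted.name }} with optional surrounding whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][\w.]*)\s*\}\}")


def render(template, **context):
    """Replace each {{ dotted.path }} by walking attributes from a context value."""
    def lookup(match):
        name, *attrs = match.group(1).split(".")
        value = context[name]
        for attr in attrs:
            value = getattr(value, attr)
        return str(value)

    return PLACEHOLDER.sub(lookup, template)


# Hypothetical state object exposing plan.main_goal.
state = SimpleNamespace(plan=SimpleNamespace(main_goal="Add a login page"))
prompt = render(
    "You are a Diplomat Agent.\n\n{{ state.plan.main_goal }}",
    state=state,
)
```

An unknown name raises `KeyError` or `AttributeError` rather than rendering an empty string, which makes a misspelled placeholder fail loudly.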

rezzie-rich avatar Apr 26 '24 00:04 rezzie-rich

@rezzie-rich feel free to open a PR!

rbren avatar May 01 '24 15:05 rbren