AutoGPT Semi-Active Mode

Semi-Active Mode

Open slavakurilyak opened this issue 1 year ago • 7 comments

Summary

I propose a new feature called "Semi-Active Mode," which enables the AI to run in a semi-automated manner, seeking user assistance when it encounters uncertainty, confusion, or ambiguity. This feature combines the benefits of Continuous Mode with the human-in-the-loop experience of Active Mode.

Background

Currently, there are two modes available:

"Continuous Mode," which allows the AI to run without user authorization and is 100% automated. However, this mode is not recommended due to its potential dangers, such as running indefinitely or performing actions that users might not approve.
"Active Mode," which enables the AI to run while actively prompting the user with chain-of-thought questions when executing each subsequent action. This allows users to actively participate while the AI agent runs, ensuring a human-in-the-loop experience.

To further enhance user engagement and provide a more flexible experience, I propose a new feature called "Semi-Active Mode."

Feature Description

In "Semi-Active Mode," the AI will:

Execute an action.
Evaluate its confidence in the action or result.
If the confidence is below a certain threshold, prompt the user for assistance or clarification.
Incorporate the user's input and continue to the next action.

This interaction pattern allows users to assist when needed while still benefiting from the AI's capabilities. It strikes a balance between full automation and active participation, fostering a collaborative environment between the user and the AI system.

Example Implementation

Here's an example implementation using LangChain's Human-As-A-Tool feature:

import sys
from langchain.chat_models import ChatOpenAI
from langchain.llms import OpenAI
from langchain.agents import load_tools, initialize_agent
from langchain.agents.agent_types import AgentType

llm = ChatOpenAI(temperature=0.0)
math_llm = OpenAI(temperature=0.0)
tools = load_tools(
    ["human", "llm-math"], 
    llm=math_llm,
)

agent_chain = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
)

agent_chain.run("What is Eric Zhu's birthday?")

In this code, the AI agent seeks human assistance when it encounters uncertainty, allowing the user to guide as needed.

Benefits

Enhanced user engagement
Reduced risk of AI performing unwanted actions
Increased collaboration between the user and the AI
Balances automation and user control

Risks and Mitigations

This feature may slow down the overall AI operation due to the need for user input in certain situations. However, this trade-off is acceptable, considering the increased control and collaboration it provides.

Request for Comments

I would appreciate feedback from the community on this suggested feature. Please share your thoughts, suggestions, and any potential concerns you may have.

Apr 04 '23 05:04 slavakurilyak

we're going to need some prompt engineering to ask the ai about its confidence. in the main prompt file we can add: 2. Constructively self-criticize your big-picture behavior constantly and evaluate your confidence level.

{
    "command": {
        "name": "command name",
        "args":{
            "arg name": "value"
        }
    },
    "thoughts":
    {
        "text": "thought",
        "reasoning": "reasoning",
        "plan": "- short bulleted\n- list that conveys\n- long-term plan",
        "criticism": "constructive self-criticism",
        "speak": "thoughts summary to say to user",
        "confidence": "0-100 confidence level rating"
    }
}

Please provide better prompt options that might cost less token cc @Torantulino

Remark: this mode can enrich the continuous mode too: if the AI is not confident, maybe it can retry the same action. I would say this should probably be a separate continuous mode though (continuous-with-reflection?) because this is going to cost more tokens.

Apr 04 '23 06:04 waynehamadi

Some time ago I read a paper discussing AI control and compliance. (I can't find this paper right now). It was proposing an "approval seeking" mode for AI. where it works autonomously on goals it clearly understands and knows how to achieve. When there is a case for ambiguity, it seeks for a superior (human?) guidance. There are 2 types of engagement

ask for next step (what should I do next)
ask for permission (this is what I think I can do, may I? )

..I put this comment just for context...

Apr 04 '23 10:04 profintegra

The best experience with reports is when they can read all the source materials and try to figure it out themselves, but still come and present an awesome, intelligent summary to me and want to discuss their plans to check if they match my understanding. Typically we discuss until it's clear we're in agreement that this is a reasonable and valuable course of action.

Apr 06 '23 05:04 sberney

I would love to collaborate on this and have been working on a framework for classifying the risk and type of AI tasks for proper delegation:

The TACTIC framework provides a comprehensive approach to managing AI commands, covering a wide range of risks and approval levels. By implementing this tiered structure, we can maintain control over AI-driven processes while maximizing efficiency, security, and accountability.

Tier 1: Transparent (Low risk, read-only tasks) Examples: Browsing the web, searching for information, reading documents. Approval: Automatic or other AI bots.

Tier 2: Assisted (Low to medium risk, write-access tasks) Examples: Drafting and updating documents, spreadsheets, presentations. Approval: Low-level human intervention, such as a delegated assistant.

Tier 3: Collaborative (Medium risk, communication tasks) Examples: Sending emails, making phone calls, scheduling meetings. Approval: Mid-level human intervention, such as a designated supervisor.

Tier 4: Transactional (Medium to high risk, financial tasks) Examples: Using paid APIs, ordering items, making purchases. Approval: High-level human intervention, such as a manager or financial officer.

Tier 5: Intimate (High risk, sensitive tasks) Examples: Accessing sensitive data, making critical decisions, handling confidential information. Approval: Exclusive to the user/owner themselves.

Tier 6: Critical (Extremely high risk, irreversible or high-impact tasks) Examples: Initiating legal actions, making large financial investments, approving strategic partnerships. Approval: Highest level of human intervention, such as a board of directors or executive committee.

I think we need a hybrid approach to classifying commands that combines AI with human involvement to provide a more reliable solution. So initial classification by GPT but a human-in-the-loop review process, especially for tasks that fall under higher risk categories. This step ensures that the AI categorization is accurate and relevant, providing an additional layer of validation.

Apr 24 '23 02:04 marktsears

See also: https://github.com/Significant-Gravitas/Auto-GPT/issues/3396#issuecomment-1529504806

May 01 '23 19:05 Boostrix

One thing I think myself and a lot of others have encountered is wanting to "step in" to continuous mode, make some instructions, and turn it back on. Same thing for choosing some number of automated steps, being able to choose like 50, see something going wrong, give some guidance and turn back over control.

May 03 '23 01:05 montanaflynn

AutoGPT AutoGPT copied to clipboard

Semi-Active Mode

Summary

Background

Feature Description

Example Implementation

Benefits

Risks and Mitigations

Request for Comments

AutoGPT
AutoGPT copied to clipboard