
Implemented Recursive Criticism and Iteration in Task Creation to Verify the output of the agents

chandrakanth137 opened this issue · 4 comments

RCI Documentation

Documentation for Recursive Criticism and Iteration (RCI) Methods

Overview

Recursive Criticism and Iteration (RCI) is a systematic process used to iteratively enhance the quality of outputs generated by a large language model (LLM). It involves three main steps: Critique, Validate, and Improve. Each step is designed to ensure that the final output is accurate, logically sound, and free of factual errors.

This documentation explains the implementation of RCI in the CrewAI library using LangChain and an Ollama LLM. The code defines three methods: critique, validate, and improve. Each method leverages prompt templates to interact with the LLM and perform the desired processing at each stage.

Methods

1. critique

This method generates a critique of the given output based on the task description. The critique focuses solely on logical or factual inaccuracies, avoiding grammatical rephrasing or paraphrasing.

Parameters:

  • agent: The agent performing the task, which includes the agent's backstory.
  • task: The task description provided to the model.
  • output: The output generated by the LLM for the given task.
  • llm: The language model used to process the prompt and generate the critique.
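As an illustration, here is a minimal sketch of how such a method might look, assuming a LangChain `PromptTemplate` and an `llm` object exposing `invoke()`. The prompt wording, the signature, and the `agent.backstory` attribute access are illustrative assumptions, not the PR's verbatim code:

```python
from langchain_core.prompts import PromptTemplate

CRITIQUE_TEMPLATE = PromptTemplate.from_template(
    "You are {backstory}.\n"
    "Task: {task}\n"
    "Output: {output}\n"
    "Critique this output, pointing out only logical or factual "
    "inaccuracies. Do not rephrase it or comment on grammar."
)

def critique(agent, task, output, llm):
    # Fill the template with the agent's backstory, the task description,
    # and the generated output, then ask the LLM for a focused critique.
    prompt = CRITIQUE_TEMPLATE.format(
        backstory=agent.backstory, task=task, output=output
    )
    return llm.invoke(prompt)
```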

2. validate

This method determines if the critique suggests significant changes to the output. It analyzes the critique to see if it indicates that substantial revisions are necessary.

Parameters:

  • task: The task description provided to the model.
  • critique: The critique generated by the critique method.
  • output: The original output generated by the LLM.
  • llm: The language model used to process the prompt and validate the critique.

Returns:

  • validate_response: A single-word response, either "True" or "False", indicating whether significant changes are required.
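A corresponding sketch, again with assumed prompt wording; here the single-word response is coerced to a Python bool for convenience in the loop shown later:

```python
VALIDATE_TEMPLATE = PromptTemplate.from_template(
    "Task: {task}\n"
    "Output: {output}\n"
    "Critique: {critique}\n"
    "Does the critique call for significant changes to the output? "
    "Answer with a single word: True or False."
)

def validate(task, critique, output, llm):
    # Ask for a one-word verdict; parse leniently, since models
    # sometimes add extra whitespace or vary the casing.
    validate_response = llm.invoke(
        VALIDATE_TEMPLATE.format(task=task, output=output, critique=critique)
    )
    return validate_response.strip().lower().startswith("true")
```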

3. improve

This method refines the original output based on the critique. It rewrites the output to address the errors identified in the critique, ensuring the format specified in the task description is maintained.

Parameters:

  • task: The task description provided to the model.
  • output: The original output generated by the LLM.
  • critique: The critique generated by the critique method.
  • llm: The language model used to process the prompt and improve the output.

Returns:

  • improve_response: The improved output generated by the LLM.
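And a matching sketch for the refinement step, with the same caveats about assumed names and prompt text:

```python
IMPROVE_TEMPLATE = PromptTemplate.from_template(
    "Task: {task}\n"
    "Original output: {output}\n"
    "Critique: {critique}\n"
    "Rewrite the output so that every error raised in the critique is "
    "fixed, while keeping the format required by the task description."
)

def improve(task, output, critique, llm):
    # Return the rewritten output; the caller decides whether another
    # critique/validate/improve pass is needed.
    improve_response = llm.invoke(
        IMPROVE_TEMPLATE.format(task=task, output=output, critique=critique)
    )
    return improve_response
```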

Summary

The RCI methods (critique, validate, and improve) form a robust framework for iteratively refining LLM outputs. By identifying and correcting logical or factual errors and validating the significance of required changes, this approach ensures high-quality and accurate outputs tailored to specific task requirements.
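Tying the three methods together, the overall refinement loop might look like the sketch below, reusing the illustrative functions from the previous sections. rci_depth bounds the number of passes, and the loop exits early once validate reports that no significant changes are needed:

```python
def rci_refine(agent, task, output, llm, rci_depth=1):
    # Repeat critique -> validate -> improve up to rci_depth times,
    # stopping early when the critique no longer demands major changes.
    for _ in range(rci_depth):
        critique_text = critique(agent, task, output, llm)
        if not validate(task, critique_text, output, llm):
            break  # output already acceptable; skip the rewrite
        output = improve(task, output, critique_text, llm)
    return output
```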

To test the functioning of RCI in the modified CrewAI

Note: The test code is written for a local development scenario; please modify it as needed for your environment.

  1. Move the src directory
  2. Run crew_ai_base_test.py to check that your initial setup works
  3. Run the Python notebook YT_Email_Reply_Llama3_CrewAI_+_Groq.ipynb for an advanced test case
    • A Groq API key is required for this notebook; make sure you have one before starting
  4. To try custom test cases, enable RCI when creating an instance of the Task class
    • Set rci=True to use RCI; the default value is True
    • The number of iterations can be adjusted with the rci_depth parameter, which takes an integer value; the default is 1

Example:

task = Task(
    description="""Your Task Description""",
    agent=your_agent,
    expected_output="However you wish",
    rci=True,
    rci_depth=3,
)
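For completeness, a task configured this way would be run through a crew in the usual manner; exactly where RCI hooks into execution is specific to this PR, but kickoff() is the standard entry point:

```python
from crewai import Crew

crew = Crew(agents=[your_agent], tasks=[task])
result = crew.kickoff()  # RCI iterations run during task execution when rci=True
print(result)
```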

chandrakanth137 commented on Jun 18 '24

Just got my eyes on this! Very curious about it, little busy today/tomorrow, but bumping this to the top of the list so either myself or someone on the team looks at it!

joaomdmoura commented on Jun 27 '24

As @mbarnathan pointed out, to make the code more generic I have modified it to use the existing user-provided LLM, which is usually initialized when crew.kickoff() is called.

As I only have access to local LLMs, it would be great if someone could run the internal tests. I tried running them with local LLMs, but that eventually meant modifying the test cases to work with a local setup.

I will probably try to adapt the test cases so they also work with local LLMs.

chandrakanth137 commented on Jun 27 '24

This PR is stale because it has been open for 45 days with no activity.

github-actions[bot] commented on Sep 25 '24

Hi @chandrakanth137 thanks for the PR.

I just wanted to touch base about the project—there have been quite a few changes since June, like removing poetry.lock and LangChain, among others. It would be great to see these updates reflected in a new pull request.

If this is still on your radar, just let us know how we can help make it happen!

Thanks!

pythonbyte commented on Dec 09 '24

@pythonbyte Apologies for the late reply. I am currently working on this and will send a new PR soon, so I am closing this one.

chandrakanth137 commented on Jan 14 '25