Feature: Implement recursive criticism and iteration process in Task model
Documentation for Recursive Criticism and Iteration (RCI) Methods
Overview
Recursive Criticism and Iteration (RCI) is a systematic process used to iteratively refine AI-generated outputs to enhance their accuracy and alignment with expectations. This process involves three key steps: Critique, Rectify, and Iterate. Each step ensures that the output is logically sound, relevant, and free of factual inconsistencies.
This documentation outlines the implementation of RCI in the CrewAI library, leveraging LangChain-based agents. The implementation consists of three primary methods: critique, rectify, and critique_and_iterate. These methods work in sequence to evaluate, refine, and finalize AI-generated outputs.
Methods
1. critique
This method analyzes an AI-generated output and provides constructive critiques to improve alignment with the given task description and expected output. The critique process strictly avoids minor stylistic or paraphrasing changes and focuses solely on substantive discrepancies.
Parameters:
- agent (BaseAgent): The agent responsible for processing the task.
- current_output (str): The output currently under evaluation.
- description (str): A detailed description of the task.
- expected_output (str): The anticipated outcome of the task.
- context (Optional[str]): Additional contextual information relevant to the task.
- tools (Optional[List[BaseTool]]): A set of tools the agent may utilize during critique.
Returns:
- critique_response (str): A critique detailing any inconsistencies, errors, or missing elements in the output. If the output is deemed satisfactory, the response will be exactly:
NO ISSUES FOUND.
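As a rough sketch of how the parameters above might be assembled into a critique prompt, the helper below illustrates the idea. The function name and prompt wording are hypothetical, not the library's actual template:

```python
from typing import Optional

def build_critique_prompt(
    current_output: str,
    description: str,
    expected_output: str,
    context: Optional[str] = None,
) -> str:
    """Assemble a critique prompt from the task definition (hypothetical helper)."""
    parts = [
        f"Task description: {description}",
        f"Expected output: {expected_output}",
        f"Current output: {current_output}",
        "Critique substantive discrepancies only; ignore style and paraphrasing.",
        "If the output is satisfactory, reply exactly: NO ISSUES FOUND",
    ]
    if context:
        # Slot the context between the task definition and the output under review.
        parts.insert(2, f"Context: {context}")
    return "\n".join(parts)
```

The sentinel string in the last line is what `critique_and_iterate` checks for to halt the loop.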
2. rectify
This method refines the AI-generated output based on critiques provided by the critique method. It modifies the output only to resolve identified issues, preserving all other aspects of the original response.
Parameters:
- agent (BaseAgent): The agent responsible for processing the rectification task.
- critique_response (str): The critique feedback from the critique method.
- current_output (str): The output to be refined.
- description (str): A description of the task requirements.
- expected_output (str): The expected final output.
- context (Optional[str]): Additional relevant task context.
- tools (Optional[List[BaseTool]]): Tools available for use in rectification.
Returns:
- rectified_output (str): A revised version of the output addressing the critiques.
3. critique_and_iterate
This method manages the full recursive process of critique and rectification until the output meets expectations or reaches a predefined maximum iteration count (rci_max_count).
Parameters:
- agent (BaseAgent): The agent responsible for iterative refinement.
- initial_output (str): The first AI-generated output.
- description (str): A description of the task.
- expected_output (str): The anticipated outcome of the task.
- context (Optional[str]): Contextual information for task execution.
- tools (Optional[List[BaseTool]]): Tools available for critique and refinement.
Returns:
- final_output (str): The refined output after iterative processing.
The method performs multiple iterations of critique and rectification up to rci_max_count. If the critique response is NO ISSUES FOUND, the iteration halts, and the current output is returned as the final output.
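The control flow described above can be sketched as a plain loop. The stub `critique` and `rectify` callables below stand in for the real agent calls; this is an illustration of the halting logic, not the library implementation:

```python
from typing import Callable

NO_ISSUES = "NO ISSUES FOUND"

def critique_and_iterate_sketch(
    initial_output: str,
    critique: Callable[[str], str],
    rectify: Callable[[str, str], str],
    rci_max_count: int = 3,
) -> str:
    """Iterate critique -> rectify until no issues remain or the cap is hit."""
    output = initial_output
    for _ in range(rci_max_count):
        feedback = critique(output)
        if feedback.strip() == NO_ISSUES:
            break  # output meets expectations; stop early
        output = rectify(output, feedback)
    return output

# Toy demo: the critique accepts the output once it mentions "3".
demo_critique = lambda out: NO_ISSUES if "3" in out else "The count is wrong."
demo_rectify = lambda out, fb: "There are 3 r's in 'strawberry'."
print(critique_and_iterate_sketch("There are 2 r's.", demo_critique, demo_rectify))
# → There are 3 r's in 'strawberry'.
```

Note that the loop runs at most `rci_max_count` critiques, so an output that never satisfies the critic is still returned after the cap is reached.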
Task Class Enhancements
The Task class has been updated to support the RCI process. Key attributes include:
- rci (bool): Enables or disables the RCI process.
- rci_max_count (int): Sets the maximum iterations for RCI.
- description & expected_output: Provide detailed task definitions for accurate critique.
- agent & tools: Ensure the execution environment has the necessary resources.
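A minimal stand-in for the new attributes and their interaction is shown below. This is a plain dataclass for illustration, not the actual pydantic `Task` model; the validation mirrors the `rci_max_count >= 1` constraint suggested elsewhere in this thread:

```python
from dataclasses import dataclass

@dataclass
class TaskRCIConfig:
    """Illustrative stand-in for the RCI-related Task attributes."""
    description: str
    expected_output: str
    rci: bool = False       # enables or disables the RCI process
    rci_max_count: int = 1  # maximum critique/rectify iterations

    def __post_init__(self) -> None:
        if self.rci_max_count < 1:
            raise ValueError("rci_max_count must be at least 1")

cfg = TaskRCIConfig(
    description="Count the r's in 'strawberry'",
    expected_output="A single integer",
    rci=True,
    rci_max_count=3,
)
print(cfg.rci, cfg.rci_max_count)  # → True 3
```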
Summary
The RCI framework (Critique, Rectify, and Iterate) forms a structured method to refine AI-generated outputs efficiently. By continuously evaluating and improving responses, the system ensures alignment with task expectations while reducing errors. This method enhances AI reliability in complex tasks and structured workflows.
Some test cases will fail because certain assertions count the number of times the _execute() function is called, and enabling RCI increases that count.
Disclaimer: This review was made by a crew of AI Agents.
Code Review Comment for PR #2240: Recursive Criticism and Iteration Process
Overview
This pull request introduces a new Recursive Criticism and Iteration (RCI) feature in the Task model, enabling automated quality checks and iterative improvements to task outputs. Below are my findings and suggestions based on the implementation.
Detailed Analysis
1. New Field Additions
The addition of rci and rci_max_count fields is a positive enhancement, allowing for configurable iterations of the RCI process.
Suggested Improvement:
- Input Validation: Add validation to rci_max_count to ensure no negative numbers are accepted. Additionally, consider imposing a maximum limit on iterations to prevent excessive use.

```python
rci_max_count: int = Field(
    default=1,
    ge=1,  # Ensure it's at least 1
    le=5,  # Limit to a maximum of 5 iterations
    description="Number of iterations to run the RCI process (1-5).",
)
```
2. Critique and Iterate Implementation
The implementation in critique_and_iterate effectively integrates the recursive process. However, a few improvements can enhance safety and clarity:
Issues Identified:
- Missing return type hints in method signatures.
- Inadequate error handling which could lead to unhandled exceptions during runtime.
- Potential side effects caused by mutating self.description and self.expected_output.
Suggested Improvements:
- Add Type Hints and Error Handling:

```python
def critique_and_iterate(
    self,
    agent: BaseAgent,
    initial_output: str,
    description: str,
    expected_output: str,
    context: Optional[str],
    tools: Optional[List[BaseTool]],
) -> str:  # Specify return type
    ...
```

- State Management: Ensure original values are restored correctly after usage to prevent side effects.
3. Critique Method
The critique method generates feedback but has room for enhancements:
Issues Identified:
- Hardcoded prompt templates may reduce flexibility.
- There is no format validation for the critique responses.
- Missing log statements can hinder debugging efforts.
Suggested Improvement:
- Incorporate Logging:

```python
logger.debug(f"Starting critique for task: {self.name}")
```
4. General Recommendations
- Testing: Ensure unit tests for the RCI process are added to verify functionality across various scenarios, including edge cases.
- Documentation Enhancements: Expand docstrings with usage examples and document possible failure modes, clarifying the configuration guidelines for rci_max_count.
- Performance Optimizations: Consider implementing caching for repeated critiques and develop a timeout mechanism to manage long-running iterations.
- Monitoring and Logging: Introduce logging for success rates and failure metrics to facilitate tracking of improvement percentages over time.
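One way the suggested critique caching could look is a memoization wrapper keyed on the output text, so identical outputs are not re-evaluated by the LLM. This wrapper is a hypothetical sketch, not part of the PR:

```python
from typing import Callable, Dict

def cached_critique(critique: Callable[[str], str]) -> Callable[[str], str]:
    """Memoize critique calls so identical outputs trigger only one LLM call."""
    cache: Dict[str, str] = {}

    def wrapper(output: str) -> str:
        if output not in cache:
            cache[output] = critique(output)  # one real call per distinct output
        return cache[output]

    return wrapper

# Demo: track how many times the underlying critique actually runs.
calls = []
fake_critique = cached_critique(lambda out: calls.append(out) or "NO ISSUES FOUND")
fake_critique("same output")
fake_critique("same output")
print(len(calls))  # → 1
```

A real implementation would also need cache invalidation if the task description changes between iterations, since the critique depends on both the output and the task definition.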
Conclusion
Overall, the RCI feature provides a structured approach to improving task outputs through iteration. However, the implementation requires further refinements to improve robustness, maintainability, and overall performance in production scenarios.
Emphasizing these improvements will not only solidify the feature's reliability but also enhance the user experience and developer maintainability in the long run. Thank you for your efforts on this significant enhancement!
The changes flagged by the review bot have been addressed in the recent commits.
Hey @chandrakanth137!
Could you please share a demo of this feature running and show how the output is improved by using this feature?
Also, it looks like there is a conflict in the task.py
@bhancockio The conflict has been resolved. Regarding the test cases, RCI is useful with smaller LLMs, especially ones that struggle to reason. Is there any specific kind of test case that you would like me to demo?
For example: recent LLMs can solve the question "How many r's are in the word 'Strawberry'?", but small LLMs like Llama3:4b, Llama3.2:1b, and Llama3.2:3b struggle to answer this question correctly; using RCI, they were able to answer it correctly.
This PR is stale because it has been open for 45 days with no activity.
@chandrakanth137 Would you mind removing unrelated commits from this PR?
@lucasgomide Okay, is it okay if I post it as a new PR instead of editing this PR?
It's ok