ag2 icon indicating copy to clipboard operation
ag2 copied to clipboard

Agent orchestrator/ aligner with custom speaker selection

Open codeloki15 opened this issue 4 months ago • 8 comments

Why are these changes needed?

This PR introduces improvements to the multi-agent orchestration pipeline.
Specifically:

  • Added an Agent Aligner module to enforce strict workflow order (Plan → Confirm → Write → Confirm → Execute).
  • Enhanced error handling and rollback logic to return safely to planner/debugger when failures occur.

These changes solve the problem of uncoordinated agent execution and reduce risks of redundant calls, unsafe code execution, or workflow drift.

codeloki15 avatar Aug 21 '25 19:08 codeloki15

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar Aug 21 '25 19:08 CLAassistant

@randombet up for review. Thanks

codeloki15 avatar Aug 21 '25 19:08 codeloki15

@codeloki15 Thanks for the contribution!! Could you run pre-commit: https://docs.ag2.ai/latest/docs/contributor-guide/pre-commit/

qingyun-wu avatar Aug 23 '25 04:08 qingyun-wu

@qingyun-wu and @randombet thanks for the comment, I have pushed the changes after pre-commit no error persist.

codeloki15 avatar Aug 23 '25 07:08 codeloki15

@codeloki15 could you add a code block about the installation info, e.g., are you using pip install ag2[openai]? Any other additional options than openai? Should be ready to merge once this is added.

qingyun-wu avatar Aug 23 '25 23:08 qingyun-wu

@qingyun-wu Thanks .. pushed the changes.

codeloki15 avatar Aug 24 '25 14:08 codeloki15

PR Review: Agent Orchestrator/Aligner with Custom Speaker Selection

Summary

This PR introduces a comprehensive multi-agent orchestration notebook. The concept is solid and educational, but there are several critical issues that need to be addressed before merging.


Critical Issues (Blockers)

1. Hardcoded Local File Paths

Location: Cells 2, 4, 10

The notebook contains hardcoded paths like:

  • /Users/lokesh/Desktop/code/config/groq_gpt_oss_120.json
  • /Users/lokesh/Desktop/code/MCP_test/software_eng/files/

Impact: Will fail for all users who don't have these exact paths.

Fix: Use autogen.LLMConfig.from_json(path="OAI_CONFIG_LIST") pattern like other notebooks.

2. Missing Setup Instructions

Need MDX-formatted installation and configuration section at the beginning (see agentchat_groupchat.ipynb for example).

3. Typos

  • Cell 10: "form file path" → "from file path"
  • Cell 6: "Proccesscompletion_Agent" → "process_completion"

4. Inconsistent Agent Naming

  • Agent defined as code_writer but referenced as "CodeWriter"
  • agent_aligner references non-existent "code_writer_agent" and "code_executor_agent" This will cause runtime errors.

Security Concerns

5. Code Execution Without Oversight

  • human_input_mode="NEVER" removes all human oversight
  • No Docker usage or warnings
  • Agents instructed to use subprocess.popen for pip install (arbitrary package installation risk)

Fix: Add security warnings, consider use_docker=True, change to human_input_mode="TERMINATE"


Other Issues

Performance

  • max_round=500 is excessive and expensive. Consider 20-30.
  • temperature=0.15 may be too rigid. Document the choice.

Code Quality

  • Unused variable: today_date = datetime.now()
  • Magic numbers without explanation (timeout=3000, max_consecutive_auto_reply=5)
  • Very long system message in agent_aligner (100+ lines, high token cost)

Architecture

  • String matching for routing ("planner_agent" in content) is fragile
  • Should use structured output/Pydantic as mentioned in notebook
  • Missing error handling for malformed responses

Testing

  • No unit tests for custom_speaker_selection_func
  • No integration tests for workflow
  • No validation of routing logic

Positive Aspects

  1. Excellent educational documentation with clear tables and diagrams
  2. Well-defined Plan → Write → Execute workflow pattern
  3. Good error handling with Debugger agent loop
  4. Innovative Agent Aligner pattern
  5. Comprehensive explanations make complex concepts accessible

Recommendation

REQUEST CHANGES - Please address the 5 blocker issues above before this can be merged.

The core concept is valuable and will benefit the community once these issues are resolved. Thank you for this contribution!


Reviewed by: Claude Code
Date: 2025-10-30

github-actions[bot] avatar Oct 30 '25 18:10 github-actions[bot]

Re-Review: Agent Orchestrator/Aligner with Custom Speaker Selection

Status: CHANGES STILL REQUIRED ⚠️

I've reviewed this PR as a follow-up to the previous Claude Code review from October 30, 2025. Unfortunately, the critical blocking issues identified in that review have not been addressed. The PR cannot be merged in its current state.


🚨 CRITICAL ISSUES (Must Fix Before Merge)

1. Hardcoded Local File Paths - BLOCKING

Locations: Lines 125, 133, 488

The notebook contains hardcoded paths that won't work for other users:

  • /Users/lokesh/Desktop/code/config/groq_gpt_oss_120.json
  • /Users/lokesh/Desktop/code/MCP_test/software_eng/files/
  • /Users/lokesh/Desktop/code/Newton/Newton_data/data/dermatology.csv

Required Fix:

  • Use autogen.LLMConfig.from_json(path="OAI_CONFIG_LIST") pattern
  • Use relative paths or environment variables for data files
  • Provide sample data or download instructions

2. Inconsistent Agent Naming - BLOCKING (Runtime Error)

Location: Lines 310-315, 345-350

Agent defined as CodeWriter but routing logic looks for code_writer_agent. This will cause KeyError at runtime.

Required Fix: Make agent names consistent throughout

3. Typos - BLOCKING

  • Line 488: "form file path" → "from file path"
  • Line 350: "Proccesscompletion_Agent" → "ProcessCompletion_Agent"

4. Missing Setup Instructions - BLOCKING

Need MDX-formatted installation section like other notebooks

5. Missing LLM Configuration Docs - BLOCKING

Need to document how to set up OAI_CONFIG_LIST


🔒 SECURITY CONCERNS

6. Unsafe Code Execution

  • human_input_mode="NEVER" removes all human oversight
  • No Docker usage
  • Agents instructed to use subprocess.popen for arbitrary pip installs

Recommended: Change to human_input_mode="TERMINATE" and add security warnings


⚠️ HIGH PRIORITY ISSUES

7. Excessive max_round=500

This is extremely high and could cause high costs and long execution times. Recommend 20-30.

8. Unused Variable

Line 115: today_date = datetime.now() - never used

9. Magic Numbers

Lines 131, 163, 169 have unexplained values (temperature=0.15, timeout=3000, etc.)


📊 ARCHITECTURAL CONCERNS

10. Fragile String-Based Routing

The custom speaker selection relies on string matching which is brittle. Consider implementing Pydantic structured outputs as mentioned in the notebook.

11. Very Long System Message

The agent_aligner system message is 110+ lines, consuming many tokens per request.


🧪 TESTING CONCERNS

12. No Tests

Missing unit tests for custom_speaker_selection_func and integration tests for the workflow.


✅ POSITIVE ASPECTS

  1. Excellent Documentation - Outstanding educational content with tables and diagrams
  2. Clear Architecture - Well-defined Plan → Write → Execute workflow
  3. Innovative Pattern - Agent Aligner concept is valuable
  4. Comprehensive Explanations - Makes complex concepts accessible
  5. Error Handling - Good use of Debugger agent

📝 REQUIRED CHANGES SUMMARY

Before merge, you MUST:

  1. Remove all hardcoded local paths
  2. Fix agent naming inconsistencies
  3. Fix all typos
  4. Add proper setup/installation instructions
  5. Add LLM configuration documentation
  6. Address security concerns or add warnings
  7. Reduce max_round to 20-30
  8. Remove unused variables

Recommended:

  • Add unit tests
  • Implement Pydantic structured outputs
  • Document magic numbers
  • Add Docker usage examples

🎯 RECOMMENDATION

REQUEST CHANGES - The core concept is excellent, but critical issues must be resolved before merging.


🤖 Re-reviewed by: Claude Code
📅 Date: 2025-11-26
🔗 Previous Review: October 30, 2025

github-actions[bot] avatar Nov 26 '25 20:11 github-actions[bot]

Third Review: Agent Orchestrator/Aligner with Custom Speaker Selection

Status: ⚠️ STILL REQUIRES CRITICAL CHANGES

I have reviewed this PR following up on the two previous Claude Code reviews (October 30, 2025 and November 26, 2025). Unfortunately, all critical blocking issues identified in both previous reviews remain unresolved. This PR cannot be merged until these fundamental problems are addressed.


🚨 BLOCKING ISSUES (Must Fix Before Merge)

1. Hardcoded Local File Paths - CRITICAL BLOCKER ❌

Locations: Lines 125, 133, 488 in the notebook

The notebook contains three hardcoded local file paths that will cause immediate failure for any user:

  • /Users/lokesh/Desktop/code/config/groq_gpt_oss_120.json
  • /Users/lokesh/Desktop/code/MCP_test/software_eng/files/
  • /Users/lokesh/Desktop/code/Newton/Newton_data/data/dermatology.csv

Impact: The notebook is completely non-functional for anyone except the original author.

Required Fix: Use the standard AG2 pattern: autogen.config_list_from_json(env_or_file="OAI_CONFIG_LIST")


2. Agent Name Inconsistencies - RUNTIME ERROR ❌

The routing logic contains critical mismatches:

  • Agent defined as CodeWriter but routing looks for "code_writer_agent"
  • Agent defined as process_completion but routing looks for "Proccesscompletion_Agent" (also misspelled)

Impact: This will cause KeyError exceptions at runtime when agent_aligner tries to route to these agents.


3. Typo in Example Code - BLOCKER ❌

Line 488: "form file path" should be "from file path"


4. Missing Installation Instructions - BLOCKER ❌

The notebook lacks proper MDX-formatted installation instructions that are standard across all AG2 notebooks.


5. Missing LLM Configuration Documentation - BLOCKER ❌

Following AG2 standards, you need to add documentation about configuring LLMs after the config_list cell.


🔒 CRITICAL SECURITY CONCERNS

6. Unsafe Code Execution Configuration - HIGH RISK ⚠️

human_input_mode="NEVER" removes all human oversight for code execution. Combined with no Docker sandboxing and instructions to use subprocess.popen for arbitrary pip installations, this creates significant security risks.

Recommended: Change to human_input_mode="TERMINATE" and consider Docker usage, or add prominent security warnings.


⚠️ HIGH PRIORITY ISSUES

  1. Excessive max_round=500 - Should be 20-30 (standard for AG2 notebooks)
  2. Unused Variable - today_date = datetime.now() is never used
  3. Magic Numbers Without Explanation - temperature=0.15, timeout=3000, etc.
  4. Inconsistent API Usage - GroupChatManager initialization should use groupchat= keyword
  5. plt.save vs plt.savefig - Multiple references to plt.save which does not exist (should be plt.savefig())

📝 CODE QUALITY ISSUES

  1. Fragile String-Based Routing - The notebook mentions using Pydantic but does not implement it
  2. Verbose System Messages - 110+ line system message with high token consumption
  3. No Tests Included - No unit or integration tests
  4. No Working Example - Example uses hardcoded paths, not runnable by others

✅ POSITIVE ASPECTS

  1. 📚 Outstanding Documentation - Exceptionally clear and well-structured
  2. 🎯 Innovative Architecture - The Agent Aligner pattern is creative and valuable
  3. 📊 Great Visual Tables - Very helpful for understanding components
  4. 🔄 Solid Workflow Design - Well thought out Plan → Write → Execute flow
  5. 🐛 Good Error Handling Concept - Smart Debugger agent integration
  6. 📖 Educational Value - Great learning resource once issues are fixed

📋 REQUIRED CHANGES CHECKLIST

MUST fix before merge:

  • [ ] Remove ALL hardcoded local file paths (3 locations)
  • [ ] Fix agent name inconsistencies
  • [ ] Fix "form" → "from" typo
  • [ ] Add standard MDX installation instructions
  • [ ] Add LLM configuration documentation
  • [ ] Address security concerns OR add prominent warnings
  • [ ] Reduce max_round from 500 to 20-30
  • [ ] Remove unused today_date variable
  • [ ] Fix plt.saveplt.savefig() (4 locations)
  • [ ] Fix GroupChatManager initialization
  • [ ] Provide a working example with accessible data

🎯 RECOMMENDATION

REQUEST CHANGES - This PR requires significant revisions before it can be merged.

The fundamental concept is excellent and would be a valuable addition to the AG2 documentation. However, three reviews have now identified the same blocking issues. Please prioritize fixing the 5 critical blockers listed at the top.

Once these are resolved, this will be a fantastic contribution to the AG2 community!


🤖 Reviewed by: Claude Code (Third Review)
📅 Date: December 15, 2025
🔗 Previous Reviews: October 30, 2025 | November 26, 2025

github-actions[bot] avatar Dec 15 '25 20:12 github-actions[bot]