Feature: Team Agent, Adv. Terminal, MCP Tools, GPU/CPU Instruments & CUDA Docker

Open deciduus opened this issue 8 months ago • 4 comments

Summary
This PR introduces several major enhancements and fixes across the codebase, focusing on improved agent tooling, Docker support, and advanced media generation capabilities.


🚀 Added

  • Standalone CUDA Dockerfile:
    • Enables GPU-accelerated workflows for supported tasks.
  • Team Agent Tool:
    • New tool for collaborative agent operations.
  • Image Generation & Music Generation:
    • Both tools now support CPU-based (Debian slim) and GPU (CUDA) images.
    • Each uses a dedicated .sh script to create and manage an instruments_venv for dependencies.
    • Includes heartbeat mechanisms to prevent unnecessary terminal passback and API waste.

🛠️ Edited

  • Initialize.sh:
    • Added -f (force) flag to resolve permissions issues.
  • Team Agent Prompts:
    • Enhanced prompts for improved Team Agent support.
  • Code Execution & Input Tools:
    • Improved terminal management for complex coding tasks, including support for individual terminal reset.
    • Code execution tool now uses a rolling timeout window:
      • If idle for 10 seconds, execution is terminated and passed back.
      • No maximum execution time if the process remains responsive.
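
For reviewers, the rolling timeout described above can be sketched roughly as follows. This is a minimal illustration, not the code in code_execution_tool.py: `read_with_idle_timeout`, `read_chunk`, and the injectable `clock`/`sleep` parameters are hypothetical names, and only the 10-second idle window comes from this PR.

```python
import time
from typing import Callable, List, Optional


def read_with_idle_timeout(
    read_chunk: Callable[[], Optional[str]],
    idle_timeout: float = 10.0,
    poll_interval: float = 0.1,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Collect output until the source is silent for idle_timeout seconds.

    read_chunk (hypothetical) returns new text, "" when nothing is
    available yet, or None when the stream has closed.
    """
    output: List[str] = []
    last_activity = clock()
    while True:
        chunk = read_chunk()
        if chunk is None:
            break  # stream closed: nothing more to wait for
        if chunk:
            output.append(chunk)
            last_activity = clock()  # any new output restarts the window
        elif clock() - last_activity >= idle_timeout:
            break  # idle for too long: hand control back to the agent
        else:
            sleep(poll_interval)
    return "".join(output)
```

Because the window restarts on every chunk, a process that keeps printing is never cut off, while a silent one is returned to the agent after `idle_timeout` seconds.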

📝 Notes

  • These changes improve reliability, resource management, and developer experience for both agent and media generation workflows.
  • Please review the new Dockerfiles and scripts for compatibility with your environment.

deciduus avatar May 10 '25 22:05 deciduus

Added:

  • Task edit/update/delete methods, auto dependency toggling, team leader planning phase, and enhanced task document handling for persistent development between teams.

Edit:

  • Enhanced get_terminal_output in code_execution_tool.py to detect common shell prompt patterns via regex and break early.
  • Updated Code Execution Markdown to discourage multi-line f.write() calls to prevent UI errors.
  • Wrapped self.call_extensions("message_loop_prompts", ...) in agent.py in a try/except block to prevent extension errors (e.g., FAISS/KeyError) from crashing the agent loop.
  • Revised problem-solving/bug fixing guidance to discourage common indentation-error prone methods.
  • Updated Code Execution Markdown prompt to discourage line-by-line edits, multiple separate f.write calls, and complex string manipulation, while encouraging reading the code, implementing fixes, and checking syntax.
  • Implemented detection of repeating output patterns (last 128 chars repeated 5 times) in get_terminal_output of code_execution_tool.py to identify likely infinite loops, reset the specific session, and return a detailed message to the agent.
  • Updated Team Agent System Prompts.
  • Truncated looped information in loop detection to reduce redundant context for the LLM.
  • Encouraged heredoc-style multi-line edits (cat > file << 'EOF') for code and documentation.
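
To make the two early-exit checks in get_terminal_output easier to review, here is a rough sketch of the idea. It is an assumption-laden illustration rather than the actual implementation: the prompt regex, the function names, and the use of a simple occurrence count are all hypothetical. Only the 128-character tail and the threshold of 5 repetitions come from the notes above.

```python
import re

# Assumed prompt shapes: "user@host:~/dir$ ", "(venv) user@host:/# ",
# or a bare "$ " / "# " on the last line of output.
PROMPT_RE = re.compile(r"^(?:\([^)]+\)\s*)?(?:[\w.-]+@[\w.-]+:\S*)?[$#]\s*$")


def ends_with_prompt(output: str) -> bool:
    """True if the last line of output looks like a shell prompt."""
    lines = output.splitlines()
    return bool(lines) and PROMPT_RE.match(lines[-1]) is not None


def is_looping(output: str, chunk_size: int = 128, repeats: int = 5) -> bool:
    """True if the last chunk_size chars occur at least `repeats` times."""
    if len(output) < chunk_size * repeats:
        return False
    tail = output[-chunk_size:]
    return output.count(tail) >= repeats


def collapse_loop(output: str, chunk_size: int = 128) -> str:
    """Keep output up to the end of the first occurrence of the tail."""
    tail = output[-chunk_size:]
    first = output.find(tail)
    return output[: first + len(tail)]
```

A caller would stop polling when `ends_with_prompt` is true, and when `is_looping` is true it would reset the session and return `collapse_loop(output)` so the LLM sees the repeated text only once.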

Fix:

  • _parse_number method in dirty_json.py now gracefully handles a lone '+', '-', or empty string by returning it as a string, avoiding a ValueError.
  • Addressed input tool session management issues.
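
The fallback behavior of the _parse_number fix can be illustrated with a standalone sketch. The real method works on the parser's internal character stream; the free function `parse_number` below is a hypothetical simplification that takes an already-collected token.

```python
from typing import Union


def parse_number(token: str) -> Union[int, float, str]:
    """Parse a numeric token, returning it unchanged if it is not a number."""
    try:
        return int(token)
    except ValueError:
        pass
    try:
        return float(token)
    except ValueError:
        # A lone "+", "-", or "" is not a number: return it as a string
        # instead of letting ValueError propagate.
        return token
```

Note that Python's `float()` also accepts tokens such as "nan" and "inf", so a stricter parser may want to reject those explicitly.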

deciduus avatar May 13 '25 17:05 deciduus

Feature: MCP Tool Integration, Auto-Setup, and Enhancements (Post-#384)

This PR builds upon the work in #384, primarily integrating Rafa's MCP tool implementation (from @Rafael-U #332, with commit history preserved via cherry-picking) and introducing several enhancements to make MCP server usage more robust and user-friendly.

Key Updates in this PR:

  • MCP Integration & Enhancements (based on #332):

    • Successfully integrated the core MCP tool functionality.
    • Added comprehensive setup documentation (docs/mcp_setup.md) to guide users through configuring MCP servers via the UI.
    • Implemented automatic global installation of MCP server packages (for npx-based servers) on system restart, streamlining the setup process.
    • Enhanced MCP tool interactions by providing contextual reminders in the agent's history. This includes details of the original tool call, the related user prompt, and suggested next steps to help agents maintain context, especially with potentially long or complex MCP tool outputs.
    • Renamed mcp.py to python/helpers/mcp_handler.py to avoid potential namespace conflicts with the mcp library.
  • System & Configuration Improvements:

    • Included configuration optimizations specifically for Debian Slim and CUDA Docker images to ensure successful MCP dependency handling and installation.
    • Upgraded the system to Node.js 20.x to support modern npm/npx requirements, crucial for MCP server package management (this impacts pre-installation steps for Debian and handling within CUDA environments).
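
As a rough picture of the npx auto-install step mentioned above, the sketch below collects package names from an MCP server config and builds the corresponding `npm install -g` commands. The config shape (`mcpServers` with `command` and `args`), the function names, and the example package name are assumptions for illustration; the actual logic lives in python/helpers/mcp_handler.py and may differ.

```python
from typing import Any, Dict, List


def npx_packages(config: Dict[str, Any]) -> List[str]:
    """Return package names of npx-based servers (assumed config shape)."""
    packages: List[str] = []
    for server in config.get("mcpServers", {}).values():
        if server.get("command") != "npx":
            continue  # only npx-based servers are installed globally
        # Assumption: the first non-flag argument is the package name.
        for arg in server.get("args", []):
            if not arg.startswith("-"):
                if arg not in packages:
                    packages.append(arg)
                break
    return packages


def install_commands(packages: List[str]) -> List[List[str]]:
    """Build one `npm install -g <package>` argv list per package."""
    return [["npm", "install", "-g", pkg] for pkg in packages]


# Hypothetical config; "example-mcp-server" is a placeholder package name.
example_config = {
    "mcpServers": {
        "demo": {"command": "npx", "args": ["-y", "example-mcp-server"]},
        "local": {"command": "python", "args": ["server.py"]},
    }
}
```

Each returned argv list could be passed to `subprocess.run` at startup; the sketch stops short of executing anything so it stays side-effect free.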

Current Status:

Initial reviews indicate that various MCPs are functioning as expected. There might still be some minor quirks to address. The code_execution.md prompt has not yet been slimmed down but will be attended to in a subsequent update.

This PR aims to significantly improve the MCP tooling experience within Agent Zero.

deciduus avatar May 14 '25 05:05 deciduus

Edits to clean up the PR based on Jan's feedback:

  • Slimmed the Docker builds by removing unnecessary packages from preinstall.
  • Cleaned up redundant MCP config in agent.py.
  • Cleaned up the MCP auto-install and check logs that were used for debugging, so they are now concise and informative.

deciduus avatar May 14 '25 21:05 deciduus

Added a new instrument that retrieves YouTube video transcripts from a provided link.

Resolved merge conflicts.

deciduus avatar May 18 '25 21:05 deciduus