Feature: Team Agent, Adv. Terminal, MCP Tools, GPU/CPU Instruments & CUDA Docker
Summary
This PR introduces several major enhancements and fixes across the codebase, focusing on improved agent tooling, Docker support, and advanced media generation capabilities.
🚀 Added
- Standalone CUDA Dockerfile:
- Enables GPU-accelerated workflows for supported tasks.
- Team Agent Tool:
- New tool for collaborative agent operations.
- Image Generation & Music Generation:
- Both tools now support CPU-based (Debian slim) and GPU (CUDA) images.
- Each uses a dedicated `.sh` script to create and manage an `instruments_venv` for dependencies.
- Includes heartbeat mechanisms to prevent unnecessary terminal passback and API waste.
🛠️ Edited
- Initialize.sh:
- Added `-f` (force) flag to resolve permissions issues.
- Team Agent Prompts:
- Enhanced prompts for improved Team Agent support.
- Code Execution & Input Tools:
- Improved terminal management for complex coding tasks, including support for individual terminal reset.
- Code execution tool now uses a rolling timeout window:
- If idle for 10 seconds, execution is terminated and passed back.
- No maximum execution time if the process remains responsive.
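A minimal sketch of the rolling timeout window described above, assuming output is read line by line from a subprocess pipe. The function name `run_with_idle_timeout` and its parameters are hypothetical and are not taken from `code_execution_tool.py`; only the behaviour (stop after an idle period, no cap on total runtime) comes from this PR.

```python
import queue
import subprocess
import sys
import threading


def run_with_idle_timeout(cmd, idle_timeout=10.0):
    """Run cmd; stop it if it produces no output for idle_timeout seconds.

    Returns (output, timed_out). Total runtime is not capped: the wait
    restarts every time a new line of output arrives.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    lines = queue.Queue()

    def pump():
        # Forward each output line to the queue; None marks end of stream.
        for line in proc.stdout:
            lines.put(line)
        lines.put(None)

    threading.Thread(target=pump, daemon=True).start()

    collected = []
    timed_out = False
    while True:
        try:
            # Each get() waits at most idle_timeout, so the window "rolls"
            # forward whenever output is received.
            line = lines.get(timeout=idle_timeout)
        except queue.Empty:
            # Idle for too long: terminate and hand control back.
            timed_out = True
            proc.kill()
            break
        if line is None:
            break
        collected.append(line)
    proc.wait()
    return "".join(collected), timed_out


if __name__ == "__main__":
    # A responsive child process is allowed to finish normally.
    child = "import time\nfor i in range(3):\n    print(i, flush=True)\n    time.sleep(0.1)"
    print(run_with_idle_timeout([sys.executable, "-c", child], idle_timeout=2.0))
```

A process that keeps printing is never interrupted, while one that goes silent is killed after `idle_timeout` seconds and whatever output was collected is returned.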
📝 Notes
- These changes improve reliability, resource management, and developer experience for both agent and media generation workflows.
- Please review the new Dockerfiles and scripts for compatibility with your environment.
Added:
- Task edit/update/delete methods, auto dependency toggling, team leader planning phase, and enhanced task document handling for persistent development between teams.
Edit:
- Enhanced `get_terminal_output` in `code_execution_tool.py` to detect common shell prompt patterns via regex and break early.
- Updated Code Execution Markdown to discourage multi-line `f.write()` calls to prevent UI errors.
- Wrapped `self.call_extensions("message_loop_prompts", ...)` in `agent.py` in a try/except block to prevent extension errors (e.g., FAISS/KeyError) from crashing the agent loop.
- Revised problem-solving/bug fixing guidance to discourage common indentation-error prone methods.
- Updated Code Execution Markdown prompt to discourage line-by-line edits, multiple separate `f.write` calls, and complex string manipulation, while encouraging reading, implementing fixes, and checking syntax.
- Implemented detection of repeating output patterns (last 128 chars repeated 5 times) in `get_terminal_output` of `code_execution_tool.py` to identify likely infinite loops, reset the specific session, and return a detailed message to the agent.
- Updated Team Agent System Prompts.
- Truncated looped information in loop detection to reduce redundant context for the LLM.
- Encouraged `cat > EOF` multiline edits for code and documentation.
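The two early-exit checks on terminal output mentioned in this list (shell prompt detection and repeating-output detection) might look roughly like the sketch below. The regexes and the names `PROMPT_PATTERNS`, `ends_with_prompt`, and `has_repeating_tail` are hypothetical, not the ones in `code_execution_tool.py`. Treating "repeated 5 times" as "the last 128 characters occur at least 5 times in the output" is also an assumption.

```python
import re

# Hypothetical patterns for common shell prompts; the actual list in
# code_execution_tool.py may differ.
PROMPT_PATTERNS = [
    re.compile(r"^[\w.-]+@[\w.-]+:.*[$#]\s*$"),  # e.g. root@container:/app#
    re.compile(r"^\(.+\)\s.*[$#]\s*$"),          # e.g. (venv) user@host:~$
]


def ends_with_prompt(output):
    """Return True if the last line of output looks like a shell prompt."""
    last_line = output.rstrip("\n").rsplit("\n", 1)[-1]
    return any(p.search(last_line) for p in PROMPT_PATTERNS)


def has_repeating_tail(output, chunk_size=128, repeats=5):
    """Return True if the last chunk_size chars occur at least repeats times.

    A command stuck in a loop tends to print the same text over and over,
    so the tail of its output shows up many times in what was captured.
    """
    if len(output) < chunk_size * repeats:
        return False
    tail = output[-chunk_size:]
    # str.count counts non-overlapping occurrences.
    return output.count(tail) >= repeats
```

When `ends_with_prompt` is true the reader can stop waiting because the command has returned to the shell. When `has_repeating_tail` is true the session is likely looping and can be reset.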
Fix:
- `_parse_number` method in `dirty_jason.py` now gracefully handles lone '+', '-', or empty strings by returning them as strings, avoiding `ValueError`.
- Addressed input tool session management issues.
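As an illustration of the `_parse_number` fix, here is a standalone sketch of the guard. The function `parse_number_token` is hypothetical and is not the parser method itself; it only shows how a lone sign or an empty string can be returned unchanged rather than passed to `int()` or `float()`.

```python
def parse_number_token(token):
    """Convert a numeric token to int or float.

    Lone '+', '-', and empty strings are not numbers, so they are returned
    as strings instead of raising ValueError.
    """
    if token in ("", "+", "-"):
        return token
    # A decimal point or an exponent marker means the token is a float.
    if any(ch in token for ch in ".eE"):
        return float(token)
    return int(token)
```

For example, `parse_number_token("-")` returns `"-"`, while `parse_number_token("-3.5")` returns `-3.5`.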
Feature: MCP Tool Integration, Auto-Setup, and Enhancements (Post-#384)
This PR builds upon the work in #384, primarily integrating Rafa's MCP tool implementation (from @Rafael-U #332, with commit history preserved via cherry-picking) and introducing several enhancements to make MCP server usage more robust and user-friendly.
Key Updates in this PR:
- MCP Integration & Enhancements (based on #332):
- Successfully integrated the core MCP tool functionality.
- Added comprehensive setup documentation (`docs/mcp_setup.md`) to guide users through configuring MCP servers via the UI.
- Implemented automatic global installation of MCP server packages (for `npx`-based servers) on system restart, streamlining the setup process.
- Enhanced MCP tool interactions by providing contextual reminders in the agent's history. This includes details of the original tool call, the related user prompt, and suggested next steps to help agents maintain context, especially with potentially long or complex MCP tool outputs.
- Renamed `mcp.py` to `python/helpers/mcp_handler.py` to avoid potential namespace conflicts with the `mcp` library.
- System & Configuration Improvements:
- Included configuration optimizations specifically for Debian Slim and CUDA Docker images to ensure successful MCP dependency handling and installation.
- Upgraded the system to Node.js 20.x to support modern `npm`/`npx` requirements, crucial for MCP server package management (this impacts pre-installation steps for Debian and handling within CUDA environments).
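The automatic global installation of `npx`-based MCP server packages could be sketched as follows. This is not the code in `mcp_handler.py`: the `mcpServers` config shape, the helper names `npx_packages` and `install_commands`, and the rule "first non-flag argument is the package name" are all assumptions made for illustration. The sketch only builds the `npm install -g` commands and does not run them.

```python
def npx_packages(config):
    """Collect package names from servers launched via npx.

    Assumes a config shaped like
    {"mcpServers": {name: {"command": "npx", "args": [...]}}}.
    """
    packages = []
    for server in config.get("mcpServers", {}).values():
        if server.get("command") != "npx":
            continue
        # Skip npx flags such as -y; the first positional argument is
        # assumed to be the package name.
        for arg in server.get("args", []):
            if not arg.startswith("-"):
                if arg not in packages:
                    packages.append(arg)
                break
    return packages


def install_commands(packages):
    """Build one global npm install command per package."""
    return [["npm", "install", "-g", pkg] for pkg in packages]


example_config = {
    "mcpServers": {
        "files": {
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-filesystem", "/data"],
        },
        "other": {"command": "uvx", "args": ["some-server"]},
    }
}
commands = install_commands(npx_packages(example_config))
```

Flags that take a separate value (for example `--package name`) would need extra handling that this sketch does not attempt.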
Current Status:
Initial reviews indicate that various MCPs are functioning as expected. There might still be some minor quirks to address. The code_execution.md prompt has not yet been slimmed down but will be attended to in a subsequent update.
This PR aims to significantly improve the MCP tooling experience within Agent Zero.
Edits to clean up the PR based on Jan's feedback:
- Slimmed Docker builds by removing unnecessary packages from preinstall.
- Cleaned up redundant MCP config in agent.py.
- Cleaned up the MCP auto-install and check logs used for debugging so they are now concise and informative.
Added a new instrument for retrieving YouTube video transcripts from a provided link.
Resolved merge conflicts.