Add Computer.tracing for Recording Sessions

Open • ddupont808 opened this issue 6 months ago • 1 comment

Problem

Currently, session recording is only available through ComputerAgent(save_trajectory=...) or the Computer demonstration UI, which has several limitations:

Limited to ComputerAgent: Users who implement custom agents or call Computer directly cannot record sessions
Inflexible for advanced use cases: Training, replay, and debugging scenarios need more customizable recording options (e.g., storing accessibility trees)
Format inconsistency: ComputerAgent and the Computer demonstration Gradio UI use different recording formats
No human-in-the-loop support: Manual interactions and hybrid workflows can't be properly recorded

Proposed Solution

Add a Computer.tracing API inspired by Playwright's tracing functionality:

# Start tracing with configurable options
await computer.tracing.start({
    'video': True,
    'screenshots': True,
    'api_calls': True,
    'accessibility_tree': True,  # For training/debugging
    'metadata': True  # Custom metadata support
})

# Perform agent operations
agent = ComputerAgent(computer=computer, ...)
async for _ in agent.run("open trycua/cua"):
    pass

# Or direct computer operations
await computer.interface.click(x, y)
await computer.interface.type("hello world")

# Stop tracing and save
await computer.tracing.stop({'path': 'trace.zip'})
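To make the proposed start/stop flow concrete, here is a rough sketch of how a recorder behind it might buffer events and package them into an archive. Everything in it is an assumption for illustration, not the cua implementation: the `Tracing` class, its `record` helper, the `FakeInterface` stand-in for `computer.interface`, and the `options.json` / `events.jsonl` layout inside the zip are all hypothetical. It only covers the `api_calls` category; video, screenshots, and accessibility trees would need real capture backends.

```python
import asyncio
import io
import json
import time
import zipfile
from typing import Any, Dict, List, Optional


class Tracing:
    """Hypothetical recorder: buffers events in memory and zips them on stop()."""

    def __init__(self) -> None:
        self.options: Dict[str, Any] = {}
        self.events: List[Dict[str, Any]] = []
        self.active = False

    async def start(self, options: Optional[Dict[str, Any]] = None) -> None:
        if self.active:
            raise RuntimeError("tracing already started")
        self.options = dict(options or {})
        self.events = []
        self.active = True

    def record(self, kind: str, **data: Any) -> None:
        # Drop the event if tracing is off or this category was not enabled.
        if self.active and self.options.get(kind, False):
            self.events.append({"ts": time.time(), "kind": kind, **data})

    async def stop(self, options: Optional[Dict[str, Any]] = None) -> bytes:
        if not self.active:
            raise RuntimeError("tracing not started")
        self.active = False
        buf = io.BytesIO()
        with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
            # Assumed archive layout: the options used, plus one JSON event per line.
            zf.writestr("options.json", json.dumps(self.options))
            zf.writestr("events.jsonl", "\n".join(json.dumps(e) for e in self.events))
        payload = buf.getvalue()
        path = (options or {}).get("path")
        if path:  # Only touch the filesystem when a path is given.
            with open(path, "wb") as f:
                f.write(payload)
        return payload


class FakeInterface:
    """Stand-in for computer.interface; performs no real input, only logs calls."""

    def __init__(self, tracing: Tracing) -> None:
        self._tracing = tracing

    async def click(self, x: int, y: int) -> None:
        self._tracing.record("api_calls", method="click", args=[x, y])

    async def type(self, text: str) -> None:
        self._tracing.record("api_calls", method="type", args=[text])


async def demo() -> bytes:
    tracing = Tracing()
    interface = FakeInterface(tracing)
    await tracing.start({"api_calls": True})
    await interface.click(100, 200)
    await interface.type("hello world")
    return await tracing.stop()  # No path given, so the archive stays in memory.


if __name__ == "__main__":
    archive = asyncio.run(demo())
    print(f"trace archive: {len(archive)} bytes")
```

Keeping the recorder separate from the interface means the same event stream could be fed by ComputerAgent, direct Computer calls, or a human-driven session, which is the format consistency the issue asks for.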

Use Cases

Custom agent development: Record sessions during agent development and testing
Training data collection: Capture rich interaction data for model training
RPA debugging: Record robotic process automation workflows to diagnose failures and optimize performance
UI unit testing: Capture automated UI test sessions for test result analysis and flaky test debugging
Human-in-the-loop: Record mixed human/agent sessions for workflow analysis
Compliance/audit: Keep records of automated actions for regulatory purposes
Performance monitoring: Record sessions to analyze agent performance and identify bottlenecks

Jun 20 '25 16:06 ddupont808

I’m interested in implementing Computer.tracing for session recording. I plan to create a modular async API that supports video, screenshots, API calls, accessibility trees, and metadata, and that works with both ComputerAgent and direct Computer operations. This would enable richer session recording for training, debugging, and human-in-the-loop workflows. @f-trycua, could you assign this to me?

Oct 07 '25 10:10 santhosh-7777