agentops icon indicating copy to clipboard operation
agentops copied to clipboard

Feature Request: Upload Python Script Files to AgentOps Dashboard

Open devin-ai-integration[bot] opened this issue 5 months ago • 0 comments

Feature Request: Upload Python Script Files to AgentOps Dashboard

Summary

Implement functionality to automatically upload Python script files being executed to the AgentOps dashboard/background, similar to the existing terminal log upload system.

Background

AgentOps currently has a robust system for capturing and uploading terminal logs to the dashboard via:

  • setup_print_logger() in agentops/logging/instrument_logging.py
  • upload_logfile() function that uploads logs when root span completes
  • InternalSpanProcessor that coordinates the upload process
  • V4 API client with upload_logfile() method

This same infrastructure can be extended to also capture and upload the Python script files being executed.

Requirements

Functional Requirements

  • Automatically detect and capture Python script files during execution
  • Upload captured files when the root span completes (similar to log upload timing)
  • Display uploaded files in the AgentOps dashboard alongside trace logs
  • Support multiple file uploads per session
  • Handle file size limits and filtering appropriately

Non-Functional Requirements

  • Minimal performance impact on script execution
  • Secure file handling and upload
  • Error handling for upload failures
  • Configurable file collection (allow users to opt-out)

Technical Design

1. File Collection Mechanism

Extend the existing logging infrastructure to also collect Python files:

Option A: Stack Trace Analysis

  • Use inspect module to analyze the call stack during span creation
  • Extract file paths from stack frames
  • Filter for .py files that are part of the user's project (exclude site-packages)

Option B: Import Hook Integration

  • Leverage the existing import monitoring system in agentops/instrumentation/__init__.py
  • Track imported modules that belong to the user's project
  • Collect file contents during import

Option C: Execution Context Tracking

  • Monitor sys.argv[0] and related execution context
  • Track files via __main__ module and related imports
  • Use sys.modules to identify user-defined modules

2. Storage and Upload

Reuse existing upload infrastructure:

# Extend agentops/logging/instrument_logging.py
_file_buffer = {}  # Dict[str, str] - filename -> content

def collect_script_file(filepath: str) -> None:
    """Collect a Python script file for upload."""
    if filepath.endswith('.py') and is_user_file(filepath):
        try:
            with open(filepath, 'r', encoding='utf-8') as f:
                _file_buffer[filepath] = f.read()
        except Exception as e:
            logger.debug(f"Failed to collect file {filepath}: {e}")

def upload_script_files(trace_id: int) -> None:
    """Upload collected script files to the API."""
    if not _file_buffer:
        return
    
    client = get_client()
    for filepath, content in _file_buffer.items():
        # Use existing upload_object method or create new upload_script_file method
        client.api.v4.upload_script_file(content, trace_id, filepath)
    
    _file_buffer.clear()

3. API Integration

Extend V4 API client in agentops/client/api/versions/v4.py:

def upload_script_file(self, body: Union[str, bytes], trace_id: int, filename: str) -> UploadedObjectResponse:
    """Upload a script file to the API."""
    # Similar to upload_logfile but with filename metadata
    headers = {
        **self.prepare_headers(), 
        "Trace-Id": str(trace_id),
        "Filename": filename
    }
    response = self.post("/v4/scripts/upload/", body, headers)
    # Handle response similar to upload_logfile

4. Integration Points

Modify InternalSpanProcessor in agentops/sdk/processors.py:

def on_end(self, span: ReadableSpan) -> None:
    if self._root_span_id and (span.context.span_id is self._root_span_id):
        try:
            upload_logfile(span.context.trace_id)
            upload_script_files(span.context.trace_id)  # Add this line
        except Exception as e:
            logger.error(f"Error uploading files: {e}")

Implementation Plan

Phase 1: Core File Collection (Week 1)

  • [ ] Implement basic file collection mechanism using stack trace analysis
  • [ ] Add file filtering logic to exclude system/library files
  • [ ] Create file buffer storage similar to log buffer
  • [ ] Add configuration option to enable/disable file collection

Phase 2: Upload Integration (Week 1-2)

  • [ ] Extend V4 API client with upload_script_file method
  • [ ] Integrate file upload into InternalSpanProcessor
  • [ ] Add error handling and logging for file upload failures
  • [ ] Write unit tests for file collection and upload

Phase 3: Dashboard Integration (Week 2-3)

  • [ ] Backend API endpoints to receive and store script files
  • [ ] Database schema updates to associate files with traces
  • [ ] Frontend components to display uploaded files in trace viewer
  • [ ] File viewer/syntax highlighting in dashboard

Phase 4: Production Readiness (Week 3-4)

  • [ ] Performance testing and optimization
  • [ ] File size limits and validation
  • [ ] Security review for file handling
  • [ ] Documentation and examples

Testing Strategy

Unit Tests

  • File collection logic with various project structures
  • Upload functionality with mock API responses
  • Error handling for file access issues
  • Configuration option behavior

Integration Tests

  • End-to-end file collection and upload
  • Multiple file scenarios
  • Large file handling
  • Network failure scenarios

Security Considerations

  • Validate file paths to prevent directory traversal
  • Implement file size limits to prevent abuse
  • Sanitize file content before storage
  • Respect user privacy settings and opt-out mechanisms

Future Enhancements

  • Support for other file types (JSON, YAML, etc.)
  • Selective file collection based on patterns
  • File diff tracking across sessions
  • Integration with version control systems

Dependencies

  • No new external dependencies required
  • Leverages existing OpenTelemetry and upload infrastructure
  • May need backend API changes for new endpoints

References

  • Existing log upload: agentops/logging/instrument_logging.py
  • V4 API client: agentops/client/api/versions/v4.py
  • Span processor: agentops/sdk/processors.py
  • Backend storage: AgentOps.Next/api/agentops/api/storage.py

Acceptance Criteria

  • [ ] Python script files are automatically collected during execution
  • [ ] Files are uploaded to AgentOps backend when trace completes
  • [ ] Files are visible in the dashboard trace viewer
  • [ ] Feature can be enabled/disabled via configuration
  • [ ] No significant performance impact on script execution
  • [ ] Comprehensive test coverage
  • [ ] Documentation updated with new feature