Feature Request: Upload Python Script Files to AgentOps Dashboard

Open devin-ai-integration[bot] opened this issue 5 months ago • 0 comments

Feature Request: Upload Python Script Files to AgentOps Dashboard

Summary

Implement functionality to automatically upload Python script files being executed to the AgentOps dashboard/background, similar to the existing terminal log upload system.

Background

AgentOps currently has a robust system for capturing and uploading terminal logs to the dashboard via:

setup_print_logger() in agentops/logging/instrument_logging.py
upload_logfile() function that uploads logs when root span completes
InternalSpanProcessor that coordinates the upload process
V4 API client with upload_logfile() method

This same infrastructure can be extended to also capture and upload the Python script files being executed.

Requirements

Functional Requirements

Automatically detect and capture Python script files during execution
Upload captured files when the root span completes (similar to log upload timing)
Display uploaded files in the AgentOps dashboard alongside trace logs
Support multiple file uploads per session
Handle file size limits and filtering appropriately

Non-Functional Requirements

Minimal performance impact on script execution
Secure file handling and upload
Error handling for upload failures
Configurable file collection (allow users to opt-out)

Technical Design

1. File Collection Mechanism

Extend the existing logging infrastructure to also collect Python files:

Option A: Stack Trace Analysis

Use inspect module to analyze the call stack during span creation
Extract file paths from stack frames
Filter for .py files that are part of the user's project (exclude site-packages)

Option B: Import Hook Integration

Leverage the existing import monitoring system in agentops/instrumentation/__init__.py
Track imported modules that belong to the user's project
Collect file contents during import

Option C: Execution Context Tracking

Monitor sys.argv[0] and related execution context
Track files via __main__ module and related imports
Use sys.modules to identify user-defined modules

2. Storage and Upload

Reuse existing upload infrastructure:

# Extend agentops/logging/instrument_logging.py
_file_buffer = {}  # Dict[str, str] - filename -> content

def collect_script_file(filepath: str) -> None:
    """Collect a Python script file for upload."""
    if filepath.endswith('.py') and is_user_file(filepath):
        try:
            with open(filepath, 'r', encoding='utf-8') as f:
                _file_buffer[filepath] = f.read()
        except Exception as e:
            logger.debug(f"Failed to collect file {filepath}: {e}")

def upload_script_files(trace_id: int) -> None:
    """Upload collected script files to the API."""
    if not _file_buffer:
        return
    
    client = get_client()
    for filepath, content in _file_buffer.items():
        # Use existing upload_object method or create new upload_script_file method
        client.api.v4.upload_script_file(content, trace_id, filepath)
    
    _file_buffer.clear()

3. API Integration

Extend V4 API client in agentops/client/api/versions/v4.py:

def upload_script_file(self, body: Union[str, bytes], trace_id: int, filename: str) -> UploadedObjectResponse:
    """Upload a script file to the API."""
    # Similar to upload_logfile but with filename metadata
    headers = {
        **self.prepare_headers(), 
        "Trace-Id": str(trace_id),
        "Filename": filename
    }
    response = self.post("/v4/scripts/upload/", body, headers)
    # Handle response similar to upload_logfile

4. Integration Points

Modify InternalSpanProcessor in agentops/sdk/processors.py:

def on_end(self, span: ReadableSpan) -> None:
    if self._root_span_id and (span.context.span_id is self._root_span_id):
        try:
            upload_logfile(span.context.trace_id)
            upload_script_files(span.context.trace_id)  # Add this line
        except Exception as e:
            logger.error(f"Error uploading files: {e}")

Implementation Plan

Phase 1: Core File Collection (Week 1)

[ ] Implement basic file collection mechanism using stack trace analysis
[ ] Add file filtering logic to exclude system/library files
[ ] Create file buffer storage similar to log buffer
[ ] Add configuration option to enable/disable file collection

Phase 2: Upload Integration (Week 1-2)

[ ] Extend V4 API client with upload_script_file method
[ ] Integrate file upload into InternalSpanProcessor
[ ] Add error handling and logging for file upload failures
[ ] Write unit tests for file collection and upload

Phase 3: Dashboard Integration (Week 2-3)

[ ] Backend API endpoints to receive and store script files
[ ] Database schema updates to associate files with traces
[ ] Frontend components to display uploaded files in trace viewer
[ ] File viewer/syntax highlighting in dashboard

Phase 4: Production Readiness (Week 3-4)

[ ] Performance testing and optimization
[ ] File size limits and validation
[ ] Security review for file handling
[ ] Documentation and examples

Testing Strategy

Unit Tests

File collection logic with various project structures
Upload functionality with mock API responses
Error handling for file access issues
Configuration option behavior

Integration Tests

End-to-end file collection and upload
Multiple file scenarios
Large file handling
Network failure scenarios

Security Considerations

Validate file paths to prevent directory traversal
Implement file size limits to prevent abuse
Sanitize file content before storage
Respect user privacy settings and opt-out mechanisms

Future Enhancements

Support for other file types (JSON, YAML, etc.)
Selective file collection based on patterns
File diff tracking across sessions
Integration with version control systems

Dependencies

No new external dependencies required
Leverages existing OpenTelemetry and upload infrastructure
May need backend API changes for new endpoints

References

Existing log upload: agentops/logging/instrument_logging.py
V4 API client: agentops/client/api/versions/v4.py
Span processor: agentops/sdk/processors.py
Backend storage: AgentOps.Next/api/agentops/api/storage.py

Acceptance Criteria

[ ] Python script files are automatically collected during execution
[ ] Files are uploaded to AgentOps backend when trace completes
[ ] Files are visible in the dashboard trace viewer
[ ] Feature can be enabled/disabled via configuration
[ ] No significant performance impact on script execution
[ ] Comprehensive test coverage
[ ] Documentation updated with new feature

Jul 23 '25 15:07 devin-ai-integration[bot]