
[Feature Request]: Better Localization Support via Subtask or Delegation

Open xingyaoww opened this issue 1 year ago • 13 comments

@ryx2 pointed out that we could add better support for localization.

Given a problem description, we can allow the main agent to dispatch a localization subtask. The primary goal of the localization subtask is to get a list of relevant files based on the user-provided input, and supply these to the main agent as an additional observation.

We can have different ways to implement this localization:

  • Another agent with ONLY read-only actions (e.g., a file_editor that can only view/navigate across files)
  • A simpler RAG-based solution

Things to consider:

  • Should we make it an additional action for the agent that delegates to another LocalizationAgent when triggered? The input to the localization agent would have to be filled out by the CodeAct agent. The downside is that we might lose information/context this way.
  • Or we could simply run a classification on EVERY user message to decide whether to dispatch a localization task. If so, we dispatch it and send the result to the agent together with the user message.
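
To make the second option concrete, here is a rough sketch of the per-message dispatch flow. Every name in it (Classifier, Localizer, AgentInput, prepareAgentInput) is hypothetical and not taken from the OpenHands codebase, and the stub classifier and localizer stand in for whatever model or agent would actually be used.

```typescript
// Hypothetical types; none of these names come from the OpenHands codebase.
type Classifier = (userMessage: string) => boolean;
type Localizer = (userMessage: string) => string[];

interface AgentInput {
  userMessage: string;
  relevantFiles: string[]; // extra observation handed to the main agent
}

// Decide per message whether to run localization, then attach the result.
function prepareAgentInput(
  userMessage: string,
  needsLocalization: Classifier,
  localize: Localizer
): AgentInput {
  const relevantFiles = needsLocalization(userMessage) ? localize(userMessage) : [];
  return { userMessage, relevantFiles };
}

// Stub classifier and localizer, for illustration only.
const stubClassifier: Classifier = (msg) => /bug|error|fix/i.test(msg);
const stubLocalizer: Localizer = () => ['src/example.ts'];

const withFiles = prepareAgentInput('Fix the login bug', stubClassifier, stubLocalizer);
const withoutFiles = prepareAgentInput('Hello there', stubClassifier, stubLocalizer);
console.log(withFiles.relevantFiles, withoutFiles.relevantFiles);
```

Because the original user message is passed through untouched, this variant avoids the context loss mentioned for the delegation option, at the cost of one classification call per message.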

If you find this feature request or enhancement useful, make sure to add a 👍 to the issue

xingyaoww avatar Dec 06 '24 19:12 xingyaoww

Do we need anything more complex than file system + regex for reading imports?
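
As a rough illustration of the regex idea, here is a minimal sketch that pulls module specifiers out of TypeScript/JavaScript source text. The function name extractImports is made up for this example, and a regex like this will miss cases such as dynamic imports or imports hidden in comments.

```typescript
// Collect module specifiers from `import ... from '...'`, bare `import '...'`,
// and `require('...')`. A regex cannot cover every case (comments, dynamic imports).
function extractImports(source: string): string[] {
  const pattern =
    /import\s+(?:[^'"]*?\s+from\s+)?['"]([^'"]+)['"]|require\(\s*['"]([^'"]+)['"]\s*\)/g;
  const found: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    // Group 1 is set for import statements, group 2 for require calls.
    const specifier = match[1] || match[2];
    if (specifier) {
      found.push(specifier);
    }
  }
  return found;
}

const sample = [
  "import * as path from 'path';",
  "import { readFileSync } from 'fs';",
  "import './global-tracer';",
  "const hooks = require('async_hooks');",
].join('\n');

const imports = extractImports(sample);
console.log(imports); // → ['path', 'fs', './global-tracer', 'async_hooks']
```

Resolving relative specifiers against the file system and repeating the scan on each resolved file would give a crude import graph to walk from the files named in the problem description.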

ryx2 avatar Dec 06 '24 20:12 ryx2

Actually, if I look closely, I can tell that cursor.sh is using RAG, because it shows its RAG search query in light grey text (see screenshots). So maybe there isn't a SOTA system that can actually follow spaghetti code.

[Screenshots ×3, taken 2024-12-06 at about 3:47 PM]

ryx2 avatar Dec 06 '24 23:12 ryx2

@xingyaoww if the spaghetti-code parser idea is more interesting to you, maybe something like this (the decorator approach I was discussing on our call; I just put it in my repo):

TypeScript Method Tracing Utility

A lightweight, decorator-based tracing utility that automatically logs method entries, exits, and errors in your TypeScript classes. Perfect for debugging and monitoring method execution flow.

Features

  • 🔍 Automatic method tracing with zero code changes
  • 📝 Detailed logs including arguments, return values, and timing
  • ⚡ Async method support
  • 🎯 Method exclusion support
  • 📍 Automatic file path tracking
  • ⏱️ Performance timing for each method call

Usage

Simply add the @traceClass() decorator to any class you want to trace:

@traceClass()
class UserService {
  async createUser(name: string, email: string) {
    // ... implementation
  }

  async updateUser(id: number, data: UserData) {
    // ... implementation
  }
}

All methods will automatically generate trace logs like:

{
  "event": "function_call",
  "file": "lib/services/UserService.ts",
  "function": "createUser",
  "class": "UserService",
  "timestamp": "2024-01-17T01:42:51.123Z",
  "args": ["John Doe", "[email protected]"]
}

{
  "event": "function_return",
  "file": "lib/services/UserService.ts",
  "function": "createUser",
  "class": "UserService",
  "timestamp": "2024-01-17T01:42:51.456Z",
  "returnValue": { "id": 123, "name": "John Doe" },
  "duration": 333
}

Excluding Methods

You can exclude specific methods from tracing:

@traceClass({ excludeMethods: ["privateHelper", "internalMethod"] })
class UserService {
  async createUser(name: string) { ... }     // traced
  private privateHelper() { ... }            // not traced
  protected internalMethod() { ... }         // not traced
}

Error Tracking

Errors are automatically caught and logged:

{
  "event": "function_return",
  "file": "lib/services/UserService.ts",
  "function": "createUser",
  "class": "UserService",
  "timestamp": "2024-01-17T01:42:51.456Z",
  "error": {
    "name": "Error",
    "message": "User already exists",
    "stack": "Error: User already exists\n    at UserService.createUser ..."
  },
  "duration": 45
}

Benefits

  1. Debugging: Easily track method execution flow and identify issues
  2. Performance Monitoring: Built-in timing for each method call
  3. Error Tracking: Automatic error logging with stack traces
  4. No Boilerplate: No code changes needed in your methods
  5. Type Safety: Fully typed with TypeScript

Implementation Details

  • Uses TypeScript decorators for clean, declarative tracing
  • Automatically captures file paths relative to project root
  • Preserves method context and async functionality
  • Singleton pattern for the tracer to ensure consistent logging
  • Handles both successful returns and errors
  • Supports method exclusion for fine-grained control
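
For readers who want to see the mechanics, here is a minimal sketch of how a decorator like this could work. It is not the implementation from the repo: TraceOptions, TraceEvent, and traceLog are illustrative names, events are collected in an array rather than written to a log, and the file and timestamp fields are left out for brevity.

```typescript
interface TraceOptions {
  excludeMethods?: string[];
}

type TraceEvent = Record<string, unknown>;

// Collected events; a real tracer would write these to a log sink instead.
const traceLog: TraceEvent[] = [];

function traceClass(options: TraceOptions = {}) {
  const excluded = options.excludeMethods ?? [];
  return function <T extends new (...args: any[]) => object>(ctor: T): void {
    const proto = ctor.prototype;
    for (const name of Object.getOwnPropertyNames(proto)) {
      const original = proto[name];
      if (name === 'constructor' || typeof original !== 'function' || excluded.indexOf(name) !== -1) {
        continue;
      }
      // Replace the method with a wrapper that logs the call and its outcome.
      proto[name] = function (this: unknown, ...args: unknown[]) {
        const base = { function: name, class: ctor.name };
        const start = Date.now();
        traceLog.push({ event: 'function_call', ...base, args });
        const finish = (extra: TraceEvent) =>
          traceLog.push({ event: 'function_return', ...base, ...extra, duration: Date.now() - start });
        try {
          const result = original.apply(this, args);
          if (result && typeof result.then === 'function') {
            // Async method: log once the promise settles.
            return result.then(
              (value: unknown) => { finish({ returnValue: value }); return value; },
              (error: unknown) => { finish({ error }); throw error; }
            );
          }
          finish({ returnValue: result });
          return result;
        } catch (error) {
          finish({ error });
          throw error;
        }
      };
    }
  };
}

class UserService {
  createUser(name: string) {
    return { id: 123, name };
  }
  privateHelper() {
    return 'untraced';
  }
}
// Applied as a plain function call here; with decorators enabled in the compiler,
// writing @traceClass({ ... }) above the class should have the same effect.
traceClass({ excludeMethods: ['privateHelper'] })(UserService);

const service = new UserService();
service.createUser('John Doe');
service.privateHelper();
console.log(traceLog);
```

Wrapping methods on the prototype is what lets the class body stay unchanged, but it also means every traced call pays for the extra logging work.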

But alas, peppering this @traceClass decorator throughout the code may be tedious... I'll see if this improves performance on my own repo and report back here.

ryx2 avatar Dec 07 '24 01:12 ryx2

Are y'all familiar with moatless? https://github.com/aorwall/moatless-tree-search?tab=readme-ov-file

ryx2 avatar Dec 11 '24 10:12 ryx2

Here are the outputs of my stack tracer implementation. It seems to slightly improve picking the right file, but it also more than doubles the context.

[Image: stack tracer outputs]

ryx2 avatar Dec 19 '24 23:12 ryx2

Personally, I think we should have "code search" be a tool that we (optionally) provide to the agent. The input is a query, and the output is a set of code snippets that match the query. This would keep the tool implementation-agnostic: it could be an embedding-based retrieval model, or it could be an agent. We could then evaluate and iterate on that component independently.
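
A minimal sketch of what that tool boundary might look like, assuming a synchronous interface for simplicity. The names CodeSearch, CodeSnippet, and KeywordCodeSearch are hypothetical, and the substring matcher is only a stand-in for an embedding-based retriever or an agent behind the same interface.

```typescript
interface CodeSnippet {
  file: string;
  line: number; // 1-based line number
  text: string;
}

// The tool contract: a query in, matching snippets out.
interface CodeSearch {
  search(query: string): CodeSnippet[];
}

interface SourceFile {
  path: string;
  content: string;
}

// One possible backend: naive case-insensitive substring matching over in-memory files.
class KeywordCodeSearch implements CodeSearch {
  private readonly files: SourceFile[];

  constructor(files: SourceFile[]) {
    this.files = files;
  }

  search(query: string): CodeSnippet[] {
    const needle = query.toLowerCase();
    const results: CodeSnippet[] = [];
    for (const file of this.files) {
      const lines = file.content.split('\n');
      for (let i = 0; i < lines.length; i++) {
        const text = lines[i] ?? '';
        if (text.toLowerCase().indexOf(needle) !== -1) {
          results.push({ file: file.path, line: i + 1, text: text.trim() });
        }
      }
    }
    return results;
  }
}

const searcher: CodeSearch = new KeywordCodeSearch([
  { path: 'lib/services/UserService.ts', content: 'class UserService {\n  async createUser(name: string) {}\n}' },
  { path: 'lib/util.ts', content: 'export const noop = () => {};' },
]);
const hits = searcher.search('createUser');
console.log(hits);
```

Since the agent only sees the CodeSearch interface, swapping in a different backend would not change the tool's input or output, which is what makes the component easy to evaluate on its own.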

neubig avatar Dec 20 '24 15:12 neubig

cc @ryanhoangt -- having the ability to debug/trace through program execution would also be an awesome thing to have for openhands ACI 🤔

xingyaoww avatar Dec 20 '24 19:12 xingyaoww

I rewrote this as a global tracer file (in TypeScript), as opposed to just using decorators in my repo. I like this more than the previous version because it reports the line number in the file being executed. A slight issue is that this would need to be done per supported language (although I suspect JS/TypeScript + Python would cover a significant majority). Also, this makes files opt-out of debug prints, as opposed to opt-in. Executions of installed library code probably shouldn't appear in the debug output, for example. I think just giving the agent this stack tracer as a tool, and then having it flag which files to print output for, would go a long way, given that my original stack tracer already helped the agent a bit.

tl;dr see the "outputs will look like" section at the bottom.

// global-tracer.ts
import * as path from 'path';
import { readFileSync } from 'fs';
import { createHook } from 'async_hooks';
import { performance } from 'perf_hooks';

class GlobalTracer {
  private static instance: GlobalTracer;
  private indentLevels: Map<number, number> = new Map();
  private startTimes: Map<number, number> = new Map();
  
  private constructor() {
    // Initialize async hooks to track async context
    createHook({
      init: (asyncId, type, triggerAsyncId) => {
        const currentIndent = this.indentLevels.get(triggerAsyncId) || 0;
        this.indentLevels.set(asyncId, currentIndent);
      },
      destroy: (asyncId) => {
        this.indentLevels.delete(asyncId);
        this.startTimes.delete(asyncId);
      },
    }).enable();
  }

  static getInstance(): GlobalTracer {
    if (!GlobalTracer.instance) {
      GlobalTracer.instance = new GlobalTracer();
    }
    return GlobalTracer.instance;
  }

  // These helpers are public (not private) because the
  // Function.prototype.apply override below calls them from outside the class.
  getIndentation(asyncId: number): string {
    const level = this.indentLevels.get(asyncId) || 0;
    return '  '.repeat(level);
  }

  incrementIndent(asyncId: number): void {
    const currentLevel = this.indentLevels.get(asyncId) || 0;
    this.indentLevels.set(asyncId, currentLevel + 1);
  }

  decrementIndent(asyncId: number): void {
    const currentLevel = this.indentLevels.get(asyncId) || 0;
    this.indentLevels.set(asyncId, Math.max(0, currentLevel - 1));
  }
}

// Set up the global tracer
require('v8-compile-cache'); // For faster startup
const tracer = GlobalTracer.getInstance();

// Get source map support for accurate stack traces
require('source-map-support').install();

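// Override Function.prototype.apply to trace calls made through .apply().
// Note: direct calls such as helper(x) do not go through Function.prototype.apply,
// so this override alone does not intercept them.
const originalApply = Function.prototype.apply;
Function.prototype.apply = function (this: Function, thisArg: any, args?: any) {
  const asyncId = require('async_hooks').executionAsyncId();
  // The stack may be missing or shorter than expected, so fall back to empty strings.
  const stack = new Error().stack ?? '';
  const caller = stack.split('\n')[2] ?? '';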
  
  // Parse the stack trace for file and function info
  const stackMatch = caller.match(/at (.+) \((.+):(\d+):(\d+)\)/);
  if (stackMatch) {
    const [_, functionName, filePath, line, col] = stackMatch;
    const fileName = path.basename(filePath);
    
    // Skip dependency code and the tracer itself. Check the full path here,
    // because the basename alone never contains 'node_modules'.
    if (!filePath.includes('node_modules') && !filePath.includes('global-tracer')) {
      console.log(
        `${tracer.getIndentation(asyncId)}→ [${fileName}:${line}] ${functionName}`
      );
      if (args && args.length > 0) {
        console.log(
          `${tracer.getIndentation(asyncId)}  Arguments:`,
          args.map(arg => 
            typeof arg === 'function' ? '[Function]' : 
            arg && arg.toString().length > 100 ? `${arg.toString().slice(0, 100)}...` : 
            arg
          )
        );
      }
      
      tracer.incrementIndent(asyncId);
      const startTime = performance.now();
      
      try {
        const result = originalApply.call(this, thisArg, args);
        const duration = (performance.now() - startTime).toFixed(2);
        
        tracer.decrementIndent(asyncId);
        console.log(
          `${tracer.getIndentation(asyncId)}← [${fileName}:${line}] ${functionName} (${duration}ms)`,
          result !== undefined ? `returned: ${result}` : ''
        );
        
        return result;
      } catch (error) {
        tracer.decrementIndent(asyncId);
        console.error(
          `${tracer.getIndentation(asyncId)}× [${fileName}:${line}] ${functionName} threw:`,
          error
        );
        throw error;
      }
    }
  }
  
  return originalApply.call(this, thisArg, args);
};

// Example usage:
// Import this file at the start of your application:
// import './global-tracer';

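// Calls dispatched through Function.prototype.apply are now traced automatically.
// Direct calls such as example(5) bypass that override, so the output shown at the
// bottom is the intended format rather than what this snippet prints as written.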
function example(x: number) {
  return helper(x) * 2;
}

function helper(x: number) {
  return x + 1;
}

example(5);

/* Output will look like:
→ [app.ts:45] example
  Arguments: [5]
  → [app.ts:49] helper
    Arguments: [5]
  ← [app.ts:49] helper (0.05ms) returned: 6
← [app.ts:45] example (0.15ms) returned: 12
*/
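
The opt-out filtering and agent-flagged files described above could be sketched roughly like this. TraceFilter and shouldTrace are hypothetical names, not part of the tracer above, and the path rules are assumptions about what "installed library code" would look like on disk.

```typescript
interface TraceFilter {
  excludedPathParts: string[]; // path fragments that are never traced (opt-out)
  flaggedFiles: string[];      // files the agent asked to see; empty means all project files
}

function shouldTrace(filePath: string, filter: TraceFilter): boolean {
  // Normalize Windows separators so the same rules apply on every platform.
  const normalized = filePath.replace(/\\/g, '/');
  for (const part of filter.excludedPathParts) {
    if (normalized.indexOf(part) !== -1) {
      return false;
    }
  }
  if (filter.flaggedFiles.length === 0) {
    return true;
  }
  // Trace only files whose path ends with one of the flagged suffixes.
  return filter.flaggedFiles.some(
    (flagged) => normalized.slice(-flagged.length) === flagged
  );
}

const agentFilter: TraceFilter = {
  excludedPathParts: ['node_modules/', 'global-tracer'],
  flaggedFiles: ['lib/services/UserService.ts'],
};

console.log(shouldTrace('/repo/lib/services/UserService.ts', agentFilter));
console.log(shouldTrace('/repo/node_modules/express/index.js', agentFilter));
console.log(shouldTrace('/repo/lib/util.ts', agentFilter));
```

Letting the agent narrow flaggedFiles is one way to keep the trace output from inflating the context, since only the files it asked about would produce debug prints.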

ryx2 avatar Dec 20 '24 20:12 ryx2

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] avatar Feb 13 '25 01:02 github-actions[bot]

This issue was closed because it has been stalled for over 30 days with no activity.

github-actions[bot] avatar Feb 20 '25 01:02 github-actions[bot]

not stale

xingyaoww avatar Feb 24 '25 17:02 xingyaoww

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] avatar May 15 '25 02:05 github-actions[bot]

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] avatar Jun 15 '25 02:06 github-actions[bot]

This issue was closed because it has been stalled for over 30 days with no activity.

github-actions[bot] avatar Jun 23 '25 02:06 github-actions[bot]