
feat: Unified Event ID Schema

Open yokoszn opened this issue 1 month ago • 0 comments

Problem

The codebase uses 6+ incompatible ID formats across different subsystems, making it impossible to:

  • Track events across their lifecycle
  • Reference events consistently between components
  • Build relationships between related events
  • Implement unified logging/tracing

Current State (Fragmented)

| Component | ID Format | Example | Location |
| --- | --- | --- | --- |
| Agent ID | agent_{uuid8} | agent_a1b2c3d4 | state.py:8-9 |
| Tracer message_id | Sequential int | 1, 2, 3 | tracer.py:51 |
| Inter-agent message | msg_{uuid8} | msg_e5f6a7b8 | agents_graph_actions.py:301 |
| Vulnerability report | vuln-{seq4} | vuln-0001 | tracer.py:80 |
| Note ID | {uuid5} | a1b2c | notes_actions.py:69 |
| Execution ID | Sequential int | 1, 2, 3 | tracer.py:52 |
| User message | user_msg_{uuid8} | user_msg_f1e2d3c4 | agents_graph_actions.py:540 |
| Report ID | report_{uuid8} | report_b5a6c7d8 | agents_graph_actions.py:427 |
| Run ID | run-{uuid8} | run-a9b8c7d6 | tracer.py:29 |
| TUI event ID | chat_{int} / tool_{int} | chat_5, tool_12 | tui.py:780-795 |

Problems with Current Approach

  1. No cross-reference: Can't link a tool execution to its resulting message
  2. Collision risk: Sequential integers restart on every run, and the UUID fragments vary in length (5 vs. 8 hex chars)
  3. No sorting: Mixed formats don't sort chronologically
  4. No type inference: Can't determine event type from ID alone
  5. Inconsistent generation: Some IDs use uuid4().hex[:8], others uuid4().hex[:5], and others sequential counters
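
To put a rough number on the collision risk of the shorter UUID fragments, here is a small sketch using the standard birthday-bound approximation. The function name and the figure of 1,000 IDs are illustrative assumptions, not measurements from the codebase.

```python
import math


def collision_probability(n_ids: int, hex_chars: int) -> float:
    """Approximate chance that at least two of n_ids random IDs collide.

    Uses the birthday-bound approximation 1 - exp(-n(n-1) / (2N)),
    where N = 16 ** hex_chars is the number of possible IDs.
    """
    space = 16 ** hex_chars
    return 1.0 - math.exp(-n_ids * (n_ids - 1) / (2 * space))


# Compare the 5-char note IDs with the 8-char IDs used elsewhere,
# for an assumed volume of 1,000 IDs.
for chars in (5, 8):
    print(chars, collision_probability(1000, chars))
```

Under that assumption, the 5-char fragment is several orders of magnitude more collision-prone than the 8-char one, which is one reason to standardize the random component.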

Proposed Solution: Unified Event ID Schema

Format

{type}_{run}_{seq:05d}_{rand:04x}

Example: msg_abc123_00042_f7e8

| Component | Description | Purpose |
| --- | --- | --- |
| {type} | Event type prefix (3-5 chars) | Type inference, filtering |
| {run} | Run ID hash (6 chars) | Scope identification |
| {seq:05d} | Sequence number (5 digits) | Chronological ordering |
| {rand:04x} | Random suffix (4 hex) | Collision prevention |
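
A parser for this format could look roughly like the sketch below. The names `EVENT_ID_RE`, `ParsedEventId`, and `parse_event_id` are hypothetical and not part of the proposal. The pattern also assumes the run component is lowercase alphanumeric and that the sequence may grow past 5 digits, since `05d` only sets a minimum width.

```python
import re
from typing import NamedTuple

# Assumed pattern: 3-5 letter type prefix, 6-char run component,
# zero-padded sequence (5 or more digits), 4 hex chars.
EVENT_ID_RE = re.compile(
    r"^(?P<type>[a-z]{3,5})_(?P<run>[0-9a-z]{6})_(?P<seq>\d{5,})_(?P<rand>[0-9a-f]{4})$"
)


class ParsedEventId(NamedTuple):
    type: str
    run: str
    seq: int
    rand: str


def parse_event_id(event_id: str) -> ParsedEventId:
    """Split a unified event ID into its four components."""
    match = EVENT_ID_RE.match(event_id)
    if match is None:
        raise ValueError(f"not a unified event ID: {event_id!r}")
    return ParsedEventId(match["type"], match["run"], int(match["seq"]), match["rand"])


print(parse_event_id("msg_abc123_00042_f7e8"))
```

Legacy IDs such as `vuln-0001` do not match the pattern and raise `ValueError`, which may be useful for telling old and new IDs apart during migration.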

Type Prefixes

| Prefix | Event Type | Replaces |
| --- | --- | --- |
| agent | Agent lifecycle | agent_{uuid8} |
| msg | Chat messages | Sequential int, msg_{uuid8} |
| tool | Tool executions | Sequential int |
| vuln | Vulnerability reports | vuln-{seq4} |
| note | Agent notes | {uuid5} |
| cred | Credential findings | (new) |
| run | Run/session | run-{uuid8} |

Benefits

| Property | Current | Proposed |
| --- | --- | --- |
| Type inference | ❌ Inconsistent | ✅ From prefix |
| Chronological sort | ❌ Mixed formats | ✅ Sequence number |
| Cross-run uniqueness | ⚠️ Partial | ✅ Run hash included |
| Collision resistance | ⚠️ Variable | ✅ Random suffix |
| Human readable | ⚠️ Some | ✅ All |
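
One caveat on chronological sorting: because the type prefix comes first, a plain string sort groups IDs by type before sequence. Ordering events within a run therefore means sorting on the sequence field. A minimal sketch, with made-up IDs and a hypothetical `seq_of` helper:

```python
ids = [
    "tool_abc123_00002_9c1d",
    "msg_abc123_00003_44aa",
    "msg_abc123_00001_f7e8",
]


def seq_of(event_id: str) -> int:
    # Fields are type, run, seq, rand; none of the proposed prefixes contain "_".
    return int(event_id.split("_")[2])


by_string = sorted(ids)           # grouped by type prefix first
by_seq = sorted(ids, key=seq_of)  # ordered by sequence number within the run

print(by_string)
print(by_seq)
```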

Implementation

New Module: strix/events/

# strix/events/schema.py
from enum import Enum


class EventType(Enum):
    AGENT = "agent"
    MESSAGE = "msg"
    TOOL = "tool"
    VULNERABILITY = "vuln"
    NOTE = "note"
    CREDENTIAL = "cred"
    RUN = "run"

# strix/events/registry.py
from uuid import uuid4

from strix.events.schema import EventType


class EventRegistry:
    def __init__(self, run_id: str):
        # Run IDs currently look like "run-{uuid8}"; drop the prefix so the
        # 6-char run component does not contain "run-".
        self.run_hash = run_id.removeprefix("run-")[:6]
        self._seq = 0

    def create_id(self, event_type: EventType) -> str:
        # Sequence numbers start at 1 and are zero-padded to at least 5 digits.
        self._seq += 1
        rand = uuid4().hex[:4]
        return f"{event_type.value}_{self.run_hash}_{self._seq:05d}_{rand}"
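
If agents end up calling the registry from multiple threads (an assumption, I have not verified how the callers are scheduled), the bare counter increment could hand out duplicate sequence numbers. A hedged sketch of a lock-guarded variant follows. `LockedEventRegistry` is a hypothetical name, and the enum is trimmed to keep the example self-contained.

```python
import threading
from enum import Enum
from uuid import uuid4


class EventType(Enum):
    MESSAGE = "msg"
    TOOL = "tool"


class LockedEventRegistry:
    """Hypothetical variant of EventRegistry that guards the counter with a lock."""

    def __init__(self, run_hash: str) -> None:
        self.run_hash = run_hash
        self._seq = 0
        self._lock = threading.Lock()

    def create_id(self, event_type: EventType) -> str:
        # Only the increment needs the lock; formatting happens outside it.
        with self._lock:
            self._seq += 1
            seq = self._seq
        return f"{event_type.value}_{self.run_hash}_{seq:05d}_{uuid4().hex[:4]}"


registry = LockedEventRegistry("abc123")
ids: list[str] = []


def worker() -> None:
    for _ in range(100):
        ids.append(registry.create_id(EventType.TOOL))


threads = [threading.Thread(target=worker) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

# Every sequence number from 1 to 400 should appear exactly once.
seqs = sorted(int(event_id.split("_")[2]) for event_id in ids)
print(seqs == list(range(1, 401)))
```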

Migration Strategy

  1. Phase 1: Add EventRegistry, new code uses unified IDs
  2. Phase 2: Add event_id field alongside old IDs (dual support)
  3. Phase 3: Migrate existing ID usages to unified format
  4. Phase 4: Deprecate old ID fields
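
For Phase 2, dual support might look something like the sketch below. `NoteRecord` and `find_note` are hypothetical and do not reflect the actual note model in notes_actions.py. The idea is only that lookups accept either ID while both fields exist.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NoteRecord:
    """Hypothetical record shape during Phase 2; not the actual Strix model."""

    note_id: str                    # legacy {uuid5} ID, kept until Phase 4
    content: str
    event_id: Optional[str] = None  # unified ID, filled in once available


def find_note(notes: list[NoteRecord], ref: str) -> Optional[NoteRecord]:
    """Look up a note by either its legacy ID or its unified event ID."""
    for note in notes:
        if ref in (note.note_id, note.event_id):
            return note
    return None


notes = [NoteRecord("a1b2c", "open redirect on /login", "note_abc123_00007_0f3e")]
print(find_note(notes, "a1b2c") is find_note(notes, "note_abc123_00007_0f3e"))
```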

Files to Modify

| File | Changes |
| --- | --- |
| strix/events/__init__.py | NEW - Package |
| strix/events/schema.py | NEW - EventType enum |
| strix/events/registry.py | NEW - EventRegistry class |
| strix/agents/state.py | Add EventRegistry instance |
| strix/telemetry/tracer.py | Use unified IDs |
| strix/tools/agents_graph/agents_graph_actions.py | Use unified IDs |
| strix/tools/notes/notes_actions.py | Use unified IDs |
| strix/tools/reporting/reporting_actions.py | Use unified IDs |

Related Issues

  • #144: --verbose/--debug CLI flags (logging infrastructure)
  • #145: File-backed tool results (references events by ID)
  • #146: Lost-in-middle mitigation (findings store uses event IDs)

Future Extensions

Once unified IDs are in place:

  • Event relationships: Parent/child, references between events
  • Importance tracking: Attach metadata to events by ID
  • Event replay: Reconstruct session from event log
  • Cross-agent tracing: Track events across agent hierarchy

yokoszn, Nov 27 '25 01:11