strix
strix copied to clipboard
feat: Lost-in-Middle Mitigation with File-Backed Critical Findings
Problem
LLMs have degraded attention for content in the middle of long contexts (~20% recall vs ~80% at start/end). In Strix:
- Critical security findings discovered early get buried in summarized middle sections
- Current compression:
[system] + [summaries] + [recent_15]puts summaries in worst attention position - No mechanism to promote important findings to high-attention positions
Current Flow (Problematic)
Iteration 1: Find SQLi vulnerability
Iteration 20: SQLi finding now in compressed middle → degraded attention
Iteration 50: Agent may "rediscover" or miss the finding entirely
Proposed Solution
1. New Context Structure
[SYSTEM PROMPT] ← Good attention (first)
[CRITICAL FINDINGS REF] ← Good attention (near first) ← NEW
[Compressed history] ← Acceptable (middle, low-importance only)
[Recent 15 messages] ← Best attention (end)
2. File-Backed Critical Findings
Full critical findings stored to strix_runs/{run}/critical_findings.json, with compact reference in context:
<critical_findings count="3">
- [vuln-0001] SQL Injection /api/users (Critical)
- [vuln-0002] XSS in search param (High)
- [cred-001] Admin API token found
<hint>Use read_finding(id) for details</hint>
</critical_findings>
3. New Tools
read_finding(id)- Retrieve full details of a findingmark_finding_critical(id, reason)- Promote finding severity (upgrade only, no downgrade)
4. Auto-Population
report_vulnerability tool automatically adds findings to the store.
Implementation
New Files
strix/findings/__init__.py- Package initstrix/findings/store.py- FindingsStore classstrix/tools/findings/findings_actions.py- Toolsstrix/tools/findings/findings_actions_schema.xml- Tool schemas
Modified Files
strix/agents/state.py- Add findings_store fieldstrix/llm/memory_compressor.py- Inject findings block after system promptstrix/llm/llm.py- Pass findings_store to compressorstrix/tools/reporting/reporting_actions.py- Auto-add vulns to store
Benefits
| Before | After |
|---|---|
| Critical findings buried in middle | Always in high-attention position |
| Full details bloat context | Summaries in context, details on-demand |
| No importance tracking | Severity-based prioritization |
| Findings may be "forgotten" | Persistent across all iterations |
Related
- #145: File-backed tool results (similar pattern)