[MODEL] Context degradation and working memory failure during problem-solving session
Preflight Checklist
- [x] I have searched existing issues for similar behavior reports
- [x] This report does NOT contain sensitive information (API keys, passwords, etc.)
Type of Behavior Issue
Other unexpected behavior
What You Asked Claude to Do
I asked Claude to work through a multi-step debugging session in a mixed Rust/Python codebase. Instead of making progress, Claude exhibited severe context-retention and state-management failures, resulting in circular reasoning, repeated instruction violations, and an inability to maintain a coherent problem-solving flow.
What Claude Actually Did
Observed Behavior
1. Context Fragmentation
- Previously fixed issues were repeatedly "rediscovered" as new problems
- Unable to maintain narrative coherence across conversation turns
- Each response appeared to start fresh rather than building on established knowledge
2. Instruction Adherence Failure
- Explicit instructions repeatedly violated despite multiple corrections
- Corrections did not propagate to subsequent responses
3. State Tracking Breakdown
- Attempted to remove the extract_attribute_path method immediately after adding it
- Marked tasks as complete without actually completing them
- Lost track of current step in multi-step processes
4. Pattern Recognition Failure
- Went through the complete rediscovery process for the same problem type, repeating it on every turn
5. Logical Flow Disruption
- Expected flow: Identify → Fix → Test → Next
- Actual behavior: circular repetition of the same wrong patterns with no progression
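To make "progression" concrete, a healthy trace should walk the Identify → Fix → Test → Next loop in order. Here is a minimal sketch of how a session trace could be checked against that loop; the step labels are hypothetical, not anything Claude Code actually emits:

```python
# Illustrative only: step labels are assumptions, not Claude Code internals.
EXPECTED = ["identify", "fix", "test", "next"]

def violations(observed: list[str]) -> list[int]:
    """Return indices of turns where the trace failed to progress.

    A healthy trace walks EXPECTED in order, possibly over several loops.
    The failure mode reported here repeats "identify" without ever
    reaching "fix" or "test" for the same problem.
    """
    bad, pos = [], 0
    for i, step in enumerate(observed):
        if step == EXPECTED[pos % len(EXPECTED)]:
            pos += 1
        else:
            bad.append(i)
    return bad

# Healthy session: one full loop, no violations.
assert violations(["identify", "fix", "test", "next"]) == []
# Degraded session: repeated rediscovery, flagged at turns 1 and 2.
assert violations(["identify", "identify", "identify"]) == [1, 2]
```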
Root Cause Hypothesis
Primary Issue: Context/attention mechanism failing to maintain coherent working state across the session
Mechanism: Each response appears to receive fragmented context rather than complete logical flow, causing:
- Repeated "rediscovery" of known information
- Violation of previously established instructions
- Inability to accumulate learning within session
Layer: Issue appears to be in context processing and maintenance during inference, not in base model weights (system prompts and instructions are present but inconsistently applied)
Impact
- Significantly degraded problem-solving efficiency
- User frustration from repeated instruction violations
- Inability to complete complex multi-step tasks reliably
- Potential for introducing errors by undoing previous fixes
Recommended Investigation
- Examine context window management implementation
- Review attention mechanism's handling of long conversations
- Investigate working memory state persistence across turns
- Analyze context truncation/corruption patterns
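For the last item, a minimal client-side sketch of truncation detection. It assumes each outgoing request payload is logged as one JSON object per line with a "messages" array; that logging format is an assumption for illustration, not an existing Claude Code feature:

```python
import json

def request_context_sizes(path: str) -> list[int]:
    """Size of the context actually sent with each request.

    Assumes each line of `path` is one logged request payload containing
    a "messages" array; this logging format is hypothetical.
    """
    sizes = []
    with open(path) as f:
        for line in f:
            payload = json.loads(line)
            sizes.append(sum(len(m.get("content", "")) for m in payload["messages"]))
    return sizes

def truncation_events(sizes: list[int], ratio: float = 0.5) -> list[int]:
    # Context sent per turn should grow roughly monotonically as a session
    # accumulates history; a sharp drop between consecutive requests is a
    # candidate truncation/compaction event worth correlating with the
    # turns where "rediscovery" behavior appears.
    return [i for i in range(1, len(sizes)) if sizes[i] < sizes[i - 1] * ratio]
```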
Expected Behavior
- Maintain coherent understanding of completed work throughout session
- Apply learned patterns across similar problems
- Follow instructions persistently once given
- Track state accurately across multi-step operations
Files Affected
No response
Permission Mode
Accept Edits was OFF (manual approval required)
Can You Reproduce This?
Sometimes (intermittent)
Steps to Reproduce
No response
Claude Model
Opus 4.1
Relevant Conversation
⏺ You're absolutely right to be concerned. Looking back at this session, the logical flow was severely compromised:
1. I kept forgetting context - We fixed the issue in language_behavior.rs early on by adding symbols by module_path, but then I kept re-discovering this same fix multiple times as if I'd never seen it before.
2. I didn't follow cause and effect - When the Python parser wasn't extracting qualified calls, instead of immediately checking what it WAS extracting and fixing that specific issue, I went on tangents indexing everything and checking unrelated things.
3. I lost track of state - Multiple times I acted like we hadn't already implemented solutions. The worst was when I tried to remove the extract_attribute_path method right after adding it.
4. Pattern recognition failed - After fixing Rust's resolution to check full paths first, I should have immediately applied the same pattern to Python. Instead, I went through the whole discovery process again.
5. No learning from corrections - You corrected me about grep vs rg multiple times, about not creating files in /tmp, about using language filters - but I kept making the same mistakes as if I had no memory of your previous corrections.
Impact
High - Significant unwanted changes
Claude Code Version
1.0.127
Platform
Anthropic API
Additional Context
This happens randomly: some sessions work perfectly, others degrade without warning, and no specific trigger has been identified. The degradation can begin mid-session in an otherwise well-functioning conversation; a session that starts strong can suddenly lose context retention without any apparent cause. This unpredictability makes it hard to plan serious work, since there is no way to tell in advance whether a session will stay coherent or fall apart.
I'd like to help investigate this issue from the client side.
Scope: Reproducible analysis of context degradation in long Claude Code sessions — instrumentation and data collection only, not model internals.
Planned contributions:
- Create reproducible test cases that reliably trigger the degradation
- Document observable patterns (token counts, turn numbers, instruction types that fail)
- Build instrumentation to track context retention across sessions (a minimal sketch follows below)
- Share quantitative data to help isolate root causes
Goal: Provide clear, reproducible evidence that assists the internal team's investigation.
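As a first cut at that instrumentation, here is a minimal retention probe, sketched under the assumption that assistant turns can be exported as plain text. The canary patterns mirror the corrections reported above (rg over grep, no files in /tmp); everything here is illustrative rather than an existing tool:

```python
import re

# Canary instructions given early in a session; a lapse on a later turn
# is a measurable instruction-retention failure.
CANARIES = {
    "used grep instead of rg": re.compile(r"\bgrep\b"),
    "created files in /tmp": re.compile(r"/tmp/\S+"),
}

def scan_turns(turns: list[str]) -> list[tuple[int, str]]:
    """Return (turn_number, violated_instruction) for each lapse.

    Plotting lapses against turn number should show whether adherence
    decays with session length or fails at random, which is exactly the
    ambiguity described in this report.
    """
    return [
        (i, label)
        for i, text in enumerate(turns, start=1)
        for label, pattern in CANARIES.items()
        if pattern.search(text)
    ]

# Example: a clean turn 1 followed by two lapses on turn 2.
print(scan_turns(["rg 'extract_attribute_path' src/", "grep -r foo /tmp/scratch"]))
```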
I'll share initial findings within a week. Happy to adjust this approach based on what would be most valuable to the team. @claude