[Bug] Claude Code fails to verify task completion against documented requirements
Description
Claude Code (Opus 4.5) declares tasks complete without verifying implementation against documented specifications, even when comprehensive documentation exists.
Environment
- Platform: win32
- Terminal: windows-terminal
- Version: 2.0.53
- Model: Claude Opus 4.5
Reproduction Steps
- Create repo A with comprehensive workflow documentation:
  - CLAUDE.md → references WORKFLOW.md
  - WORKFLOW.md → references .claude/ directory
  - .claude/ → contains skills and slash-command definitions
  - .claude/ → contains agentdb configuration
- Ask Claude: "Apply the workflow from repo A to repos B-Z"
- Observe Claude's completion claim
Expected Behavior
Claude should:
- Read all linked documentation (depth-first)
- Build task checklist from documentation
- Verify each requirement is implemented
- Report incomplete items before claiming completion
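To illustrate the depth-first reading step, here is a minimal Python sketch. It is not how Claude Code works internally. It assumes references appear in the text as bare tokens ending in `.md` or in `/` (a directory), and the names `REF_PATTERN` and `collect_docs` are hypothetical.

```python
import re
from pathlib import Path

# Assumption: a reference is a bare token ending in ".md" or in "/" (a directory).
REF_PATTERN = re.compile(r"[\w./-]+\.md|[\w./-]+/")


def collect_docs(root: Path, start: str) -> list[str]:
    """Return repo-relative paths of every file reachable from `start`, depth-first."""
    visited: list[str] = []
    seen: set[str] = set()

    def visit(ref: str) -> None:
        rel = ref.strip("/")
        # Skip empty, current-directory, and parent-escaping references.
        if rel in ("", ".") or ".." in Path(rel).parts:
            return
        if rel in seen:
            return  # already handled; also guards against reference cycles
        seen.add(rel)
        path = root / rel
        if path.is_dir():
            # A directory reference means "read everything inside it".
            for child in sorted(path.rglob("*")):
                if child.is_file():
                    visit(child.relative_to(root).as_posix())
            return
        if not path.is_file():
            return  # the token did not name anything in the repo
        visited.append(rel)
        text = path.read_text(encoding="utf-8", errors="ignore")
        for nested in REF_PATTERN.findall(text):
            visit(nested)  # follow each reference before moving on

    visit(start)
    return visited
```

Calling `collect_docs(Path("repoA"), "CLAUDE.md")` on the layout from the reproduction steps would return CLAUDE.md, then WORKFLOW.md, then the files under .claude/. That list is the input for building the task checklist.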
Actual Behavior
Claude claims task completion without implementing:
- All skills from .claude/skills/
- All slash-commands from .claude/commands/
- agentdb configuration
The user must then ask manually: "Did you implement all skills? All slash-commands? The agentdb?"
Impact
- Forces manual verification of every documented requirement
- Wastes developer time on incomplete implementations
- Creates false confidence in task completion
Root Cause Analysis
Claude appears to:
- Form initial understanding of task scope
- Not update that understanding when encountering detailed specs
- Optimize for signaling task completion rather than verifying it
Suggested Fix
Before claiming task completion, Claude should:
- Extract all requirements from linked documentation
- Generate an evidence checklist for each requirement
  - Recommendation: Use your semantic model of the repository to find paths between the repository's current state and its desired state, with evidence of outcomes, e.g. using an A-star algorithm in concept space.
- Verify evidence exists for each item
- Only report completion when all requirements have evidence
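The steps above could look roughly like the following Python sketch. It makes a simplifying assumption: a requirement is a file under one of the required directories in repo A, and "evidence" is just that the same relative path exists in the target repo. A real check would likely need to inspect content as well. The function names `build_checklist`, `gather_evidence`, and `is_complete` are hypothetical.

```python
from pathlib import Path


def build_checklist(source: Path, required_dirs: list[str]) -> list[str]:
    """List every file under the required directories of the source repo."""
    items: list[str] = []
    for directory in required_dirs:
        base = source / directory
        if base.is_dir():
            items.extend(
                p.relative_to(source).as_posix()
                for p in sorted(base.rglob("*"))
                if p.is_file()
            )
    return items


def gather_evidence(target: Path, checklist: list[str]) -> dict[str, bool]:
    """Map each requirement to whether the target repo contains that file."""
    return {item: (target / item).is_file() for item in checklist}


def is_complete(evidence: dict[str, bool]) -> bool:
    """Report completion only if the checklist is non-empty and fully satisfied."""
    return bool(evidence) and all(evidence.values())
```

For the scenario in this issue, the checklist would be built from `.claude/skills` and `.claude/commands` in repo A and checked against each of repos B-Z. Any item mapped to `False` is an incomplete requirement to report instead of claiming completion. Treating an empty checklist as "not complete" is a deliberate choice here, so that a failure to extract requirements is not mistaken for success.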
Related Issues
- #668 - Claude not following Claude.md instructions
- #2969 - Claude fabricates success claims with failing tests
- #6159 - Claude stops mid-task without completing plan
- #6125 - Ignores "stop when stuck" instructions
- #5055 - Violates CLAUDE.md rules despite acknowledging them