[MODEL] Claude Code (Opus 4.5) guesses instead of verifying, gives confident wrong answers
Preflight Checklist
- [x] I have searched existing issues for similar behavior reports
- [x] This report does NOT contain sensitive information (API keys, passwords, etc.)
Type of Behavior Issue
Other unexpected behavior
What You Asked Claude to Do
During a debugging session for a GitHub Actions workflow failure, Claude Code repeatedly gave confident but incorrect answers instead of first researching the actual documentation and source code.
What Claude Actually Did
- Guessed at RubyGems trusted publisher configuration - Suggested multiple different configurations (changing repo name, workflow filename path) without first checking how trusted publishing actually works with reusable workflows.
- Didn't know reusable workflows don't work with RubyGems trusted publishing - This is documented in an open GitHub issue (rubygems/rubygems.org#4294) from 2023. Claude should have found this immediately instead of having me try multiple configurations.
- Suggested disabling MFA for gem pushes - A serious security anti-pattern. When I pushed back, Claude then claimed you could toggle MFA on/off per API key, which was also wrong for my account's MFA level.
- Said manual release creation would trigger a workflow_run workflow - Completely incorrect. The post-release workflow triggers on workflow_run from a specific workflow completing, not on release creation events.
- Missed the attestations: true default - After switching to API key auth, the workflow failed because attestations only work with trusted publishing. Claude should have caught this when reading the action source code.
- Broke the version grep - The grep pattern matched multiple lines containing "VERSION" instead of just the version assignment.
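To illustrate the last item: the report does not include the actual grep pattern or the version file, so the following is only a minimal sketch under assumed inputs. The sample file contents, the `MyGem` module, and the helper names `naive_matches` and `extract_version` are all hypothetical. It shows how a bare substring search for "VERSION" picks up several lines, while a pattern anchored to the assignment itself picks up exactly one.

```python
import re

# Hypothetical version file; the real file from the report is not shown.
SAMPLE = '''\
# frozen_string_literal: true

module MyGem
  # VERSION is bumped by the release workflow
  VERSION = "1.2.3"
  MINIMUM_RUBY_VERSION = "3.1"
end
'''

# Anchored pattern: optional indentation, the bare constant name VERSION,
# an equals sign, then a quoted string. Assumes the version is a string literal.
VERSION_ASSIGNMENT = re.compile(
    r'^[ \t]*VERSION[ \t]*=[ \t]*["\']([^"\']+)["\']',
    re.MULTILINE,
)


def naive_matches(source: str) -> list[str]:
    """Mimic a bare `grep VERSION`: every line containing the substring."""
    return [line for line in source.splitlines() if "VERSION" in line]


def extract_version(source: str) -> str:
    """Return the version string, failing loudly unless exactly one assignment matches."""
    matches = VERSION_ASSIGNMENT.findall(source)
    if len(matches) != 1:
        raise ValueError(f"expected one VERSION assignment, found {len(matches)}")
    return matches[0]


if __name__ == "__main__":
    print(len(naive_matches(SAMPLE)))  # comment, assignment, and MINIMUM_RUBY_VERSION lines
    print(extract_version(SAMPLE))
```

Raising on anything other than a single match is a deliberate choice in this sketch: a release step that silently takes the first of several matches is harder to debug than one that stops.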
Root cause
Claude repeatedly made guesses and stated them confidently instead of:
- Reading source code first (rubygems/release-gem, rubygems/configure-rubygems-credentials)
- Searching for known issues
- Verifying claims before stating them
When I asked "are you guessing now?" after multiple failed attempts, Claude admitted it was.
Impact
- Multiple hours wasted
- Multiple failed workflow runs
- Had to downgrade RubyGems MFA settings from "UI and API" to "UI and gem signin"
- Release left incomplete (gem published, but GitHub Release and post-release workflow didn't run)
Expected Behavior
Claude should research before answering, especially for integration/configuration issues. If uncertain, say so upfront rather than giving confident wrong answers. The most expensive model should not require users to repeatedly ask "show me proof" to get accurate information.
Files Affected
Permission Mode
I don't know / Not sure
Can You Reproduce This?
Haven't tried to reproduce
Steps to Reproduce
No response
Claude Model
Opus
Relevant Conversation
Impact
Medium - Extra work to undo changes
Claude Code Version
2.0.64 (Claude Code)
Platform
Anthropic API
Additional Context
No response
Found 3 possible duplicate issues:
- https://github.com/anthropics/claude-code/issues/11913
- https://github.com/anthropics/claude-code/issues/8782
- https://github.com/anthropics/claude-code/issues/4011
This issue will be automatically closed as a duplicate in 3 days.
- If your issue is a duplicate, please close it and 👍 the existing issue instead
- To prevent auto-closure, add a comment or 👎 this comment
🤖 Generated with Claude Code
This is kind of normal for LLMs. There could be numerous causes, and it's not likely to be a software bug specifically. They did have a big degradation problem that they published a deep infrastructure report on, and the kind of issue described there is the only way a bug would cause this. Also, you don't know if it's even true when it says it's making things up; that's just what it says. It's probably true, it's just not necessarily true.