claude-code Custom slash commands behave unpredictably compared to just pasting in a formatted prompt

Bug Description

Custom slash commands with arguments behave unreliably compared to simply pasting in formatted prompts.

Environment Info

Platform: macos
Terminal: iTerm.app
Version: 0.2.107
Feedback ID: 69025579-bd4d-429c-ad67-4e1f9a59d6ec

With a simple context priming slash command like this:

Simply explore the codebase to understand everything we need to know to answer the <user_query>. Don't actually write any code or plan changes, just find and read all the relevant files and explain why they are important.

Consider relevant classes and functions, dependencies and usages, and any other context that might be useful for the user to know.

<user_query>
$ARGUMENTS
</user_query>

And then invoking it like: /user:context_prime "We need to refactor our module structure to make it less coupled"
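For anyone trying to reproduce this, here is a minimal sketch of the setup. It assumes that the /user: prefix maps to Markdown files under ~/.claude/commands/ in this version; the COMMANDS_DIR variable is only there for illustration and is not something Claude Code reads.

```shell
#!/usr/bin/env bash
# Sketch: install the prompt above as a user-level slash command.
# Assumption: /user:<name> is resolved from ~/.claude/commands/<name>.md.
COMMANDS_DIR="${COMMANDS_DIR:-$HOME/.claude/commands}"
mkdir -p "$COMMANDS_DIR"

# The quoted heredoc delimiter ('EOF') stops the shell from expanding
# $ARGUMENTS, so the placeholder reaches the file unchanged.
cat > "$COMMANDS_DIR/context_prime.md" <<'EOF'
Simply explore the codebase to understand everything we need to know to answer the <user_query>. Don't actually write any code or plan changes, just find and read all the relevant files and explain why they are important.

Consider relevant classes and functions, dependencies and usages, and any other context that might be useful for the user to know.

<user_query>
$ARGUMENTS
</user_query>
EOF
```

With the file in place, the command is invoked inside Claude Code exactly as in the example line above.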

Claude ignores the instructions in the command and just starts trying to implement the request. It's possible to get it to listen with enough prompt engineering, e.g. providing multiple examples, but even with something like this added it still fails:

<important_rules>
**IMPORTANT: DO NOT ATTEMPT TO PLAN THE CHANGES, WRITE ANY CODE, OR ANSWER THE QUERY YOURSELF, YOU SHOULD ONLY FIND AND READ RELEVANT FILES FOR CONTEXT**
</important_rules>

This is in contrast to simply formatting the prompt with the request and pasting it in, which is adhered to every time even with minimal effort put into the prompt.

This is not a one-off example; I've found this with multiple different slash commands. They are very unreliable compared to a pasted prompt.
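To compare the two paths on equal terms, the "pasted prompt" can be approximated by filling in the placeholder by hand and sending the result as an ordinary prompt. The following is only a sketch: render_prompt is a hypothetical helper, not part of Claude Code, and the template path is an assumption about where the command file lives.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Claude Code): substitute the query
# for every literal "$ARGUMENTS" in a command template file.
render_prompt() {
  local template_file="$1" query="$2" template rendered
  template="$(cat "$template_file")"
  # \$ keeps the pattern literal; quoting "$query" keeps characters such
  # as & in the query from being treated specially by newer bash versions.
  rendered=${template//\$ARGUMENTS/"$query"}
  printf '%s\n' "$rendered"
}

# Usage (commented out because it needs the claude CLI and an API key):
# claude --print "$(render_prompt "$HOME/.claude/commands/context_prime.md" \
#   "We need to refactor our module structure to make it less coupled")"
```

If the rendered prompt is followed reliably while the slash command is not, that points at how the command is expanded or framed rather than at the prompt text itself.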

Errors

[{"error":"Error: Command failed: security find-generic-password -a $USER -w -s \"Claude Code\"\nsecurity: SecKeychainSearchCopyNext: The specified item could not be found in the keychain.\n\n    at genericNodeError (node:internal/errors:983:15)\n    at wrappedFn (node:internal/errors:537:14)\n    at checkExecSyncError (node:child_process:882:11)\n    at execSync (node:child_process:954:15)\n    at PW (file:///Users/dewiaddisedited/.nvm/versions/node/v23.10.0/lib/node_modules/@anthropic-ai/claude-code/cli.js:634:3394)\n    at file:///Users/dewiaddisedited/.nvm/versions/node/v23.10.0/lib/node_modules/@anthropic-ai/claude-code/cli.js:564:13915\n    at D (file:///Users/dewiaddisedited/.nvm/versions/node/v23.10.0/lib/node_modules/@anthropic-ai/claude-code/cli.js:503:12907)\n    at jW (file:///Users/dewiaddisedited/.nvm/versions/node/v23.10.0/lib/node_modules/@anthropic-ai/claude-code/cli.js:564:13436)\n    at XA6 (file:///Users/dewiaddisedited/.nvm/versions/node/v23.10.0/lib/node_modules/@anthropic-ai/claude-code/cli.js:2049:21067)\n    at file:///Users/dewiaddisedited/.nvm/versions/node/v23.10.0/lib/node_modules/@anthropic-ai/claude-code/cli.js:2192:639","timestamp":"2025-05-11T09:03:34.426Z"},{"error":"Error: Request was aborted.\n    at X3.makeRequest (file:///Users/dewiaddisedited/.nvm/versions/node/v23.10.0/lib/node_modules/@anthropic-ai/claude-code/cli.js:696:7727)\n    at processTicksAndRejections (node:internal/process/task_queues:105:5)\n    at runNextTicks (node:internal/process/task_queues:69:3)\n    at process.processTimers (node:internal/timers:543:9)","timestamp":"2025-05-11T09:03:41.319Z"}]

May 11 '25 09:05 djaddis

I have the same issue, albeit whilst running in --print mode. I suspect it isn't that it's not adhering to the prompt; it is just that the whole slash command isn't working, and the LLM is just running based on the verbatim command, not its contents!

See this log (running in GH Actions):

  claude -v
  ls -al .claude/commands/
  # Claude can only output json when using the non-interactive --print
  # mode. This jq helps to format it a bit more legible. From
  # https://github.com/anthropics/claude-code/issues/733#issuecomment-2790100083
  claude '/project:summarise_deps_updates $PR_NUMBER' \
    --print --verbose --output-format stream-json \
    --allowedTools 'Fetch(https://github.com/*),Bash(gh:*),Write,Edit' | tee | jq -r '
    if .message.content != null and (.message.content | type) == "array" then
      # Extract text content
      (.message.content[] | select(.type == "text") | "TEXT: " + .text),
      # Extract tool use
      (.message.content[] | select(.type == "tool_use") | "TOOL: " + .name + " - " + (.input|tostring))
    else
      "NON-ARRAY-CONTENT: " + (.|tostring)
    end'
  shell: /usr/bin/bash -e {0}
  env:
    PR_NUMBER: 9507
    ANTHROPIC_API_KEY: ***
    GH_TOKEN: ***
0.2.122 (Claude Code)
total 16
drwxr-xr-x 2 runner docker 4096 May 19 18:43 .
drwxr-xr-x 3 runner docker 4096 May 19 18:43 ..
-rw-r--r-- 1 runner docker 1224 May 19 18:43 create_pr.md
-rw-r--r-- 1 runner docker 1677 May 19 18:43 summarise_deps_updates.md
NON-ARRAY-CONTENT: {"type":"system","subtype":"init","session_id":"9c696350-7c5d-409a-9dde-76d8cd871b06","tools":["Task","Bash","Glob","Grep","LS","Read","Edit","MultiEdit","Write","NotebookRead","NotebookEdit","WebFetch","Batch","TodoRead","TodoWrite","WebSearch"],"mcp_servers":[]}
TEXT: I'll help you create a Python script to summarize dependencies updates for a PR. I'll need to use various tools to gather information about the PR and analyze the dependency changes.
TOOL: Glob - {"pattern":"pyproject.toml"}
TEXT: Let me search for all pyproject.toml files in the project:
TOOL: Glob - {"pattern":"**/pyproject.toml"}
TEXT: Let me create a Python script to automate the process of summarizing dependency updates for a PR. I'll write it to a file:
...

Sometimes it reads the command and pretty much perfectly does what it needs to (which is to output a certain markdown format, and not to write a script).

I'd say I see this fail in roughly 50% of runs at the moment.
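One detail in the log above is worth ruling out: the slash command is wrapped in single quotes, and bash does not expand variables inside single quotes, so the CLI would receive the literal text $PR_NUMBER rather than 9507. Whether this contributes to the behaviour described here is not established. The sketch below only demonstrates the quoting difference, with printf standing in for the claude invocation.

```shell
#!/usr/bin/env bash
PR_NUMBER=9507

# Single quotes: the shell passes the text through untouched.
single='/project:summarise_deps_updates $PR_NUMBER'

# Double quotes: the shell expands the variable before the program sees it.
double="/project:summarise_deps_updates $PR_NUMBER"

printf '%s\n' "$single"   # prints: /project:summarise_deps_updates $PR_NUMBER
printf '%s\n' "$double"   # prints: /project:summarise_deps_updates 9507
```

Switching the argument to double quotes would at least ensure the PR number reaches the command file's placeholder.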

May 19 '25 19:05 tino

Regarding my previous comment:

...it is just that the whole slash command isn't working, and the LLM is just running based on the verbatim command, not its contents

Hmm, after a couple more runs I'm not sure this is actually the case, because it does pick up on things written in the command file. It just completely ignores most of the instructions, including the "!! DO not do anything else. Don't write scripts, don't write implementation plans. Stick to the steps above."...

May 20 '25 06:05 tino