superpowers icon indicating copy to clipboard operation
superpowers copied to clipboard

Gemini CLI Extension: Research Findings and Implementation Status

Open obra opened this issue 1 month ago • 2 comments

Summary

This issue documents findings from investigating Gemini CLI extension support for superpowers. The work is preserved in the wip-gemini-cli branch.

Goal

Enable superpowers skills to work with Google's Gemini CLI, similar to how they work with Claude Code and OpenCode.

What We Built

Extension Structure

.gemini/
├── gemini-extension.json    # Extension manifest
├── GEMINI.md               # Context file (auto-loaded)
├── INSTALL.md              # Installation documentation
├── .gitignore
└── mcp-server/
    ├── package.json
    ├── tsconfig.json
    └── src/
        └── superpowers.ts  # MCP server exposing skills as tools

MCP Server Architecture

  • Built with @modelcontextprotocol/sdk
  • Registers each skill as its own tool (brainstorming, test-driven-development, etc.)
  • Tools return skill content with [SKILL-LOADED: name at timestamp] marker for verification
  • Searches three locations for skills:
    • Project: ${cwd}/.gemini/skills/
    • Personal: ~/.config/gemini/skills/
    • Superpowers: Built-in skills from this repo

Key Findings

What Works

  1. Extension installation: gemini extensions install /path/to/superpowers/.gemini works
  2. MCP tools register: Individual skill tools appear in Gemini's tool list
  3. Manual tool calls: When explicitly asked to call a specific tool by name and show proof, it works
  4. Verification markers: [SKILL-LOADED:] markers prove actual tool execution vs. fake behavior

What Doesn't Work

GEMINI.md is Advisory, Not Executable

Unlike Claude Code's session hooks which inject mandatory instructions, Gemini CLI treats GEMINI.md as context/background information. We tried:

  • Direct instructions ("YOU MUST call find_skills first")
  • Pressure tactics ("If you don't call this, you have FAILED")
  • XML tags for emphasis
  • "Instead of" framing
  • Proof requirements

Result: All ignored. Gemini prefers responding directly over calling tools.

Auto-Triggering Skills Doesn't Work

We cannot make Gemini automatically invoke skills the way Claude Code does. Attempts included:

  • Generic find_skills + use_skill tools (like OpenCode)
  • Individual skill tools with descriptive names
  • MANDATORY language in tool descriptions
  • Bootstrap instructions in context files

The only reliable approach: User must explicitly ask for a skill by name.

Comparison with skillz

Tested skillz which claims automatic skill invocation. Findings:

  • Uses FastMCP (Python) vs our TypeScript implementation
  • Registers each skill as its own tool (we adopted this approach)
  • Same behavior: Auto-triggering doesn't actually work reliably
  • Their claim of "automatic" appears overstated

Fundamental Architecture Difference

Feature Claude Code Gemini CLI
Instruction injection Session hooks (mandatory) Context files (advisory)
Tool invocation Follows injected instructions User preference driven
Bootstrap capability Yes No

Successful Test

When explicitly demanded:

gemini "Call the 'brainstorming' tool and show me the output which should contain '[SKILL-LOADED:'"

Response included:

[SKILL-LOADED: brainstorming at 2025-11-27T18:23:44.615Z]

This proves the infrastructure works - the problem is triggering it automatically.

Possible Workarounds (Not Implemented)

  1. Custom wrapper script: Prepend skill invocation to every prompt
  2. Slash command analogs: Document that users should say "use brainstorming skill"
  3. Extension-provided prompts: Ship example prompts that invoke skills
  4. Wait for Gemini CLI improvements: May add hook-like functionality later

Files in Branch

  • .gemini/gemini-extension.json - Extension manifest
  • .gemini/GEMINI.md - Context file with bootstrap attempts
  • .gemini/INSTALL.md - User installation instructions
  • .gemini/mcp-server/ - Full MCP server implementation

Next Steps

  1. Wait and monitor: Gemini CLI is young; may add better extension hooks
  2. Document manual usage: If we ship this, document that users must explicitly invoke skills
  3. Consider wrapper approach: A shell wrapper could prepend bootstrap text to every prompt

Related Resources


Status: On ice. Infrastructure works but auto-triggering is blocked by Gemini CLI's architecture.

Branch: wip-gemini-cli

obra avatar Nov 28 '25 21:11 obra