
[FEATURE] Skills Context Overflow Problem - Allow us to hide all skills except for router skills

Open tachyon-beep opened this issue 2 months ago • 3 comments

Preflight Checklist

  • [x] I have searched existing requests and this feature hasn't been requested yet
  • [x] This is a single feature request (not multiple features)

Problem Statement

Skills Context Overflow Problem

Current Situation:

  • I've authored 100+ specialized skills organized behind ~10 router skills
  • Router skills are designed to dynamically load the appropriate specialized skill on-demand
  • Each skill's name + description consumes ~30-50 tokens in Claude's global context

The Problem: Without a way to hide or prioritize skills, all 100+ skills load their metadata into context automatically, causing:

  1. Router crowding: The 10 router skills that Claude should prioritize are buried among 100+ specialized skills, making skill discovery unreliable
  2. Wasted context budget: ~3,000-5,000 tokens consumed by metadata for skills that should only load on-demand when routed to
  3. Scalability ceiling: Can only load 1-2 skill packs before hitting context limits, defeating the purpose of the router pattern

Proposed Solution

What's Needed: Two complementary mechanisms:

  1. Hidden Skills (context efficiency)

Mark specialized skills as "hidden" so they don't consume context until explicitly invoked:


```yaml
name: python-engineering:async-debugging
description: Deep async/await debugging patterns
hidden: true  # Don't load metadata into global context
```

  2. Router Priority (skill discovery)

Mark router skills with higher priority so Claude checks them first:


```yaml
name: python-engineering
description: Routes to specialized Python skills based on problem type
router: true  # Prioritize during skill selection
```

Combined Effect:

  • Claude sees only 10 router skills in context (~300-500 tokens)
  • Routers are checked first during skill discovery
  • Specialized skills load on-demand when router invokes them
  • Pattern scales to hundreds of skills without context bloat

Example Hierarchy:

```
✓ python-engineering (router: true, visible, ~40 tokens)
├── python-engineering:async-debugging (hidden: true, 0 tokens until invoked)
├── python-engineering:testing-patterns (hidden: true, 0 tokens until invoked)
└── python-engineering:performance-profiling (hidden: true, 0 tokens until invoked)

✓ pytorch-engineering (router: true, visible, ~40 tokens)
├── pytorch-engineering:tensor-debugging (hidden: true, 0 tokens until invoked)
└── pytorch-engineering:distributed-training (hidden: true, 0 tokens until invoked)
```

Benefits:

  • Supports large-scale skill libraries (100+ skills)
  • Efficient context usage (only routers in global context)
  • Reliable skill discovery (routers prioritized)
  • Enables proper separation of concerns (routing vs. specialized knowledge)

Alternative Solutions

Right now, the manual workaround is enabling and disabling plugins as required; however, this requires reloading Claude Code and is severely disruptive.

Priority

High - Significant impact on productivity

Feature Category

Configuration and settings

Use Case Example

Scenario: A developer working on a PyTorch training loop encounters a CUDA out-of-memory error.

Without hidden/router fields (Current Behavior):

Developer: "My PyTorch training is crashing with CUDA OOM errors"

Claude's context at session start: Available skills (112 total, ~4,000 tokens):

  • python-engineering
  • python-engineering:async-debugging
  • python-engineering:testing-patterns
  • python-engineering:performance-profiling
  • pytorch-engineering
  • pytorch-engineering:tensor-debugging
  • pytorch-engineering:distributed-training
  • pytorch-engineering:memory-profiling
  • pytorch-engineering:mixed-precision
    ... (103 more skills)

Problem: Claude has to scan through 112 skill descriptions to find the right match. The pytorch-engineering router is buried. Claude might randomly pick pytorch-engineering:memory-profiling directly, bypassing the router's triage logic, or miss it entirely and not use any skill.


With hidden/router fields (Proposed Behavior):

Developer: "My PyTorch training is crashing with CUDA OOM errors"

Claude's context at session start: Available skills (10 routers, ~400 tokens):

  • python-engineering (router)
  • pytorch-engineering (router)
  • react-engineering (router)
  • testing-workflows (router)
  • database-engineering (router)
    ... (5 more routers)

Step-by-step flow:

  1. Skill Discovery: Claude scans the 10 router descriptions and immediately identifies that the pytorch-engineering router matches the keywords "PyTorch", "training", and "CUDA"
  2. Router Invocation: Claude invokes the pytorch-engineering router
  3. Router Logic: The router reads the error details and routes to the appropriate specialized skill: symptoms "CUDA OOM, training loop" → load pytorch-engineering:memory-profiling
  4. On-Demand Loading: The hidden skill pytorch-engineering:memory-profiling loads its full content (0 tokens → ~2,000 tokens)
  5. Problem Solving: The specialized skill provides deep diagnostic steps:
     • Check batch size vs. available VRAM
     • Inspect model parameter count
     • Check for memory leaks (detached tensors, cached gradients)
     • Suggest gradient accumulation or mixed-precision training
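
To make the flow above concrete, here is a minimal Python sketch of the two-stage pattern this proposal describes. It is not Claude Code's actual skill loader; the router keywords, directory layout, and helper names are all hypothetical. The point is that only router metadata needs to sit in global context, while specialized skills stay on disk until a router decides they are relevant.

```python
# Minimal sketch of the proposed two-stage routing pattern -- NOT Claude Code's
# actual skill loader. Router keywords, paths, and helper names are hypothetical.
from pathlib import Path

# Stage 0: only these router entries would occupy global context (~400 tokens).
ROUTERS = {
    "pytorch-engineering": {
        "keywords": {"pytorch", "cuda", "training", "oom", "tensor"},
        "skills": ["memory-profiling", "tensor-debugging", "distributed-training"],
    },
    "python-engineering": {
        "keywords": {"async", "await", "pytest", "profiling"},
        "skills": ["async-debugging", "testing-patterns", "performance-profiling"],
    },
}

def pick_router(query: str) -> str | None:
    """Stage 1 (skill discovery): scan only the ~10 router entries."""
    words = set(query.lower().split())
    scores = {name: len(words & cfg["keywords"]) for name, cfg in ROUTERS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

def load_specialized_skill(router: str, skill: str,
                           skills_dir: str = ".claude/skills") -> str:
    """Stages 3-4 (routing + on-demand load): read the hidden skill's full content."""
    return Path(skills_dir, f"{router}:{skill}", "SKILL.md").read_text(encoding="utf-8")

router = pick_router("My PyTorch training is crashing with CUDA OOM errors")
# -> "pytorch-engineering"; its triage logic would then pick "memory-profiling",
# and only that skill's ~2,000 tokens enter context.
```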

Benefits in this scenario:

  • ✅ Fast skill discovery (10 vs. 112 skills to scan)
  • ✅ Correct routing logic applied (router triages the problem)
  • ✅ Efficient context usage (only loads relevant specialized skill)
  • ✅ Deep expertise when needed (full skill content available after routing)

Another Example: Multi-Domain Projects

Developer working on a full-stack ML application:

Session 1: "I need to optimize my FastAPI endpoints" → Sees python-engineering router → Routes to python-engineering:async-patterns → Context: ~400 (routers) + ~2,000 (one skill) = ~2,400 tokens

Session 2: "My React frontend is re-rendering too much" → Sees react-engineering router → Routes to react-engineering:performance-optimization → Context: ~400 (routers) + ~2,000 (one skill) = ~2,400 tokens

Session 3: "My PyTorch model won't fit on the GPU" → Sees pytorch-engineering router → Routes to pytorch-engineering:memory-profiling → Context: ~400 (routers) + ~2,000 (one skill) = ~2,400 tokens

Without hidden/router fields:

  • Each session starts with ~4,000 tokens of skill metadata
  • Random skill selection bypasses router logic
  • Can't install all three skill packs (context overflow)

With hidden/router fields:

  • Each session starts with ~400 tokens of router metadata
  • Reliable routing to specialized skills
  • All three skill packs installed and working efficiently

Additional Context

No response

tachyon-beep · Nov 05 '25 08:11

Suggestion: put only the "router skills" in skills, and the rest in a docs/ folder??

tino · Nov 06 '25 15:11

Empirical Research Supporting This Issue

I conducted independent research that validates and quantifies the skill budget problem described here. Closing my duplicate issues (#13099, #13100) to consolidate discussion.

Key Findings

| Metric | Value |
| --- | --- |
| Empirical budget | ~15,500-16,000 characters |
| Per-skill overhead | ~109 characters (XML tags, name, location) |
| With 63 skills | Only 42 visible (33% hidden) |
| With 92 skills | Only 36 visible (60% hidden, per #13044) |

The Math

Total per skill = description_length + 109 chars overhead
Budget fills at ~15,700 characters total

| Description Length | Skills That Fit |
| --- | --- |
| 263 chars (typical) | ~42 skills |
| 150 chars | ~60 skills |
| 130 chars | ~67 skills |
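
As a sanity check, the capacity numbers above follow from simple division. A small Python sketch, using the ~15,700-character budget and ~109-character per-skill overhead (both empirical measurements, not documented limits):

```python
# Rough capacity estimate from the empirical figures above.
BUDGET_CHARS = 15_700        # measured fill point, not a documented limit
OVERHEAD_PER_SKILL = 109     # XML tags, name, location per skill entry

def skills_that_fit(description_length: int) -> int:
    """Approximate number of skill entries that fit in the metadata budget."""
    return BUDGET_CHARS // (description_length + OVERHEAD_PER_SKILL)

for desc_len in (263, 150, 130):
    print(f"{desc_len:>3} chars -> ~{skills_that_fit(desc_len)} skills")
# 263 chars -> ~42 skills; 150 chars -> ~60 skills; 130 chars -> ~65 skills
# (the ~67 in the table comes from the upper end of the budget range:
#  16_000 / 239 ≈ 67)
```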

Critical Clarification: Skills ≠ Tools

There's been some confusion linking this issue to #12836 (Tool Search Tool / defer_loading). This is incorrect.

TOOLS (API-level)           → defer_loading CAN help
SKILLS (available_skills)   → defer_loading DOES NOT apply

The Tool Search Tool beta solves context bloat for MCP tools, but it will NOT help with skill visibility. Skills require the solution proposed here (hidden: true, router: true).

Additional Suggestions

Beyond the excellent hidden/router proposal, consider:

  1. Warnings/visibility: Show which skills are hidden, warn when installing skills that exceed budget
  2. Documentation: Until this is fixed, document the ~16K limit so users can work around it
  3. Workaround for now: Users can compress descriptions to ≤130 chars to fit more skills
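
For anyone applying workaround 3 today, here is a rough budget-check script. It assumes skills live as `<skills-dir>/<skill-name>/SKILL.md` with a single-line `description:` field in the frontmatter; adjust the path and parsing for your own layout, and treat the budget constants as the empirical estimates above rather than guaranteed limits.

```python
#!/usr/bin/env python3
"""Rough skill-metadata budget check (workaround helper, not an official tool).

Assumes skills live as <skills_dir>/<skill>/SKILL.md with a single-line
`description:` field in their frontmatter; adjust for your own layout.
"""
from pathlib import Path
import re
import sys

BUDGET_CHARS = 15_700      # empirical budget from the findings above
OVERHEAD_PER_SKILL = 109   # empirical per-skill wrapper overhead

def description_of(skill_md: Path) -> str:
    """Naive frontmatter parse: grab the first `description:` line."""
    match = re.search(r"^description:\s*(.+)$",
                      skill_md.read_text(encoding="utf-8"), re.MULTILINE)
    return match.group(1).strip() if match else ""

def main(skills_dir: str) -> None:
    total = 0
    for skill_md in sorted(Path(skills_dir).glob("*/SKILL.md")):
        desc = description_of(skill_md)
        cost = len(desc) + OVERHEAD_PER_SKILL
        total += cost
        note = "  <- consider compressing to <=130 chars" if len(desc) > 130 else ""
        print(f"{skill_md.parent.name}: {cost} chars{note}")
    status = ("OVER budget - some skills will likely be hidden"
              if total > BUDGET_CHARS else "within budget")
    print(f"\nTotal: {total} / ~{BUDGET_CHARS} chars ({status})")

if __name__ == "__main__":
    main(sys.argv[1] if len(sys.argv) > 1 else ".claude/skills")
```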

Full Research

Detailed investigation with methodology, evidence, and reproduction steps: https://gist.github.com/alexey-pelykh/faa3c304f731d6a962efc5fa2a43abe1

This independently verifies findings from #12782 (within 3% of their calculations).


+1 for hidden: true and router: true - this would properly solve the scalability problem for skill libraries.

alexey-pelykh · Dec 05 '25 08:12

This issue has been inactive for 30 days. If the issue is still occurring, please comment to let us know. Otherwise, this issue will be automatically closed in 30 days for housekeeping purposes.

github-actions[bot] · Jan 05 '26 10:01