[FEATURE] Skills Context Overflow Problem - Allow us to hide all skills except for router skills
Preflight Checklist
- [x] I have searched existing requests and this feature hasn't been requested yet
- [x] This is a single feature request (not multiple features)
Problem Statement
Skills Context Overflow Problem
Current Situation:
- I've authored 100+ specialized skills organized behind ~10 router skills
- Router skills are designed to dynamically load the appropriate specialized skill on-demand
- Each skill's name + description consumes ~30-50 tokens in Claude's global context
The Problem: Without a way to hide or prioritize skills, all 100+ skills load their metadata into context automatically, causing:
- Router crowding: The 10 router skills that Claude should prioritize are buried among 100+ specialized skills, making skill discovery unreliable
- Wasted context budget: ~3,000-5,000 tokens consumed by metadata for skills that should only load on-demand when routed to
- Scalability ceiling: Can only load 1-2 skill packs before hitting context limits, defeating the purpose of the router pattern
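The overhead figures above follow directly from the per-skill estimate; a back-of-envelope sketch (illustrative only, using this issue's ~30-50 tokens-per-skill numbers):

```python
# Metadata cost of exposing all skills vs. only the 10 routers,
# using the rough per-skill token estimates from this issue.
TOKENS_PER_SKILL_LOW, TOKENS_PER_SKILL_HIGH = 30, 50
ALL_SKILLS, ROUTER_SKILLS = 100, 10

all_low = ALL_SKILLS * TOKENS_PER_SKILL_LOW       # 3,000 tokens
all_high = ALL_SKILLS * TOKENS_PER_SKILL_HIGH     # 5,000 tokens
routers_high = ROUTER_SKILLS * TOKENS_PER_SKILL_HIGH  # at most ~500 tokens

print(f"all skills: {all_low}-{all_high} tokens; routers only: <= {routers_high}")
```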
Proposed Solution
What's Needed: Two complementary mechanisms:
- Hidden Skills (context efficiency)
Mark specialized skills as "hidden" so they don't consume context until explicitly invoked:
```yaml
name: python-engineering:async-debugging
description: Deep async/await debugging patterns
hidden: true  # Don't load metadata into global context
```
- Router Priority (skill discovery)
Mark router skills with higher priority so Claude checks them first:
```yaml
name: python-engineering
description: Routes to specialized Python skills based on problem type
router: true  # Prioritize during skill selection
```
Combined Effect:
- Claude sees only 10 router skills in context (~300-500 tokens)
- Routers are checked first during skill discovery
- Specialized skills load on-demand when router invokes them
- Pattern scales to hundreds of skills without context bloat
Example Hierarchy:
```
✓ python-engineering (router: true, visible, ~40 tokens)
  ├── python-engineering:async-debugging (hidden: true, 0 tokens until invoked)
  ├── python-engineering:testing-patterns (hidden: true, 0 tokens until invoked)
  └── python-engineering:performance-profiling (hidden: true, 0 tokens until invoked)

✓ pytorch-engineering (router: true, visible, ~40 tokens)
  ├── pytorch-engineering:tensor-debugging (hidden: true, 0 tokens until invoked)
  └── pytorch-engineering:distributed-training (hidden: true, 0 tokens until invoked)
```
Benefits:
- Supports large-scale skill libraries (100+ skills)
- Efficient context usage (only routers in global context)
- Reliable skill discovery (routers prioritized)
- Enables proper separation of concerns (routing vs. specialized knowledge)
Alternative Solutions
The current manual workaround is enabling and disabling plugins as needed; however, this requires restarting Claude Code and is severely disruptive.
Priority
High - Significant impact on productivity
Feature Category
Configuration and settings
Use Case Example
Scenario: A developer working on a PyTorch training loop encounters a CUDA out-of-memory error.
Without hidden/router fields (Current Behavior):
Developer: "My PyTorch training is crashing with CUDA OOM errors"
Claude's context at session start: Available skills (112 total, ~4,000 tokens):
- python-engineering
- python-engineering:async-debugging
- python-engineering:testing-patterns
- python-engineering:performance-profiling
- pytorch-engineering
- pytorch-engineering:tensor-debugging
- pytorch-engineering:distributed-training
- pytorch-engineering:memory-profiling
- pytorch-engineering:mixed-precision
- ... (103 more skills)
Problem: Claude has to scan through 112 skill descriptions to find the right match. The pytorch-engineering router is buried. Claude might randomly pick pytorch-engineering:memory-profiling directly, bypassing the router's triage logic, or miss it entirely and not use any skill.
With hidden/router fields (Proposed Behavior):
Developer: "My PyTorch training is crashing with CUDA OOM errors"
Claude's context at session start: Available skills (10 routers, ~400 tokens):
- python-engineering (router)
- pytorch-engineering (router)
- react-engineering (router)
- testing-workflows (router)
- database-engineering (router)
- ... (5 more routers)
Step-by-step flow:
- Skill Discovery: Claude scans 10 router descriptions, immediately identifies pytorch-engineering router matches keywords: "PyTorch", "training", "CUDA"
- Router Invocation: Claude invokes pytorch-engineering router
- Router Logic: Router reads the error details and routes to the appropriate specialized skill: Symptoms: CUDA OOM, training loop → Loading pytorch-engineering:memory-profiling
- On-Demand Loading: The hidden skill pytorch-engineering:memory-profiling loads its full content (0 tokens → ~2,000 tokens)
- Problem Solving: Specialized skill provides deep diagnostic steps:
  - Check batch size vs. available VRAM
  - Inspect model parameter count
  - Check for memory leaks (detached tensors, cached gradients)
  - Suggest gradient accumulation or mixed-precision training
Benefits in this scenario:
- ✅ Fast skill discovery (10 vs. 112 skills to scan)
- ✅ Correct routing logic applied (router triages the problem)
- ✅ Efficient context usage (only loads relevant specialized skill)
- ✅ Deep expertise when needed (full skill content available after routing)
Another Example: Multi-Domain Projects
Developer working on a full-stack ML application:
Session 1: "I need to optimize my FastAPI endpoints" → Sees python-engineering router → Routes to python-engineering:async-patterns → Context: ~400 (routers) + ~2,000 (one skill) = ~2,400 tokens
Session 2: "My React frontend is re-rendering too much" → Sees react-engineering router → Routes to react-engineering:performance-optimization → Context: ~400 (routers) + ~2,000 (one skill) = ~2,400 tokens
Session 3: "My PyTorch model won't fit on the GPU" → Sees pytorch-engineering router → Routes to pytorch-engineering:memory-profiling → Context: ~400 (routers) + ~2,000 (one skill) = ~2,400 tokens
Without hidden/router fields:
- Each session starts with ~4,000 tokens of skill metadata
- Random skill selection bypasses router logic
- Can't install all three skill packs (context overflow)
With hidden/router fields:
- Each session starts with ~400 tokens of router metadata
- Reliable routing to specialized skills
- All three skill packs installed and working efficiently
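The per-session totals quoted in these examples follow from simple arithmetic; a sketch using this thread's illustrative token estimates (not measured values):

```python
# Per-session context cost with and without the proposed hidden/router fields.
ROUTER_METADATA = 400    # ~10 router skills visible at session start
ALL_METADATA = 4_000     # ~112 skills' metadata without hiding
SKILL_BODY = 2_000       # one specialized skill loaded on demand

with_routing = ROUTER_METADATA + SKILL_BODY      # proposed behavior
without_routing = ALL_METADATA + SKILL_BODY      # current behavior
print(with_routing, without_routing, without_routing - with_routing)
```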
Additional Context
No response
Suggestion: put only the "router skills" in skills, and the rest in a docs/ folder??
Empirical Research Supporting This Issue
I conducted independent research that validates and quantifies the skill budget problem described here. Closing my duplicate issues (#13099, #13100) to consolidate discussion.
Key Findings
| Metric | Value |
|---|---|
| Empirical budget | ~15,500-16,000 characters |
| Per-skill overhead | ~109 characters (XML tags, name, location) |
| With 63 skills | Only 42 visible (33% hidden) |
| With 92 skills | Only 36 visible (60% hidden, per #13044) |
The Math
Total per skill = description_length + 109 chars overhead
Budget fills at ~15,700 characters total
| Description Length | Skills That Fit |
|---|---|
| 263 chars (typical) | ~42 skills |
| 150 chars | ~60 skills |
| 130 chars | ~67 skills |
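The table's figures can be reproduced from the stated formula; a sketch using the empirical ~15,700-character budget and ~109-character per-skill overhead reported in this comment (counts are approximate and shift slightly depending on where in the ~15,500-16,000 range the budget lands):

```python
# How many skills fit in the available_skills budget, per the formula above:
# total_per_skill = description_length + 109, budget ~15,700 characters.
BUDGET_CHARS = 15_700
OVERHEAD_CHARS = 109  # XML tags, name, location per skill

def skills_that_fit(description_length: int) -> int:
    return BUDGET_CHARS // (description_length + OVERHEAD_CHARS)

for desc_len in (263, 150, 130):
    print(desc_len, skills_that_fit(desc_len))
```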
Critical Clarification: Skills ≠ Tools
There's been some confusion linking this issue to #12836 (Tool Search Tool / defer_loading). This is incorrect.
TOOLS (API-level) → defer_loading CAN help
SKILLS (available_skills) → defer_loading DOES NOT apply
The Tool Search Tool beta solves context bloat for MCP tools, but it will NOT help with skill visibility. Skills require the solution proposed here (hidden: true, router: true).
Additional Suggestions
Beyond the excellent hidden/router proposal, consider:
- Warnings/visibility: Show which skills are hidden, warn when installing skills that exceed budget
- Documentation: Until this is fixed, document the ~16K limit so users can work around it
- Workaround for now: Users can compress descriptions to ≤130 chars to fit more skills
Full Research
Detailed investigation with methodology, evidence, and reproduction steps: https://gist.github.com/alexey-pelykh/faa3c304f731d6a962efc5fa2a43abe1
This independently verifies findings from #12782 (within 3% of their calculations).
+1 for hidden: true and router: true - this would properly solve the scalability problem for skill libraries.