Hierarchical Instruction Discovery for GitHub Copilot

Open · zentuit opened this issue 4 months ago · 1 comment

Feature Request: Hierarchical Instruction Discovery for GitHub Copilot

Summary

Add support for hierarchical discovery of GitHub Copilot instruction files by walking up the directory tree from the current file, similar to how tools like ESLint, Prettier, and TypeScript discover configuration files.

Problem Statement

Currently, GitHub Copilot loads all instruction files configured in chat.instructionsFilesLocations for every request, regardless of which file the developer is working on. This creates several issues in monorepos and large projects:

  1. Token Waste: Irrelevant instructions consume API tokens unnecessarily
  2. Performance Impact: Larger prompts slow down response times
  3. Context Pollution: Conflicting instructions from different project areas
  4. Poor Developer Experience: Cannot place instructions close to the code they govern

Current Behavior

Working on: backend/auth/services/user.service.ts
Instructions Loaded: ALL configured instruction files
Location: Centralized in .github/instructions/ only
Tokens Used: ~2000+ tokens for all instructions

Desired Behavior

Working on: backend/auth/services/user.service.ts
Instructions Discovered: 
  - backend/auth/services/*.instructions.md    (most specific)
  - backend/auth/*.instructions.md             (auth module)
  - backend/*.instructions.md                  (backend general)
  - .github/copilot-instructions.md            (project root)
Tokens Used: ~400 tokens for relevant instructions only

Proposed Solution

Hierarchical Discovery Algorithm

When Copilot needs instructions for a file, it walks up the directory tree from that file's directory, collecting *.instructions.md files at each level:

File: backend/auth/services/user.service.ts

Discovery order:
1. backend/auth/services/*.instructions.md
2. backend/auth/*.instructions.md  
3. backend/*.instructions.md
4. *.instructions.md (root)
5. .github/copilot-instructions.md (fallback)
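
The discovery order above can be sketched as a pure function over a workspace-relative path. This is only an illustration: the helper names candidateDirectories and discoveryOrder are hypothetical, forward-slash paths are assumed, and the optional maxDepth parameter is an assumed reading of the proposed maxInstructionDepth setting.

```typescript
// Hypothetical helper: directories to search for a workspace-relative file,
// most specific first. "." stands for the workspace root.
function candidateDirectories(filePath: string, maxDepth: number = Infinity): string[] {
  const segments = filePath.split("/").filter((s) => s.length > 0);
  segments.pop(); // drop the file name, keep only its directories
  const dirs: string[] = [];
  while (segments.length > 0 && dirs.length < maxDepth) {
    dirs.push(segments.join("/"));
    segments.pop(); // move one level up
  }
  if (dirs.length < maxDepth) {
    dirs.push("."); // workspace root
  }
  return dirs;
}

// Hypothetical helper: the search patterns in the order listed above,
// ending with the project-wide fallback file.
function discoveryOrder(filePath: string): string[] {
  const patterns = candidateDirectories(filePath).map((dir) =>
    dir === "." ? "*.instructions.md" : `${dir}/*.instructions.md`
  );
  patterns.push(".github/copilot-instructions.md");
  return patterns;
}

console.log(discoveryOrder("backend/auth/services/user.service.ts"));
```

For the working file in the example, this yields the same five entries as the list above, from backend/auth/services/*.instructions.md down to the .github/copilot-instructions.md fallback.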

Configuration Options

{
  "github.copilot.chat.hierarchicalInstructions": true,
  "github.copilot.chat.instructionFileNames": [
    "*.instructions.md",
    ".copilot-instructions.md", 
    "copilot.md"
  ],
  "github.copilot.chat.maxInstructionDepth": 5,
  "github.copilot.chat.stopAtGitRoot": true
}
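
One way an implementation might read these proposed settings is sketched below. The setting keys come from the block above and do not exist in Copilot today; the interface name, the resolveSettings helper, and the default values are assumptions (the proposal only states that the feature starts as opt-in).

```typescript
// Shape of the proposed settings; names mirror the JSON keys above.
interface HierarchicalInstructionSettings {
  hierarchicalInstructions: boolean;
  instructionFileNames: string[];
  maxInstructionDepth: number;
  stopAtGitRoot: boolean;
}

// Assumed defaults: disabled unless the user opts in.
const DEFAULT_SETTINGS: HierarchicalInstructionSettings = {
  hierarchicalInstructions: false,
  instructionFileNames: ["*.instructions.md"],
  maxInstructionDepth: 5,
  stopAtGitRoot: true,
};

// Hypothetical helper: validates raw settings and falls back to the defaults
// for any key that is missing or has the wrong type.
function resolveSettings(raw: Record<string, unknown>): HierarchicalInstructionSettings {
  const prefix = "github.copilot.chat.";
  const enabled = raw[prefix + "hierarchicalInstructions"];
  const names = raw[prefix + "instructionFileNames"];
  const depth = raw[prefix + "maxInstructionDepth"];
  const stop = raw[prefix + "stopAtGitRoot"];
  return {
    hierarchicalInstructions:
      typeof enabled === "boolean" ? enabled : DEFAULT_SETTINGS.hierarchicalInstructions,
    instructionFileNames:
      Array.isArray(names) && names.length > 0 && names.every((n) => typeof n === "string")
        ? (names as string[])
        : [...DEFAULT_SETTINGS.instructionFileNames],
    maxInstructionDepth:
      typeof depth === "number" && Number.isInteger(depth) && depth > 0
        ? depth
        : DEFAULT_SETTINGS.maxInstructionDepth,
    stopAtGitRoot: typeof stop === "boolean" ? stop : DEFAULT_SETTINGS.stopAtGitRoot,
  };
}

console.log(resolveSettings({ "github.copilot.chat.hierarchicalInstructions": true }));
```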

Example Directory Structure

project/
├── .github/
│   └── copilot-instructions.md          # Project-wide defaults
├── backend/
│   ├── backend.instructions.md          # Backend-specific rules
│   ├── auth/
│   │   ├── auth.instructions.md         # Authentication patterns
│   │   └── services/
│   │       ├── services.instructions.md # Service layer patterns
│   │       └── user.service.ts         # <- Working file
│   └── api/
│       ├── api.instructions.md          # API-specific rules
│       └── routes/
├── frontend/
│   ├── frontend.instructions.md         # Frontend-specific rules
│   ├── components/
│   │   ├── ui.instructions.md          # UI component patterns
│   │   └── forms/
│   │       └── form.instructions.md    # Form-specific patterns
│   └── pages/
└── data/
    ├── data.instructions.md             # Data processing rules
    └── pipelines/
        └── etl.instructions.md          # ETL-specific patterns

Benefits of This Approach

1. Proximity Principle

Instructions live close to the code they govern, making them easier to maintain and discover.

2. Natural Inheritance

More specific instructions automatically inherit from general ones and can override them:

Root: "Use TypeScript for all code"
Backend: "Use Express.js patterns" 
Auth: "Always validate JWT tokens"
Services: "Use dependency injection"

3. Reduced Token Usage

Only relevant instructions are loaded:

  • Working in frontend/: ~300 tokens for frontend instructions
  • Working in backend/auth/: ~400 tokens for backend + auth instructions
  • Working in data/pipelines/: ~250 tokens for data + ETL instructions
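
The token figures in this proposal are estimates. As a rough sketch of how such numbers could be compared, the snippet below uses the common approximation of about four characters per token for English text; that ratio is an assumption, not Copilot's actual tokenizer, and both helper names are hypothetical.

```typescript
// Assumed heuristic: roughly 4 characters per token for English prose.
const CHARS_PER_TOKEN = 4;

// Hypothetical helper: approximate token count of an instruction file's text.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Percentage of tokens saved when only the relevant instructions are loaded.
function savingsPercent(allTokens: number, relevantTokens: number): number {
  if (allTokens <= 0) {
    return 0;
  }
  return Math.round(((allTokens - relevantTokens) / allTokens) * 100);
}

// Figures from the Current Behavior and Desired Behavior examples:
// ~2000 tokens for all instructions versus ~400 for the relevant ones.
console.log(savingsPercent(2000, 400)); // 80
```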

4. Team Collaboration

Teams can manage their own instruction files without central coordination:

  • Frontend team maintains frontend/*.instructions.md
  • Backend team maintains backend/*.instructions.md
  • DevOps team maintains deployment-related instructions

5. Gradual Migration

Existing centralized approaches continue to work while teams can gradually adopt hierarchical organization.

Use Cases

Large Monorepos

apps/
├── web-app/
│   ├── react.instructions.md
│   └── src/components/
├── mobile-app/
│   ├── react-native.instructions.md  
│   └── src/screens/
├── api/
│   ├── express.instructions.md
│   └── src/routes/
└── shared/
    ├── shared.instructions.md
    └── utils/

Microservices in Monorepo

services/
├── user-service/
│   ├── service.instructions.md      # Node.js + PostgreSQL patterns
│   └── src/
├── payment-service/
│   ├── service.instructions.md      # Go + Redis patterns  
│   └── cmd/
└── notification-service/
    ├── service.instructions.md      # Python + Kafka patterns
    └── app/

Component Libraries

packages/
├── ui-components/
│   ├── ui.instructions.md           # React component patterns
│   └── src/
├── utils/
│   ├── utils.instructions.md        # Utility function patterns
│   └── src/
└── icons/
    ├── icons.instructions.md        # SVG and icon patterns
    └── src/

Implementation Details

Discovery Algorithm

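import * as fs from 'fs';
import * as path from 'path';

interface InstructionFile {
  path: string;
  level: number; // 0 = same directory as the working file
}

// Number of directory levels between the working file and dir
function getRelativeDepth(filePath: string, dir: string): number {
  const rel = path.relative(dir, path.dirname(filePath));
  return rel === '' ? 0 : rel.split(path.sep).length;
}

function discoverInstructions(
  filePath: string,
  settings: { stopAtGitRoot: boolean }
): InstructionFile[] {
  const instructions: InstructionFile[] = [];
  let currentDir = path.dirname(path.resolve(filePath));

  // Walk up until the filesystem root, which is itself not searched
  while (currentDir !== path.dirname(currentDir)) {
    // Look for instruction files in current directory; readdirSync plus a
    // suffix filter stands in for glob.sync('*.instructions.md') here
    const instructionFiles = fs.readdirSync(currentDir)
      .filter(f => f.endsWith('.instructions.md'))
      .sort();

    // Prepend so the result is ordered root first, most specific last
    instructions.unshift(...instructionFiles.map(f => ({
      path: path.join(currentDir, f),
      level: getRelativeDepth(filePath, currentDir)
    })));

    // Stop at git root if configured
    if (settings.stopAtGitRoot && fs.existsSync(path.join(currentDir, '.git'))) {
      break;
    }

    currentDir = path.dirname(currentDir);
  }

  return instructions;
}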

Inheritance and Merging

  • Additive by default: Instructions from all levels are combined
  • Override mechanism: A dedicated syntax (not yet specified) lets a file override parent instructions
  • Priority order: More specific (deeper) instructions take precedence
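
A minimal sketch of these merging rules follows, using the Root/Backend/Auth/Services example from earlier. Two assumptions are made: precedence is modeled by placing more specific instructions later in the merged text, and the overridesParents flag is a hypothetical stand-in for the override syntax, which this proposal does not define.

```typescript
interface DiscoveredInstruction {
  path: string;
  level: number; // 0 = same directory as the working file; larger = closer to the root
  content: string;
  overridesParents?: boolean; // hypothetical stand-in for the unspecified override syntax
}

// Combines instructions from general to specific. If a file is marked as
// overriding its parents, everything more general than it is dropped.
function mergeInstructions(files: DiscoveredInstruction[]): string {
  const ordered = [...files].sort((a, b) => b.level - a.level);
  let start = 0;
  ordered.forEach((file, index) => {
    if (file.overridesParents) {
      start = index; // the most specific overriding file wins
    }
  });
  return ordered
    .slice(start)
    .map((file) => `[${file.path}]\n${file.content.trim()}`)
    .join("\n\n");
}

const example: DiscoveredInstruction[] = [
  { path: "backend/auth/services/services.instructions.md", level: 0, content: "Use dependency injection" },
  { path: "backend/auth/auth.instructions.md", level: 1, content: "Always validate JWT tokens" },
  { path: "backend/backend.instructions.md", level: 2, content: "Use Express.js patterns" },
  { path: ".github/copilot-instructions.md", level: 3, content: "Use TypeScript for all code" },
];

console.log(mergeInstructions(example));
```

With no override flag set, all four levels are combined, root first. Marking the auth file with overridesParents would keep only the auth and services instructions.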

Caching Strategy

  • Cache instruction discovery results per directory
  • Invalidate cache when instruction files change
  • Lazy load instruction content only when needed
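
The caching strategy could look roughly like the sketch below. The class name and its two injected callbacks are hypothetical; file access is passed in so the example stays independent of any particular file system API. In a VS Code extension, invalidate would presumably be wired to a file watcher.

```typescript
// Hypothetical cache: directory listings are cached per directory, file
// contents are loaded lazily, and both are dropped when a file changes.
class InstructionDiscoveryCache {
  private readonly filesByDirectory = new Map<string, string[]>();
  private readonly contentByPath = new Map<string, string>();
  private readonly listFiles: (dir: string) => string[];
  private readonly readFile: (filePath: string) => string;

  constructor(
    listFiles: (dir: string) => string[],
    readFile: (filePath: string) => string
  ) {
    this.listFiles = listFiles;
    this.readFile = readFile;
  }

  // Instruction files in a directory, listed at most once until invalidated.
  filesIn(dir: string): string[] {
    let files = this.filesByDirectory.get(dir);
    if (files === undefined) {
      files = this.listFiles(dir);
      this.filesByDirectory.set(dir, files);
    }
    return files;
  }

  // File content, read only on first request.
  content(filePath: string): string {
    let text = this.contentByPath.get(filePath);
    if (text === undefined) {
      text = this.readFile(filePath);
      this.contentByPath.set(filePath, text);
    }
    return text;
  }

  // Call when an instruction file is created, changed, or deleted.
  invalidate(changedPath: string): void {
    const slash = changedPath.lastIndexOf("/");
    const dir = slash === -1 ? "." : changedPath.slice(0, slash);
    this.filesByDirectory.delete(dir);
    this.contentByPath.delete(changedPath);
  }
}
```

Keying the listing cache by directory means that sibling files under the same folder share one lookup, which matches the per-directory caching described above.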

Backward Compatibility

Migration Path

  1. Phase 1: New feature is opt-in via hierarchicalInstructions: true
  2. Phase 2: Existing chat.instructionsFilesLocations continues to work
  3. Phase 3: Gradual migration tools and documentation

Fallback Behavior

  • If no hierarchical instructions found, fall back to current behavior
  • Support both approaches simultaneously during transition
  • Clear documentation on migration strategies
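
The fallback rule can be expressed as a small selection function. This is a sketch only: the function and type names are hypothetical, and it assumes "current behavior" means using the files configured through chat.instructionsFilesLocations.

```typescript
interface InstructionSelection {
  source: "hierarchical" | "configured";
  files: string[];
}

// Prefer hierarchically discovered files when the feature is enabled and
// something was found; otherwise keep today's configured locations.
function selectInstructionFiles(
  hierarchicalEnabled: boolean,
  discovered: string[],
  configured: string[]
): InstructionSelection {
  if (hierarchicalEnabled && discovered.length > 0) {
    return { source: "hierarchical", files: discovered };
  }
  return { source: "configured", files: configured };
}

console.log(
  selectInstructionFiles(true, [], [".github/instructions/general.instructions.md"])
);
```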

Alternative Approaches Considered

1. Directory-specific configuration files

Rejected: Too complex, requires learning new config format

2. Workspace-level instruction mapping

Rejected: Still requires central configuration management

3. File-level instruction comments

Rejected: Pollutes code files, not maintainable

4. This hierarchical approach

Chosen: Familiar pattern, proximity principle, natural inheritance

Related Issues

This proposal builds upon and extends existing community requests:

This proposal extends these concepts with hierarchical discovery that walks up the directory tree, providing a more flexible and performance-optimized solution.

Related Patterns in the Ecosystem

This approach follows established patterns from popular development tools:

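  • ESLint: legacy .eslintrc.* files discovered hierarchically (cascading configuration)
  • TypeScript: nearest tsconfig.json found by walking up the tree, with inheritance via extends
  • Prettier: .prettierrc files found by walking up the tree
  • Babel: file-relative .babelrc files layered over the project-wide babel.config.js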
  • Jest: jest.config.js discovery and inheritance

Developers are already familiar with this pattern, making adoption natural.

Community Impact

Immediate Benefits

  • Reduced API costs for organizations using Copilot extensively
  • Better performance due to smaller, more relevant prompts
  • Improved developer experience with instructions close to code

Long-term Benefits

  • Better instruction maintenance through proximity and ownership
  • Natural team boundaries around instruction management
  • Scalable approach for large organizations and open source projects

Environment:

  • VS Code Version: 1.102.3
  • GitHub Copilot Extension Version: 1.350.0
  • GitHub Copilot Chat Extension Version: 0.29.1
  • Operating System: macOS

Would you be willing to contribute to this feature? Yes

zentuit · Aug 01 '25 18:08

This feature request is now a candidate for our backlog. The community has 60 days to upvote the issue. If it receives 20 upvotes we will move it to our backlog. If not, we will close it. To learn more about how we handle feature requests, please see our documentation.

Happy Coding!