
vscode-lm api correct model capabilities

Open · johnib opened this pull request 7 months ago · 1 comment

Description

This PR fixes incorrect context window limits for Claude models when using the VS Code Language Model (LM) provider. Previously, Cline was falling back to a default 128K-token context window and truncating conversations at ~98K tokens for Claude models, rather than using the actual limits reported by GitHub Copilot.

Root Cause: The VS Code LM handler wasn't properly retrieving model information from the GitHub Copilot API, causing it to fall back to default context window limits. Through network traffic analysis, we discovered the actual limits in GitHub Copilot's API responses:

  • Claude 3.5 Sonnet: 90K tokens (not the 128K fallback)
  • Claude 3.7 Sonnet: 200K tokens
  • Claude Sonnet 4: 80K tokens (not the 128K fallback)
  • Claude Opus 4: 80K tokens (not the 128K fallback)

Solution: Implemented a comprehensive hard-coded model registry (VS_CODE_LM_MODEL_REGISTRY) in the VS Code LM handler that provides consistent model information based on GitHub Copilot's actual API responses. The registry includes 35+ models with correct context windows, pricing, and capabilities as reported by GitHub Copilot.
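
To make the shape of the fix concrete, here is a minimal sketch of what a registry entry could look like. The PR doesn't reproduce its exact types here, so the `ModelInfo` interface, the model IDs, and the `maxTokens` values below are illustrative assumptions; only the context window numbers come from the figures quoted above.

```typescript
// Illustrative sketch only — field names, model IDs, and maxTokens values are
// assumptions; the context windows match the limits quoted in this PR.
interface ModelInfo {
	contextWindow: number // total context window reported by GitHub Copilot
	maxTokens: number // output token cap (placeholder values below)
	supportsImages: boolean
}

const VS_CODE_LM_MODEL_REGISTRY: Record<string, ModelInfo> = {
	"claude-3.5-sonnet": { contextWindow: 90_000, maxTokens: 8_192, supportsImages: true },
	"claude-3.7-sonnet": { contextWindow: 200_000, maxTokens: 8_192, supportsImages: true },
	"claude-sonnet-4": { contextWindow: 80_000, maxTokens: 8_192, supportsImages: true },
	"claude-opus-4": { contextWindow: 80_000, maxTokens: 8_192, supportsImages: true },
	// ...the actual registry covers 35+ models
}
```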

Key Changes:

  • Added VS_CODE_LM_MODEL_REGISTRY with GitHub Copilot's actual reported specifications for all models
  • Enhanced the getModel() method to prioritize registry entries over fallback defaults (see the sketch after this list)
  • Added debug logging to show when registry entries are used vs when using VS Code API values
  • Fixed Claude model context windows to use GitHub Copilot's actual limits (90K for 3.5 Sonnet, 80K for Sonnet 4, 200K for 3.7 Sonnet)
  • Maintained backward compatibility with fallback to VS Code API for unknown models
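
As referenced in the list above, a rough sketch of the lookup order: prefer a registry entry, log which path was taken, and fall back to the VS Code API for unknown models. This builds on the `ModelInfo` shape sketched earlier; `maxInputTokens` is a real property of `vscode.LanguageModelChat`, while the function name and return shape here are assumptions, not the PR's actual code.

```typescript
// Sketch of the prioritization described above (not the PR's exact code).
function getModelInfo(modelId: string, apiModel: { maxInputTokens: number }): ModelInfo {
	const entry = VS_CODE_LM_MODEL_REGISTRY[modelId]
	if (entry) {
		console.debug(`[vscode-lm] using registry entry for ${modelId}`)
		return entry
	}
	// Unknown model: keep the pre-existing behavior of trusting the VS Code API.
	console.debug(`[vscode-lm] no registry entry for ${modelId}, using VS Code API values`)
	return { contextWindow: apiModel.maxInputTokens, maxTokens: -1, supportsImages: false }
}
```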

Test Procedure

Confidence Level: High

  • Changes are isolated to the VS Code LM provider (src/api/providers/vscode-lm.ts)
  • Registry lookup is non-destructive - falls back to existing behavior for unknown models
  • No changes to core context management logic - only provides accurate model specifications
  • Debug logging provides visibility into registry usage vs fallback behavior

Expected Results:

  • Context window display will show correct limits based on GitHub Copilot's actual constraints (see the sketch after this list):
    • Claude 3.5 Sonnet: "X / 90K tokens used" (min of 90K prompt limit and 90K context limit)
    • Claude 3.7 Sonnet: "X / 90K tokens used" (min of 90K prompt limit and 200K context limit)
    • Claude Sonnet 4: "X / 80K tokens used" (min of 80K prompt limit and 80K context limit)
    • GPT-4o: "X / 64K tokens used" (min of 64K prompt limit and 128K context limit)
    • GPT-4o mini: "X / 12K tokens used" (min of 12K prompt limit and 128K context limit)
  • Context truncation will trigger at appropriate thresholds based on actual GitHub Copilot limits instead of generic 128K fallback
  • Debug console will show when registry entries are used vs VS Code API fallback
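
In each case above, the displayed limit is the stricter of the two constraints. A one-line sketch of that computation (function and parameter names are assumptions):

```typescript
// Effective limit = the tighter of the prompt limit and the context window.
function effectiveContextLimit(promptTokenLimit: number, contextWindow: number): number {
	return Math.min(promptTokenLimit, contextWindow)
}

effectiveContextLimit(90_000, 200_000) // Claude 3.7 Sonnet → 90K
effectiveContextLimit(64_000, 128_000) // GPT-4o → 64K
```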

Type of Change

  • [x] 🐛 Bug fix (non-breaking change which fixes an issue)
  • [ ] ✨ New feature (non-breaking change which adds functionality)
  • [ ] 💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • [ ] ♻️ Refactor Changes
  • [ ] 💅 Cosmetic Changes
  • [ ] 📚 Documentation update
  • [ ] 🏃 Workflow Changes

Pre-flight Checklist

  • [x] Changes are limited to a single feature, bugfix or chore (split larger changes into separate PRs)
  • [ ] Tests are passing (npm test) and code is formatted and linted (npm run format && npm run lint)
  • [ ] I have created a changeset using npm run changeset (required for user-facing changes)
  • [ ] I have reviewed contributor guidelines

Screenshots

Before Fix: Context window display showed "45,123 / 128K tokens used (35%)" for Claude models (using fallback defaults), and conversations were truncated at ~98K tokens.

After Fix: Context window display now shows proper GitHub Copilot limits:

  • Claude 3.5 Sonnet: "45,123 / 90K tokens used (50%)"
  • Claude 3.7 Sonnet: "45,123 / 90K tokens used (50%)" (limited by prompt constraint)
  • Claude Sonnet 4: "45,123 / 80K tokens used (56%)"
  • GPT-4o: "45,123 / 64K tokens used (70%)" (limited by prompt constraint)
  • GPT-4o mini: "45,123 / 12K tokens used (376%)" (limited by prompt constraint)

[!IMPORTANT] Fixes incorrect context window limits for Claude models in VS Code LM by implementing a hard-coded model registry with accurate specifications from GitHub Copilot.

  • Behavior:
    • Fixes incorrect context window limits for Claude models in vscode-lm.ts by using actual limits from GitHub Copilot.
    • Implements VS_CODE_LM_MODEL_REGISTRY to provide accurate model specifications for 35+ models.
    • Enhances getModel() to prioritize registry entries over default values.
    • Adds debug logging to indicate registry usage vs fallback.
  • Backward Compatibility:
    • Maintains fallback to VS Code API for unknown models.
  • Misc:
    • Refactors imports in vscode-lm.ts for better organization.

This description was created by Ellipsis for 39868a71db112ce69e31e6929767fd5a46f02495.

johnib · May 30 '25 11:05

🦋 Changeset detected

Latest commit: dd7e1f5b1b209e716528ccbb5dc18876993ed8fd

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package:

| Name       | Type  |
| ---------- | ----- |
| claude-dev | Patch |


changeset-bot[bot] · May 30 '25 11:05

Apparently there are more changes to be made, around shared/api.ts and apiOptions.tsx.

Working on it

johnib · May 31 '25 20:05