opik [issue-4114] [BE] Add model name normalization for dot-based Claude model variants

Details

Fixes cost estimation and vision capability detection for Claude model names that use dot notation (e.g., claude-3.5-sonnet, claude-sonnet-4.5, claude-haiku-4.5-20251001) by normalizing them to the hyphenated format (claude-3-5-sonnet, claude-sonnet-4-5, claude-haiku-4-5-20251001) used in the LiteLLM pricing database.

Problem: Users (especially via LangChain) specify Claude model names with dots (e.g., claude-3.5-sonnet), but the auto-generated pricing database from LiteLLM uses hyphens (e.g., claude-3-5-sonnet). This causes:

❌ Cost estimates to return zero or not display in the UI
❌ Vision capability detection to fail
❌ Inconsistent behavior across frontend and backend

Solution: Applied consistent dot-to-hyphen normalization across 3 locations:

CostService.java (Backend - Cost Calculation)
- Added findModelPrice() method with backwards-compatible fallback logic
- Added normalizeModelName() helper that replaces dots with hyphens and lowercases the name
- Tries exact match first (maintains 100% backwards compatibility)
- Falls back to normalized name if exact match fails
- Added debug logging for troubleshooting
- Handles case variations (e.g., "Claude-3.5-Sonnet" → "claude-3-5-sonnet")

ModelCapabilities.java (Backend - Vision Detection)
- Extended normalize() method to include dot-to-hyphen conversion
- Ensures vision capability detection works with dot notation
- Prevents UI/backend inconsistencies

modelCapabilities.ts (Frontend - UI Vision Checks)
- Updated normalizeModelName() with dot-to-hyphen conversion
- Ensures UI correctly shows vision support for dot-notated models
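
The exact-match-then-normalized lookup described for CostService.java can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the class name ModelPriceLookupSketch, the PRICES map, and the price values are placeholders invented for the example, and the sketch returns an empty Optional for unknown models where the real service is described as returning a zero cost.

```java
import java.math.BigDecimal;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the lookup described above; not the actual CostService code.
final class ModelPriceLookupSketch {

    // Placeholder prices keyed by hyphenated, lowercase names (values are made up).
    static final Map<String, BigDecimal> PRICES = Map.of(
            "claude-3-5-sonnet", new BigDecimal("1.0"),
            "claude-sonnet-4-5", new BigDecimal("2.0"));

    // Lowercases with Locale.ROOT and replaces every dot with a hyphen.
    // Precondition: modelName is non-null (the caller validates first).
    static String normalizeModelName(String modelName) {
        return modelName.toLowerCase(Locale.ROOT).replace('.', '-');
    }

    // Tries the name exactly as given, then falls back to the normalized name.
    static Optional<BigDecimal> findModelPrice(String modelName) {
        if (modelName == null || modelName.isBlank()) {
            return Optional.empty();
        }
        BigDecimal exact = PRICES.get(modelName);
        if (exact != null) {
            return Optional.of(exact); // existing names keep working unchanged
        }
        return Optional.ofNullable(PRICES.get(normalizeModelName(modelName)));
    }
}
```

Checking the exact name first means a name that already matches the pricing database never passes through normalization, which is what keeps the change backwards compatible.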

Key Features:

✅ Backwards Compatible: Existing code continues to work unchanged
✅ Case Insensitive: Handles "Claude-3.5-Sonnet" and "CLAUDE-SONNET-4.5"
✅ Transparent Fallback: No breaking changes to public API
✅ Generic Solution: Handles any Claude or Gemini model name that differs from its pricing database entry only by dots versus hyphens or by letter case
✅ Works with Auto-Generated Files: No modifications to pricing database needed
✅ Consistent Across Stack: Same normalization logic in backend and frontend

Change checklist

[x] User facing
[ ] Documentation update

Issues

Resolves #4114

Testing

Backend Tests

Test coverage grew from 20 to 44 tests:

CostServiceTest: 15 tests (up from 9)

✅ Parameterized test with 13 test cases covering:
- Dot notation normalization (claude-3.5-sonnet → claude-3-5-sonnet)
- Case insensitivity (Claude-3.5-Sonnet, CLAUDE-SONNET-4.5)
- Backwards compatibility (claude-3-5-sonnet still works)
- Unknown model handling (returns zero gracefully)
- Versioned models (claude-sonnet-4.5-20250929)

ModelCapabilitiesTest: 29 tests (up from 11)

✅ Parameterized test with 27 test cases covering:
- Known vision models (GPT-4o, Claude 3.5, Gemini 1.5)
- Non-vision models (GPT-3.5 Turbo, GPT-4 base)
- Case insensitivity, whitespace handling, provider prefixes
- Dot notation (claude-3.5-sonnet, gemini-1.5-pro)
- Pattern matching (Qwen VL models)
- Edge cases (null, blank, unknown models)
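
The categories listed above (case, whitespace, provider prefixes, dot notation, pattern matching, null and blank input) suggest a normalization pipeline along these lines. This is only a sketch under assumptions: the class VisionCapabilitySketch, the VISION_MODELS set, the QWEN_VL pattern, and the choice to strip everything up to the last slash are all hypothetical and are not taken from ModelCapabilities.java.

```java
import java.util.Locale;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical sketch; the set and pattern below are illustrative, not the real capability data.
final class VisionCapabilitySketch {

    // Assumed: known vision models stored in normalized (lowercase, hyphenated) form.
    static final Set<String> VISION_MODELS =
            Set.of("gpt-4o", "claude-3-5-sonnet", "gemini-1-5-pro");

    // Assumed pattern for Qwen VL style names.
    static final Pattern QWEN_VL = Pattern.compile("qwen.*vl");

    // Trims, lowercases, drops a provider prefix such as "anthropic/", converts dots to hyphens.
    static String normalize(String modelName) {
        String cleaned = modelName.trim().toLowerCase(Locale.ROOT);
        int slash = cleaned.lastIndexOf('/');
        String withoutProvider = slash >= 0 ? cleaned.substring(slash + 1) : cleaned;
        return withoutProvider.replace('.', '-');
    }

    // Null and blank names are treated as having no vision support.
    static boolean supportsVision(String modelName) {
        if (modelName == null || modelName.isBlank()) {
            return false;
        }
        String normalized = normalize(modelName);
        return VISION_MODELS.contains(normalized) || QWEN_VL.matcher(normalized).find();
    }
}
```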

All tests passing (44/44):

Tests run: 44, Failures: 0, Errors: 0, Skipped: 0
- CostServiceTest: 15 tests ✅
- ModelCapabilitiesTest: 29 tests ✅

Debug logs confirm normalization working:

Found model price using normalized name. Original: 'claude-sonnet-4.5', Normalized: 'claude-sonnet-4-5'
Found model price using normalized name. Original: 'claude-haiku-4.5', Normalized: 'claude-haiku-4-5'
Found model price using normalized name. Original: 'claude-3.5-sonnet-20241022', Normalized: 'claude-3-5-sonnet-20241022'
Found model price using normalized name. Original: 'Claude-3.5-Sonnet-20241022', Normalized: 'claude-3-5-sonnet-20241022'

Manual Testing

✅ Verified exact model names continue to work (backwards compatibility)
✅ Verified dot-based names now return correct costs
✅ Verified case variations work (Claude-3.5-Sonnet, CLAUDE-SONNET-4.5)
✅ Verified vision detection works for dot-notated models
✅ Verified unknown models still return zero cost gracefully
✅ Verified frontend vision checks work consistently with backend

Code Review Changes

Revision 1: Addressed GitHub Copilot Comments

Simplified normalizeModelName() method
- Removed redundant null check (caller guarantees non-null)
- Documented precondition in JavaDoc
Renamed test for clarity
- calculateCost_shouldHandleMultipleDotsInModelName_issue4114 → calculateCost_shouldHandleUnknownModelWithDotsGracefully_issue4114
- Test name now accurately reflects behavior (graceful handling of unknown models)
- Updated assertion to check for exact zero

Revision 2: Addressed @andrescrz Review Comments ✅

All review comments have been addressed:

✅ Removed @Nullable annotations from private method parameters
- Follows project convention (nullability is assumed for private methods)
✅ Used StringUtils.isBlank for validation
- More robust: handles null, empty strings, and whitespace-only strings
- Changed from modelName == null || provider == null to StringUtils.isBlank(modelName) || StringUtils.isBlank(provider)
✅ Added lowercase normalization
- normalizeModelName() now converts to lowercase using Locale.ROOT
- Handles case variations: "Claude-3.5-Sonnet" → "claude-3-5-sonnet"
- Works for all model providers (Claude, Gemini, etc.)
✅ Added case-insensitive comparison
- Changed from !normalizedModelName.equals(modelName) to !normalizedModelName.equalsIgnoreCase(modelName)
- Ensures normalization is attempted for case-different models
✅ Consolidated duplicate tests into parameterized tests
- CostServiceTest: 6 individual tests → 1 parameterized test with 13 cases
- ModelCapabilitiesTest: 10 individual tests → 1 parameterized test with 27 cases
- Result: Improved maintainability, reduced code duplication
- Benefit: Easier to add new test cases, consistent test structure
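
To illustrate why the validation change in Revision 2 matters, the sketch below contrasts the old null-only guard with a blank-aware guard. It uses only the standard library: the isBlank helper is a hypothetical stand-in that approximates StringUtils.isBlank from Apache Commons Lang, and the class and method names are invented for the example.

```java
// Stdlib-only sketch; isBlank is a hypothetical stand-in for StringUtils.isBlank.
final class BlankCheckSketch {

    // Old guard: rejects only null inputs.
    static boolean rejectedByNullCheck(String modelName, String provider) {
        return modelName == null || provider == null;
    }

    // True for null, the empty string, and whitespace-only strings (String.isBlank needs Java 11+).
    static boolean isBlank(String value) {
        return value == null || value.isBlank();
    }

    // New guard: also rejects empty and whitespace-only inputs.
    static boolean rejectedByBlankCheck(String modelName, String provider) {
        return isBlank(modelName) || isBlank(provider);
    }
}
```

With the old guard, a whitespace-only model name such as "   " would pass validation and reach the price lookup; the blank-aware guard stops it earlier.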

Changes summary:

3 files changed, 118 insertions(+), 170 deletions(-)
Net reduction of 52 lines while increasing test coverage
All 44 tests passing ✅

Documentation

No documentation changes needed as this is an internal fix to the cost calculation and capability detection logic. The public API remains unchanged.

Files Changed

Backend:

apps/opik-backend/src/main/java/com/comet/opik/domain/cost/CostService.java
apps/opik-backend/src/main/java/com/comet/opik/domain/llm/ModelCapabilities.java
apps/opik-backend/src/test/java/com/comet/opik/domain/cost/CostServiceTest.java
apps/opik-backend/src/test/java/com/comet/opik/domain/llm/ModelCapabilitiesTest.java

Frontend:

apps/opik-frontend/src/lib/modelCapabilities.ts

Nov 27 '25 15:11 Nimrod007

Backend Tests Results

322 files  322 suites  49m 49s ⏱️
5 682 tests  5 675 ✅  7 💤  0 ❌
5 648 runs  5 641 ✅  7 💤  0 ❌

Results for commit 5c476333.

Nov 27 '25 16:11 github-actions[bot]

SDK E2E Tests Results

105 tests  104 ✅  5m 16s ⏱️
1 suites  0 💤
1 files  1 ❌

For more details on these failures, see this check.

Results for commit b9af3b2e.

Nov 27 '25 18:11 github-actions[bot]