[issue-4114] [BE] Add model name normalization for dot-based Claude model variants
Details
Fixes cost estimation and vision capability detection for Claude model names that use dot notation (e.g., claude-3.5-sonnet, claude-sonnet-4.5, claude-haiku-4.5-20251001) by normalizing them to the hyphenated format (claude-3-5-sonnet, claude-sonnet-4-5, claude-haiku-4-5-20251001) used in the LiteLLM pricing database.
Problem:
Users (especially via LangChain) specify Claude model names with dots (e.g., claude-3.5-sonnet), but the auto-generated pricing database from LiteLLM uses hyphens (e.g., claude-3-5-sonnet). This causes:
- ❌ Cost estimates to come back as zero or not be displayed in the UI
- ❌ Vision capability detection to fail
- ❌ Inconsistent behavior across frontend and backend
Solution: Applied consistent dot-to-hyphen normalization across 3 locations:
- CostService.java (Backend - Cost Calculation)
  - Added findModelPrice() method with backwards-compatible fallback logic
  - Added normalizeModelName() helper to replace dots with hyphens and lowercase
  - Tries exact match first (maintains 100% backwards compatibility)
  - Falls back to normalized name if exact match fails
  - Added debug logging for troubleshooting
  - Handles case variations (e.g., "Claude-3.5-Sonnet" → "claude-3-5-sonnet")
- ModelCapabilities.java (Backend - Vision Detection)
  - Extended normalize() method to include dot-to-hyphen conversion
  - Ensures vision capability detection works with dot notation
  - Prevents UI/backend inconsistencies
- modelCapabilities.ts (Frontend - UI Vision Checks)
  - Updated normalizeModelName() with dot-to-hyphen conversion
  - Ensures UI correctly shows vision support for dot-notated models
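The lookup order described above (exact match first, normalized name as fallback) can be sketched roughly as follows. This is a minimal illustration, not the actual CostService code: the class name, the PRICES map, and the placeholder prices are assumptions made for the example, and the real lookup may also take the provider into account.

```java
import java.math.BigDecimal;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Illustrative sketch only. ModelPriceLookupSketch and PRICES are hypothetical;
// only the method names findModelPrice/normalizeModelName come from this PR.
class ModelPriceLookupSketch {

    // Stand-in for the auto-generated pricing table. Values are placeholders, not real rates.
    static final Map<String, BigDecimal> PRICES = Map.of(
            "claude-3-5-sonnet", new BigDecimal("1.00"),
            "claude-sonnet-4-5", new BigDecimal("2.00"));

    // Lowercase with a fixed locale, then replace every dot with a hyphen.
    static String normalizeModelName(String modelName) {
        return modelName.toLowerCase(Locale.ROOT).replace('.', '-');
    }

    // Exact match wins, so names that already resolved keep resolving the same way.
    // The normalized name is consulted only when the exact lookup misses.
    static Optional<BigDecimal> findModelPrice(String modelName) {
        BigDecimal exact = PRICES.get(modelName);
        if (exact != null) {
            return Optional.of(exact);
        }
        return Optional.ofNullable(PRICES.get(normalizeModelName(modelName)));
    }

    public static void main(String[] args) {
        System.out.println(findModelPrice("claude-3-5-sonnet").isPresent());  // exact key
        System.out.println(findModelPrice("Claude-3.5-Sonnet").isPresent());  // found via fallback
        System.out.println(findModelPrice("unknown.model").isPresent());      // no price known
    }
}
```

Keeping the exact lookup ahead of the normalized one means a key that legitimately contains a dot in the pricing table would still be found unchanged.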
Key Features:
- ✅ Backwards Compatible: Existing code continues to work unchanged
- ✅ Case Insensitive: Handles "Claude-3.5-Sonnet" and "CLAUDE-SONNET-4.5"
- ✅ Transparent Fallback: No breaking changes to public API
- ✅ Generic Solution: Not tied to a fixed model list; any Claude/Gemini model name that differs from its pricing database key only by dots versus hyphens and letter case is resolved
- ✅ Works with Auto-Generated Files: No modifications to pricing database needed
- ✅ Consistent Across Stack: Same normalization logic in backend and frontend
Change checklist
- [x] User facing
- [ ] Documentation update
Issues
- Resolves #4114
Testing
Backend Tests
Comprehensive test coverage with 44 total tests (increased from 20):
CostServiceTest: 15 tests (consolidated from 9)
- ✅ Parameterized test with 13 test cases covering:
  - Dot notation normalization (claude-3.5-sonnet → claude-3-5-sonnet)
  - Case insensitivity (Claude-3.5-Sonnet, CLAUDE-SONNET-4.5)
  - Backwards compatibility (claude-3-5-sonnet still works)
  - Unknown model handling (returns zero gracefully)
  - Versioned models (claude-sonnet-4.5-20250929)
ModelCapabilitiesTest: 29 tests (consolidated from 11)
- ✅ Parameterized test with 27 test cases covering:
  - Known vision models (GPT-4o, Claude 3.5, Gemini 1.5)
  - Non-vision models (GPT-3.5 Turbo, GPT-4 base)
  - Case insensitivity, whitespace handling, provider prefixes
  - Dot notation (claude-3.5-sonnet, gemini-1.5-pro)
  - Pattern matching (Qwen VL models)
  - Edge cases (null, blank, unknown models)
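To give a feel for the kind of cases listed above, here is a dependency-free, table-driven sketch. It is not the actual ModelCapabilitiesTest (which uses parameterized tests) nor the real ModelCapabilities logic: the class name, the VISION_MODELS set, and the prefix-stripping rule are assumptions chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only. All names and the capability table are hypothetical.
class VisionCapabilityCasesSketch {

    // Hypothetical capability table keyed by normalized (lowercase, hyphenated) names.
    static final Set<String> VISION_MODELS =
            Set.of("gpt-4o", "claude-3-5-sonnet", "gemini-1-5-pro");

    // Assumed normalization: trim, lowercase, drop a "provider/" prefix, dots to hyphens.
    static String normalize(String modelName) {
        String name = modelName.trim().toLowerCase(Locale.ROOT);
        int slash = name.lastIndexOf('/');
        if (slash >= 0) {
            name = name.substring(slash + 1);
        }
        return name.replace('.', '-');
    }

    // Null and blank names are treated as "no vision support" rather than as errors.
    static boolean supportsVision(String modelName) {
        if (modelName == null || modelName.isBlank()) {
            return false;
        }
        return VISION_MODELS.contains(normalize(modelName));
    }

    // One entry per case: input model name -> expected vision support.
    static Map<String, Boolean> cases() {
        Map<String, Boolean> cases = new LinkedHashMap<>();
        cases.put("gpt-4o", true);                       // known vision model
        cases.put("  Claude-3.5-Sonnet ", true);         // case and whitespace
        cases.put("anthropic/claude-3.5-sonnet", true);  // provider prefix
        cases.put("gemini-1.5-pro", true);               // dot notation
        cases.put("gpt-3.5-turbo", false);               // non-vision model
        cases.put("   ", false);                         // blank input
        return cases;
    }

    public static void main(String[] args) {
        cases().forEach((input, expected) -> {
            if (supportsVision(input) != expected) {
                throw new AssertionError("case failed: '" + input + "'");
            }
        });
        System.out.println("all cases passed");
    }
}
```

Adding a new case in this style is a one-line change to the table, which is the maintainability benefit the consolidation into parameterized tests is after.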
All tests passing (44/44):
Tests run: 44, Failures: 0, Errors: 0, Skipped: 0
- CostServiceTest: 15 tests ✅
- ModelCapabilitiesTest: 29 tests ✅
Debug logs confirm normalization working:
Found model price using normalized name. Original: 'claude-sonnet-4.5', Normalized: 'claude-sonnet-4-5'
Found model price using normalized name. Original: 'claude-haiku-4.5', Normalized: 'claude-haiku-4-5'
Found model price using normalized name. Original: 'claude-3.5-sonnet-20241022', Normalized: 'claude-3-5-sonnet-20241022'
Found model price using normalized name. Original: 'Claude-3.5-Sonnet-20241022', Normalized: 'claude-3-5-sonnet-20241022'
Manual Testing
- ✅ Verified exact model names continue to work (backwards compatibility)
- ✅ Verified dot-based names now return correct costs
- ✅ Verified case variations work (Claude-3.5-Sonnet, CLAUDE-SONNET-4.5)
- ✅ Verified vision detection works for dot-notated models
- ✅ Verified unknown models still return zero cost gracefully
- ✅ Verified frontend vision checks work consistently with backend
Code Review Changes
Revision 1: Addressed GitHub Copilot Comments
- Simplified normalizeModelName() method
  - Removed redundant null check (caller guarantees non-null)
  - Documented precondition in JavaDoc
- Renamed test for clarity
  - calculateCost_shouldHandleMultipleDotsInModelName_issue4114 → calculateCost_shouldHandleUnknownModelWithDotsGracefully_issue4114
  - Test name now accurately reflects behavior (graceful handling of unknown models)
  - Updated assertion to check for exact zero
Revision 2: Addressed @andrescrz Review Comments ✅
All review comments have been addressed:
- ✅ Removed @Nullable annotations from private method parameters
  - Follows project convention (nullability is assumed for private methods)
- ✅ Used StringUtils.isBlank for validation
  - More robust: handles null, empty strings, and whitespace-only strings
  - Changed from modelName == null || provider == null to StringUtils.isBlank(modelName) || StringUtils.isBlank(provider)
- ✅ Added lowercase normalization
  - normalizeModelName() now converts to lowercase using Locale.ROOT
  - Handles case variations: "Claude-3.5-Sonnet" → "claude-3-5-sonnet"
  - Works for all model providers (Claude, Gemini, etc.)
- ✅ Added case-insensitive comparison
  - Changed from !normalizedModelName.equals(modelName) to !normalizedModelName.equalsIgnoreCase(modelName)
  - Ensures normalization is attempted for case-different models
- ✅ Consolidated duplicate tests into parameterized tests
  - CostServiceTest: 6 individual tests → 1 parameterized test with 13 cases
  - ModelCapabilitiesTest: 10 individual tests → 1 parameterized test with 27 cases
  - Result: Improved maintainability, reduced code duplication
  - Benefit: Easier to add new test cases, consistent test structure
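Two of the review points above, blank-aware validation and lowercasing with Locale.ROOT, can be illustrated with a small sketch. It is not the CostService code: the class and method names are hypothetical, and a local isBlank helper stands in for StringUtils.isBlank so the example has no dependencies.

```java
import java.util.Locale;

// Illustrative sketch only. ModelNameValidationSketch and canEstimate are hypothetical names.
class ModelNameValidationSketch {

    // Local stand-in for StringUtils.isBlank: true for null, empty, or whitespace-only input.
    static boolean isBlank(String value) {
        return value == null || value.isBlank();
    }

    // A plain null check would let "" and "   " through; the blank check rejects them too.
    static boolean canEstimate(String modelName, String provider) {
        return !isBlank(modelName) && !isBlank(provider);
    }

    // Locale.ROOT keeps lowercasing independent of the JVM's default locale.
    static String toLowerRoot(String modelName) {
        return modelName.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(canEstimate("claude-3.5-sonnet", "anthropic"));
        System.out.println(canEstimate("   ", "anthropic"));

        // Under Turkish casing rules, uppercase 'I' lowercases to dotless 'ı',
        // so a locale-sensitive toLowerCase() could yield a name that matches no key.
        String turkish = "GEMINI-1.5-PRO".toLowerCase(Locale.forLanguageTag("tr"));
        System.out.println(turkish.equals(toLowerRoot("GEMINI-1.5-PRO")));
    }
}
```

The last check is the reason for passing an explicit locale: the result of the lookup should not depend on where the server happens to run.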
Changes summary:
- 3 files changed, 118 insertions(+), 170 deletions(-)
- Net reduction of 52 lines while increasing test coverage
- All 44 tests passing ✅
Documentation
No documentation changes needed as this is an internal fix to the cost calculation and capability detection logic. The public API remains unchanged.
Files Changed
Backend:
- apps/opik-backend/src/main/java/com/comet/opik/domain/cost/CostService.java
- apps/opik-backend/src/main/java/com/comet/opik/domain/llm/ModelCapabilities.java
- apps/opik-backend/src/test/java/com/comet/opik/domain/cost/CostServiceTest.java
- apps/opik-backend/src/test/java/com/comet/opik/domain/llm/ModelCapabilitiesTest.java
Frontend:
- apps/opik-frontend/src/lib/modelCapabilities.ts
Backend Tests Results
322 files, 322 suites, 49m 49s ⏱️
5 682 tests: 5 675 ✅, 7 💤, 0 ❌
5 648 runs: 5 641 ✅, 7 💤, 0 ❌
Results for commit 5c476333.
SDK E2E Tests Results
1 file, 1 suite, 5m 16s ⏱️
105 tests: 104 ✅, 0 💤, 1 ❌
For more details on these failures, see this check.
Results for commit b9af3b2e.