Hebrew Locale Corrupts Subsequent Locale Formatting

Open kodzi opened this issue 5 months ago • 0 comments

Summary

Hebrew locale ('he') corrupts Babel's internal cache, causing subsequent format_date() calls with other locales to return Hebrew-formatted text instead of the requested locale.

Environment

  • Babel version: 2.16.0
  • Python version: 3.12.7
  • Platform: macOS 26.0.1 (arm64)
  • Architecture: 64bit

Description

When babel.dates.format_date() is called with the Hebrew locale ('he'), subsequent calls with other locales (observed with 'no' and 'fr', among others) return Hebrew text instead of text in the requested locale. The corruption persists for the rest of the Python session, until the process is restarted or the cache is cleared.

Steps to Reproduce

Minimal Reproduction Case

from datetime import date

from babel.dates import format_date

# Fixed date so the month name is reproducible (the report was filed in October)
date_obj = date(2025, 10, 14)

# Test German locale (works correctly)
print("German before Hebrew:", format_date(date_obj, 'LLLL', 'de'))
# Output: "Oktober"

# Use Hebrew locale (works correctly)  
print("Hebrew:", format_date(date_obj, 'LLLL', 'he'))
# Output: "אוקטובר"

# Test Norwegian locale (CORRUPTED - returns Hebrew!)
print("Norwegian after Hebrew:", format_date(date_obj, 'LLLL', 'no'))
# Expected: "oktober"
# Actual: "אוקטובר" ❌

# Test French locale (CORRUPTED - returns Hebrew!)
print("French after Hebrew:", format_date(date_obj, 'LLLL', 'fr'))
# Expected: "octobre" 
# Actual: "אוקטובר" ❌

# Test German locale (still works correctly)
print("German after Hebrew:", format_date(date_obj, 'LLLL', 'de'))
# Output: "Oktober" ✅

Expected vs Actual Behavior

  • Expected: Each locale should return text in its own language
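  • Actual: Once 'he' has been used, locales first used afterwards ('no' and 'fr' in the reproduction) return Hebrew text; 'de', which was first used before 'he', is unaffected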
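To check which locales are affected beyond the ones listed, the comparison above can be automated. The following is a minimal sketch, not part of the original report: `find_order_dependent_locales`, `babel_month_name`, and `babel_reset` are hypothetical helper names, and the sketch assumes that clearing `babel.localedata._cache` restores a clean baseline, as the workaround in this report suggests.

```python
from typing import Callable, Dict, Iterable, Tuple


def find_order_dependent_locales(
    format_fn: Callable[[str], str],
    reset_fn: Callable[[], None],
    locales: Iterable[str],
    trigger: str,
) -> Dict[str, Tuple[str, str]]:
    """Return {locale: (clean_output, output_after_trigger)} for every locale
    whose output changes when `trigger` has been formatted first."""
    affected = {}
    for loc in locales:
        reset_fn()               # start from an empty cache
        clean = format_fn(loc)   # baseline output for this locale
        reset_fn()
        format_fn(trigger)       # use the suspect locale first
        after = format_fn(loc)   # then format the locale under test
        if clean != after:
            affected[loc] = (clean, after)
    reset_fn()                   # leave the cache empty when done
    return affected


def babel_month_name(locale: str) -> str:
    """Stand-alone month name via Babel (imported lazily; hypothetical helper)."""
    from datetime import date
    from babel.dates import format_date
    return format_date(date(2025, 10, 14), "LLLL", locale=locale)


def babel_reset() -> None:
    """Clear Babel's private locale data cache, as in the report's workaround."""
    import babel.localedata
    babel.localedata._cache.clear()
```

With Babel installed, `find_order_dependent_locales(babel_month_name, babel_reset, ["de", "no", "fr"], "he")` would list the locales whose output differs after 'he'. Unlike the reproduction, every locale here is loaded after 'he', so the result for 'de' may differ from what the report shows.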

Analysis

Cache Investigation

The issue appears to be related to Babel's internal locale data cache (babel.localedata._cache). The cache keys themselves look normal after each call:

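# Run in a fresh interpreter so the cache starts empty
from datetime import date

import babel.localedata
from babel.dates import format_date

date_obj = date(2025, 10, 14)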

# Before any locale usage
print("Initial cache:", list(babel.localedata._cache.keys()))
# Output: []

# After using German
format_date(date_obj, 'LLLL', 'de')
print("After German:", list(babel.localedata._cache.keys()))
# Output: ['root', 'de']

# After using Hebrew  
format_date(date_obj, 'LLLL', 'he')
print("After Hebrew:", list(babel.localedata._cache.keys()))
# Output: ['root', 'de', 'he']

# After using Norwegian (corrupted)
format_date(date_obj, 'LLLL', 'no')  # Returns Hebrew text!
print("After Norwegian:", list(babel.localedata._cache.keys()))
# Output: ['root', 'de', 'he', 'no']
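The manual key listing above can be narrowed to "which entries did this one call add". A small sketch of that idea follows; `keys_added_by` is a hypothetical helper that works on any dict-like cache and is not a Babel API.

```python
from typing import Any, Callable, List, MutableMapping


def keys_added_by(cache: MutableMapping[str, Any], action: Callable[[], Any]) -> List[str]:
    """Run `action` and return the cache keys that did not exist before it,
    in the order the cache iterates them."""
    before = set(cache)   # snapshot of the keys present before the call
    action()              # e.g. a single format_date() call
    return [key for key in cache if key not in before]
```

Assuming `babel.localedata._cache` is a plain dict, as the key listings in this report suggest, a call such as `keys_added_by(babel.localedata._cache, lambda: format_date(date_obj, 'LLLL', 'no'))` would show which entries the Norwegian call loaded. That may help tell whether an unexpected entry is loaded alongside 'no'.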

Workaround

Clearing the cache after using 'he' avoids the corruption. Note that _cache is a private attribute, so this workaround may break between Babel versions:

from datetime import date

import babel.localedata
from babel.dates import format_date

date_obj = date(2025, 10, 14)

# Use Hebrew locale
format_date(date_obj, 'LLLL', 'he')

# Clear cache to prevent corruption
babel.localedata._cache.clear()

# Now other locales work correctly
print(format_date(date_obj, 'LLLL', 'no'))  # Returns "oktober" ✅
print(format_date(date_obj, 'LLLL', 'fr'))  # Returns "octobre" ✅
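If the workaround has to be applied in several places, it can be wrapped so that the cache is cleared automatically after a call with a problematic locale. This is a sketch only: `format_with_cache_reset` and its `reset_after` parameter are hypothetical names, and the formatter and cache are passed in explicitly so the wrapper makes no assumptions about Babel internals beyond the cache being clearable.

```python
from typing import Any, Callable, Collection, MutableMapping


def format_with_cache_reset(
    format_fn: Callable[..., str],
    cache: MutableMapping[str, Any],
    value: Any,
    fmt: str,
    locale: str,
    reset_after: Collection[str] = ("he",),
) -> str:
    """Call format_fn(value, fmt, locale), then clear `cache` if `locale`
    is one of the locales in `reset_after`."""
    try:
        return format_fn(value, fmt, locale)
    finally:
        # Clear even if formatting raised, so a partly loaded entry is dropped too
        if locale in reset_after:
            cache.clear()
```

With Babel, a call would look like `format_with_cache_reset(format_date, babel.localedata._cache, date_obj, 'LLLL', 'he')`. Clearing the cache forces every locale to be reloaded on its next use, and the wrapper does no locking, so it is likely unsuitable for hot paths or multi-threaded code.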

Posted by kodzi, Oct 14 '25 11:10