regex for language codes does not accept es-419
Describe the bug
When using Chainlit, setting Google Chrome's language to "español (Latinoamérica)" causes the application to fail with a 422 Unprocessable Entity error. The issue arises because Chainlit's language validation pattern does not accept the language code es-419, which corresponds to "español (Latinoamérica)". As a result, the application is unable to load translations and settings, preventing it from functioning properly.
To Reproduce
Steps to reproduce the behavior:
- Set Google Chrome Language to "español (Latinoamérica)":
  - Open Google Chrome.
  - Click on the three dots in the upper-right corner and select Configuración (Settings).
  - Scroll down and click on Configuración avanzada (Advanced) to expand advanced settings.
  - Under Idiomas (Languages), click on Idioma (Language).
  - Click Añadir idiomas (Add languages) and select "Español (Latinoamérica)".
  - Click on the three dots next to "Español (Latinoamérica)" and select Mover al principio (Move to the top) to make it the default language.
  - Restart Chrome to apply the changes.
- Run a Chainlit Application:
  - Open a terminal and run `chainlit hello` to start a basic Chainlit application.
- Open the Application:
  - In Google Chrome, navigate to `http://localhost:8000`.
- Observe the Error:
  - The application fails to load properly.
  - Open Chrome's developer console (press F12 or right-click and select Inspeccionar (Inspect), then go to the Console tab).
  - Notice multiple `422 Unprocessable Entity` errors related to requests to `/project/translations` and `/project/settings` with the query parameter `language=es-419`.
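The failing requests can probably also be triggered without changing Chrome's language by calling the two endpoints directly. The snippet below is a minimal sketch, assuming the app from `chainlit hello` is listening on `http://localhost:8000` and that both endpoints answer plain unauthenticated GET requests. The helper names `build_url` and `probe` are made up for illustration and are not part of Chainlit.

```python
import urllib.error
import urllib.parse
import urllib.request

# Port and endpoint paths come from the report above; adjust if your app differs.
BASE_URL = "http://localhost:8000"
ENDPOINTS = ("/project/translations", "/project/settings")


def build_url(path: str, language: str, base_url: str = BASE_URL) -> str:
    """Build the request URL with the `language` query parameter."""
    query = urllib.parse.urlencode({"language": language})
    return f"{base_url}{path}?{query}"


def probe(path: str, language: str) -> int:
    """Send a GET request and return the HTTP status code.

    Raises urllib.error.URLError if the server is not reachable.
    """
    try:
        with urllib.request.urlopen(build_url(path, language), timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions; the status code is what we want.
        return err.code


if __name__ == "__main__":
    for path in ENDPOINTS:
        for language in ("es-419", "es-ES"):
            print(path, language, probe(path, language))
```

If the behavior matches this report, the `es-419` rows should show status 422 while the `es-ES` rows do not.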
Expected behavior
Chainlit should accept the es-419 language code corresponding to "español (Latinoamérica)" and load the appropriate translations if available. If translations for es-419 are not available, the application should gracefully fall back to a default language (e.g., es for general Spanish or en for English) without causing errors. The application should load normally and be fully functional regardless of the browser's language settings.
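The fallback described here could look roughly like the following. This is only a sketch of the idea, not Chainlit's actual code: `fallback_chain` and `resolve_language` are hypothetical helpers, and the set of available languages is assumed to come from whatever translation files the project ships.

```python
from typing import Iterable, List


def fallback_chain(tag: str, default: str = "en") -> List[str]:
    """List candidate codes from most to least specific, ending with the default."""
    parts = tag.split("-")
    # "es-419" -> ["es-419", "es"]; each step drops the last subtag.
    candidates = ["-".join(parts[:i]) for i in range(len(parts), 0, -1)]
    if default not in candidates:
        candidates.append(default)
    return candidates


def resolve_language(tag: str, available: Iterable[str], default: str = "en") -> str:
    """Pick the first candidate that has a translation; otherwise return the default."""
    available_set = set(available)
    for candidate in fallback_chain(tag, default):
        if candidate in available_set:
            return candidate
    # Nothing matched: return the default so the caller never fails on an unknown tag.
    return default


if __name__ == "__main__":
    print(resolve_language("es-419", ["en", "es"]))
```

With only `en` and `es` translations available, a request for `es-419` resolves to `es` instead of producing an error.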
Screenshots
Desktop (please complete the following information):
- OS: Windows 10
- Browser: Google Chrome
- Version: 128.0.6613.138 (Official Build) (64-bit)
Smartphone (please complete the following information):
Not applicable.
Additional context
- Error Details:
  The server returns the following error message:

      {
        "detail": [
          {
            "type": "string_pattern_mismatch",
            "loc": ["query", "language"],
            "msg": "String should match pattern '^[a-zA-Z]{2,3}(-[a-zA-Z]{2,3})?(-[a-zA-Z]{2,8})?(-x-[a-zA-Z0-9]{1,8})?$'",
            "input": "es-419",
            "ctx": {
              "pattern": "^[a-zA-Z]{2,3}(-[a-zA-Z]{2,3})?(-[a-zA-Z]{2,8})?(-x-[a-zA-Z0-9]{1,8})?$"
            }
          }
        ]
      }

- Cause of the Issue:
  The error occurs because Chainlit's validation regex for the `language` query parameter does not accept numeric region codes like `419`. The regex pattern only allows alphabetic characters in the region and variant parts, so `es-419` (which corresponds to "español (Latinoamérica)") is rejected.
- Impact:
  Users with Google Chrome set to "español (Latinoamérica)" cannot load Chainlit applications properly, affecting accessibility for Spanish-speaking users in Latin America and the Caribbean.
- Workaround:
  Changing Chrome's language setting to general Spanish (`es`) or Spanish (Spain) (`es-ES`) allows the application to load correctly. However, this is not an ideal solution for end-users who prefer "español (Latinoamérica)".
- Suggested Fix:
  - Modify the Validation Regex:
    Update the regex pattern in Chainlit's code to accept numeric region codes. For example:
    `^[a-zA-Z]{2,3}(-[a-zA-Z0-9]{2,3})?(-[a-zA-Z0-9]{2,8})?(-x-[a-zA-Z0-9]{1,8})?$`
    This change allows numeric values in the region and variant parts, accommodating language codes like `es-419`.
  - Graceful Fallback:
    Implement logic to default to a base language (e.g., `es`) if a specific regional variant is not supported. If `es-419` translations are not available, Chainlit should use `es.json` or `en.json` without causing errors.
- References:
  - Chainlit Documentation:
    The documentation mentions that translation files are named after the language code and that the language is dynamically set based on the browser's language. However, it does not specify limitations regarding numeric region codes.
  - IETF Language Tags:
    According to the IETF BCP 47 standard, language tags like `es-419` are valid and commonly used to represent regional variations.
- Additional Notes:
  - Reproducing the Issue: The issue was observed exclusively on Google Chrome with the language set to "español (Latinoamérica)". Other browsers were not tested.
  - Translation Files: Attempting to add an `es-419.json` translation file did not resolve the issue due to the validation pattern rejecting the `es-419` code.
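To make the cause and the suggested fix concrete, the two patterns can be compared with Python's `re` module. The first pattern is copied from the 422 error message and the second from the suggested fix above. The `accepted` helper is only for illustration, and this sketch checks the regexes in isolation rather than through Chainlit's own request validation.

```python
import re

# Pattern reported in the server's 422 response.
CURRENT_PATTERN = re.compile(
    r"^[a-zA-Z]{2,3}(-[a-zA-Z]{2,3})?(-[a-zA-Z]{2,8})?(-x-[a-zA-Z0-9]{1,8})?$"
)

# Pattern proposed in this issue: digits allowed in the region and variant parts.
SUGGESTED_PATTERN = re.compile(
    r"^[a-zA-Z]{2,3}(-[a-zA-Z0-9]{2,3})?(-[a-zA-Z0-9]{2,8})?(-x-[a-zA-Z0-9]{1,8})?$"
)


def accepted(pattern: "re.Pattern[str]", tag: str) -> bool:
    """Return True if the whole language tag matches the pattern."""
    return pattern.fullmatch(tag) is not None


if __name__ == "__main__":
    for tag in ("es", "es-ES", "es-419"):
        print(tag, accepted(CURRENT_PATTERN, tag), accepted(SUGGESTED_PATTERN, tag))
```

Both patterns accept `es` and `es-ES`, but only the suggested one accepts `es-419`. The suggested pattern is also looser than BCP 47, where a region subtag is either two letters or three digits: it accepts mixed tokens such as `es-4a9`. A stricter region group such as `(-([a-zA-Z]{2}|[0-9]{3}))?` might be preferable if that matters.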