`Locale` argument conversion not setting `language` and `country` properly
I am trying to write a parameterized test for a service that uses a resource bundle under the hood, loading the resource bundle with ResourceBundle.getBundle(String, Locale).
The test gets several Locale values in a @CsvSource annotation:
The documentation does not have an example for a country-based locale so I initially tried with:
@CsvSource({
"en_FR, value1",
"en_GB, value2",
"fr_FR, value3",
"fr_MC, value4",
"it_IT, value5",
...
})
void test(Locale locale, String expected) {
// call service using the resource bundle
// compare result with `expected`
}
All worked fine on Windows (local environment) but failed on Linux (CI).
After some digging, on Windows everything works because it's a case-insensitive file system, so for example the en_FR value is converted to a en_fr language-only locale (i.e., without country) and it successfully matches the file ending with en_FR because of case insensitivity.
Looking at the code, I think the problem is here:
https://github.com/junit-team/junit5/blob/4288cf1233b4af6002e922cb9ddd1b159fe2c9b3/junit-jupiter-params/src/main/java/org/junit/jupiter/params/converter/DefaultArgumentConverter.java#L265
which uses the Locale(String) constructor. The constructor Javadoc also explains the different behavior based on case sensitivity:
This constructor normalizes the language value to lowercase.
Looking at the test cases:
https://github.com/junit-team/junit5/blob/4288cf1233b4af6002e922cb9ddd1b159fe2c9b3/junit-jupiter-params/src/test/java/org/junit/jupiter/params/converter/DefaultArgumentConverterTests.java#L222-L223
Probably the second one is invalid. I would have expected that to be:
assertConverts("en_us", Locale.class, Locale.US);
which does not pass with the current implementation.
Steps to reproduce
As the documentation does not cover conversion of locales with country, I am not sure what would be the "right" test case to demonstrate the issue.
Assuming the IETF BCP 47 language tag format would be the right format for input values, this test fails already at the first assertion:
@ParameterizedTest
@ValueSource(strings = { "en-US" })
void test(Locale locale) {
assertEquals("en", locale.getLanguage());
assertEquals("US", locale.getCountry());
assertEquals(Locale.US, locale);
}
Workaround
Currently, I'm using a custom ArgumentConverter which delegates the conversion to Locale::forLanguageTag.
Context
- Used versions (Jupiter/Vintage/Platform): 5.8.2
- Build Tool/IDE: Intellij IDEA / Maven
Deliverables
- [ ]
DefaultArgumentConverterproperly converts IETF BCP 47 strings - [ ] The documentation has an example using the IETF BCP 47 language tag format
I propose to change the implementation at:
https://github.com/junit-team/junit5/blob/4288cf1233b4af6002e922cb9ddd1b159fe2c9b3/junit-jupiter-params/src/main/java/org/junit/jupiter/params/converter/DefaultArgumentConverter.java#L265
replacing Locale::new with Locale::forLanguageTag.
Happy to work on it if accepted.
@scordio IIUC that could potentially break existing tests that (wrongly or not) rely on the Locale(String) constructor being called, right?
@marcphilipp yes, but only if they specify a country and/or variant in the string. They would be broken like the second test case you currently have in the codebase.
However, such tests would have today a "wrong" Locale instance like the one in the reproducer and I'm not sure if they exist in reality. Up to you to judge if fixing this deserves a breaking change.
Tests that are specifying only a "plain" language (i.e., without country, variant, etc.) are safe from my point of view, which might be the most common use case.
Looking at Java 19, the Locale(String) constructor has been deprecated together with the other constructors and Locale::forLanguageTag is one of the suggested ways to obtain a Locale object.
In case you prefer to keep the current behavior thus avoiding a breaking change, an alternative could be an opt-in argument converter like @JavaTimeConversionPattern.
A few ideas about naming:
@LanguageTag@LanguageTagConversion@LanguageTagLocale
The implicit conversion could then be changed in the next major version of JUnit.
Looking at Java 19, the
Locale(String)constructor has been deprecated together with the other constructors andLocale::forLanguageTagis one of the suggested ways to obtain aLocaleobject.
We (JUnit Pioneer) also changed our implementation as the old constructor is deprecated, see JUnit Pioneer issue and JUnit Pioneer PR. Why do I note this here? Because some of the builder methods validates the input against a valid pattern - which the old constructor did not. So you might have the same discussion as we if you see this as a breaking change or not.
Thanks for the pointers, @Bukama!
As far as I can see, JUnit Pioneer favored the builder only when the fine-grained parameters are specified, otherwise Locale::forLanguageTag is used.
Do you see any issue with using Locale::forLanguageTag to solve the issue with the JUnit argument conversion?
Do you see any issue with using
Locale::forLanguageTagto solve the issue with the JUnit argument conversion?
I would not expect any issues as this method does not do a syntax check (as far as I understand the docs)
In case you prefer to keep the current behavior thus avoiding a breaking change, an alternative could be an opt-in argument converter like
@JavaTimeConversionPattern.A few ideas about naming:
* `@LanguageTag` * `@LanguageTagConversion` * `@LanguageTagLocale`The implicit conversion could then be changed in the next major version of JUnit.
As mentioned in https://github.com/junit-team/junit5/issues/2702#issuecomment-1518632314, a custom argument converter could be a candidate for JUnit Pioneer.
Team decision: Since the Locale constructor is not deprecated for removal, there's no urgency to change the existing behavior. Since switching from Locale::new to Locale::forLanguageTag would likely break existing tests, we would need to make it configurable one way or another. For now, we'll wait for additional interest from the community.
During the team discussion (resulting in the above team decision), I mentioned that Spring Framework dealt with this issue by introducing "lenient" Locale parsing in order to support both legacy locale formats as well as BCP 47 formats.
- https://github.com/spring-projects/spring-framework/issues/20736
- See
StringUtils.parseLocale()andStringUtils.parseLocaleString()for details on the solution.
Hi @marcphilipp and @sbrannen, I was going to submit a feature request to JUnit Pioneer for a custom argument converter, but stopped because I'm guessing that having the behavior configurable in JUnit would probably make the Pioneer converter obsolete.
I understand you're waiting for additional interest. I am definitely interested in having a solution, one way or another, but I also understand a single person's need doesn't count as a community interest 😅
I'm happy to help with the changes, but some initial design would of course be required.
In case you're open to discussing the solution further, here is my proposal: as I can hardly imagine test suites where you want to have a combination of the old and new conversion behavior for Locale, I would introduce a configuration parameter in the style of junit.jupiter.tempdir.scope, perhaps junit.jupiter.params.arguments.conversion.locale=default|lenient.
In case you don't want to invest more in this right now, that's totally fine and understood!
@scordio Wouldn't it rather be "ISO 639" vs. "BCP 47"?
@marcphilipp I guess you're referring to the values of the configuration property, right?
If the change would be the one I mentioned at https://github.com/junit-team/junit5/issues/3141#issuecomment-1410675040, yes, "ISO 639" vs. "BCP 47" would make more sense as property values.
In the previous comment, I proposed default vs. lenient following the example @sbrannen described at https://github.com/junit-team/junit5/issues/3141#issuecomment-1527426500 but probably it's not needed to have a mode supporting both at the same time (I actually mentioned the same in my previous comment... fighting with myself, it seems 🤦).
Just as a last comment, I was reading again the Javadoc of Spring's StringUtils.parseLocale(). Overall, Spring supports three variants:
Locale'stoString()format (e.g., "en", "en_US")Locale'stoString()format with spaces as separators (e.g., "en US")- BCP 47 (e.g. "en-US")
I tested how the two solutions would perform with such use cases:
Locale::getLanguage |
Locale::getCountry |
Locale::toString |
|
|---|---|---|---|
new Locale("en") |
en |
|
en |
Locale.forLanguageTag("en") |
en |
|
en |
new Locale("en_US") |
en_us |
|
en_us |
Locale.forLanguageTag("en_US") |
|
|
|
new Locale("en-US") |
en-us |
|
en-us |
Locale.forLanguageTag("en-US") |
en |
US |
en_US |
new Locale("en US") |
en us |
|
en us |
Locale.forLanguageTag("en US") |
|
|
|
Neither of the underscore and space use cases are adequately supported by either new Locale(String) or Locale.forLanguageTag(String). However, I don't know if JUnit should support these use cases at all.
My proposal would be to not support them as they are not backed by any standard.
Thoughts?
@marcphilipp if you don't have objections, I'll compose a PR in the next few days for this topic.
SGTM. Sorry for the delayed response. I was out for a few days.