
Make locale use more consistent with the Java Locale API

Open abeshenov opened this issue 3 years ago • 0 comments

The way README.md suggests using locales is inconsistent with the Java Locale API. It gives this example:

new Faker(new Locale("en-us")).address().zipCodeByState("CA");

Here new Locale("en-us") creates a malformed locale whose language is the whole string en-us and whose country is empty. This is not the same as Locale.US, which is new Locale("en", "US"), i.e. language en and country US.
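To make the difference concrete, here is a small sketch that uses only java.util.Locale (no datafaker involved). The class name LocaleMismatchDemo and the helper describe are made up for illustration. It also shows Locale.forLanguageTag, which is the standard way to parse a hyphenated tag such as en-us.

```java
import java.util.Locale;

class LocaleMismatchDemo {
    // Hypothetical helper: shows how java.util.Locale split the input into language and country.
    static String describe(Locale locale) {
        return "language=" + locale.getLanguage() + ", country=" + locale.getCountry();
    }

    @SuppressWarnings("deprecation") // the Locale constructors are deprecated since Java 19
    public static void main(String[] args) {
        Locale fromReadme = new Locale("en-us");        // the whole string becomes the language
        Locale parsed = Locale.forLanguageTag("en-us"); // parsed as a BCP 47 language tag

        System.out.println(describe(fromReadme));           // language=en-us, country=
        System.out.println(describe(Locale.US));            // language=en, country=US
        System.out.println(describe(parsed));               // language=en, country=US
        System.out.println(fromReadme.equals(Locale.US));   // false
        System.out.println(parsed.equals(Locale.US));       // true
    }
}
```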

Faker appears to normalize such malformed locales:

var locale = new Locale("en-us");  // Incorrect locale
var faker = new Faker(locale);

faker.getLocale();  // The correct Locale("en", "US")

This behavior is not documented in the Javadoc. The README.md could instead use one of these two examples:

new Faker(Locale.US).address().zipCodeByState("CA");
// OR
new Faker(new Locale("en", "US")).address().zipCodeByState("CA");

and the list of supported locales could follow the same format as used by Locale::toString, with underscores _ instead of hyphens -:

- ar
- bg
- ca
- ca_CAT
- cs
- da_DK
. . . . .
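For reference, a minimal sketch of the two string forms that java.util.Locale itself produces. The class LocaleFormatDemo and the helper toUnderscoreForm are hypothetical names, and I have only checked this for plain language and language-country tags; entries such as ca_CAT that are not standard tags may need separate handling.

```java
import java.util.Locale;

class LocaleFormatDemo {
    // Hypothetical helper: converts a hyphenated tag such as "da-DK"
    // into the underscore form produced by Locale::toString.
    static String toUnderscoreForm(String languageTag) {
        return Locale.forLanguageTag(languageTag).toString();
    }

    public static void main(String[] args) {
        System.out.println(toUnderscoreForm("da-DK"));  // da_DK
        System.out.println(toUnderscoreForm("bg"));     // bg
        System.out.println(Locale.US.toString());       // en_US (underscore form)
        System.out.println(Locale.US.toLanguageTag());  // en-US (hyphenated BCP 47 form)
    }
}
```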

Another inconsistency with the Java Locale API is that java.util.Locale uses ISO 3166 alpha-2 country codes, while datafaker does not. Not every locale supported by datafaker can be mapped to an ISO 3166 alpha-2 code, but datafaker could recognize the alpha-2 codes wherever a mapping exists:

var fakerNp = new Faker(new Locale("en", "NP"));
var fakerNep = new Faker(new Locale("en", "NEP"));

fakerNp.getLocale().getDisplayCountry();  // "Nepal"
fakerNep.getLocale().getDisplayCountry();  // "NEP"

fakerNp.name().fullName();  // "Clora Douglas"
fakerNep.name().fullName();  // "Laxmi Basynat"

As you can see, the Java API is not aware of NEP, while datafaker is not aware of NP. The same issue arises for IN and IND.
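The Java side of this can be checked without datafaker. The sketch below asks java.util.Locale which two-letter country codes it knows; the class IsoCountryCheckDemo and the helper isIsoAlpha2 are made-up names, and this says nothing about how datafaker resolves its own codes.

```java
import java.util.Arrays;
import java.util.Locale;

class IsoCountryCheckDemo {
    // Hypothetical helper: true if the code is one of the two-letter
    // country codes returned by Locale.getISOCountries().
    static boolean isIsoAlpha2(String countryCode) {
        return Arrays.asList(Locale.getISOCountries()).contains(countryCode);
    }

    public static void main(String[] args) {
        System.out.println(isIsoAlpha2("NP"));  // true
        System.out.println(isIsoAlpha2("NEP")); // false
        System.out.println(isIsoAlpha2("IN"));  // true
        System.out.println(isIsoAlpha2("IND")); // false
    }
}
```

A check like this could serve as a first step before falling back to datafaker-specific codes, if the maintainers decide to go that way.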

abeshenov, Jul 30 '22 21:07