globalize
Month suffixes in dateFormatter are doubled in Japanese
I don't know the exact source, but here's roughly a reduced test case:
// Setup
const Globalize = require('globalize');
Globalize.load( require( "cldr-data" ).entireSupplemental() );
Globalize.load( require( "cldr-data" ).entireMainFor('ja') );
Globalize.locale('ja-JP');
const formatterOptions = {
  skeleton: 'MMMMEEEEc',
};
// Create a japanese formatter.
const f = Globalize.dateFormatter(formatterOptions);
// Format a japanese date
const japaneseDate = f.format(new Date());
// Setup en-US data
Globalize.load( require( "cldr-data" ).entireMainFor('en') );
Globalize.locale('en-US');
// Create an english formatter
const f2 = Globalize.dateFormatter(formatterOptions);
const englishDate = f2.format(new Date());
console.log(japaneseDate);
// 8月月8日(火曜日)
console.log(englishDate);
// Tuesday, August 8
The symbol in Japanese for month is 月, so in the Japanese output we're outputting the month symbol twice.
Expected: 8月8日(火曜日)
Actual: 8月月8日(火曜日)
But you'll note that this is not the case for the English date (probably since there's no concept of a month symbol). I assume any locale with a month symbol would be affected by this.
In the case of Japanese, I can temporarily work around this by using MMMMM instead of MMMM (five Ms instead of four). Five Ms actually means 'shorthand', even though 5 > 4. In Japanese, 'month' is only a single character, so the 'shorthand' is the same as the 'longhand', and with the skeleton MMMMMEEEEc I actually get the expected value out. This would not work for a language whose month suffix differs between longhand and shorthand, though.
It also means I have to change the skeleton based on the current locale, which defeats the purpose of the generic formatter a little bit.
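To make that cost concrete, here is a minimal sketch of the per-locale switch this workaround forces on calling code. The helper name `skeletonFor` is made up for illustration, and the assumption that only `ja` needs the five-M variant comes from this report alone; other locales with a month suffix may need different handling.

```javascript
// Hypothetical helper, not part of Globalize: pick a skeleton per locale
// so that Japanese avoids the doubled month suffix.
function skeletonFor(locale) {
  // Assumption: only the 'ja' language needs the five-M workaround.
  const language = locale.split('-')[0];
  return language === 'ja' ? 'MMMMMEEEEc' : 'MMMMEEEEc';
}

console.log(skeletonFor('ja-JP')); // MMMMMEEEEc
console.log(skeletonFor('en-US')); // MMMMEEEEc
```

The result would then be passed as the `skeleton` option to `Globalize.dateFormatter`, which is exactly the locale-specific branching a generic formatter is supposed to make unnecessary.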
Hopefully that's a helpful starting place.
Best, Alex
Hi @SlexAxton, thanks for filing this issue.
The source of the problem is that {skeleton: "MMMMEEEEc"} is being wrongly resolved into the pattern MMMM月d日(EEEE). I believe the source of the confusion isn't the globalize code but the CLDR data; I explain it in detail in the CLDR ticket I just filed: http://unicode.org/cldr/trac/ticket/10540
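To illustrate why that resolved pattern doubles the suffix, here is a small self-contained sketch. It is not Globalize's implementation: `renderPattern` and the two name tables are hypothetical stand-ins, and it assumes the Japanese wide month names already carry the 月 suffix (1月 through 12月), so a literal 月 after an MMMM field repeats it.

```javascript
// Hypothetical mini-renderer for illustration only, not Globalize code.
// Assumption: wide month names in the ja locale already end in 月.
const jaWideMonths = Array.from({ length: 12 }, (_, i) => `${i + 1}月`);
const jaWideWeekdays = ['日曜日', '月曜日', '火曜日', '水曜日', '木曜日', '金曜日', '土曜日'];

function renderPattern(pattern, date) {
  // Replace each run of M, d, or E with its field value;
  // every other character is treated as a literal.
  return pattern.replace(/M+|d+|E+/g, (field) => {
    switch (field) {
      case 'M': return String(date.getMonth() + 1);
      case 'MMMM': return jaWideMonths[date.getMonth()];
      case 'd': return String(date.getDate());
      case 'EEEE': return jaWideWeekdays[date.getDay()];
      default: return field; // widths not needed for this example
    }
  });
}

const date = new Date(2017, 7, 8); // 8 August 2017, a Tuesday
const doubled = renderPattern('MMMM月d日(EEEE)', date);
const expected = renderPattern('M月d日(EEEE)', date);
console.log(doubled);  // 8月月8日(火曜日)
console.log(expected); // 8月8日(火曜日)
```

With a numeric M the literal 月 supplies the suffix once; widening that field to MMMM makes the month name and the literal both contribute it.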
By the way, a quick workaround would be to amend the CLDR data like this:
// Quick workaround for now.
Globalize.load({
  "main": {
    "ja": {
      "dates": {
        "calendars": {
          "gregorian": {
            "dateTimeFormats": {
              "availableFormats": {
                "MMMEEEEd": "MMMd日EEEE"
              }
            }
          }
        }
      }
    }
  }
});
Globalize('ja').dateFormatter({skeleton: "MMMMEEEEd"})(new Date());
// > '8月9日水曜日'
Globalize('en').dateFormatter({skeleton: "MMMMEEEEd"})(new Date());
// > 'Wednesday, August 9'
Note that you should use the MMMMEEEEd skeleton (note the d instead of c).
Why d instead of c, if I may ask?
Sure, I assume you want:
- MMMM: long month name
- EEEE: long weekday name
- d: the numeric day of the month
Looking at your CLDR ticket, would it make sense to add the missing pattern "MMMM": "M月" to dateTimeFormats instead of adding complete skeletons?
Note that this also applies to "MMM" dates and to Chinese:
const ianaTzData = require('iana-tz-data');
const Globalize = require("globalize");
const cldrData = require('./vendor/cldr-data.json');
Globalize.loadTimeZone(ianaTzData);
Globalize.load(cldrData);
function printDateTime(locale, skeleton) {
  const dateFormatter = Globalize(locale).dateFormatter({ skeleton: skeleton });
  console.log(locale + '\t', skeleton + '\t', dateFormatter(new Date()));
}
printDateTime('ja', 'yyyyMMMMdjmm');
printDateTime('ja', 'yyyyMMMdjmm');
printDateTime('zh', 'yyyyMMMMdjmm');
printDateTime('zh', 'yyyyMMMdjmm');
// ja yyyyMMMMdjmm 2017年9月月7日 13:33
// ja yyyyMMMdjmm 2017年9月月7日 13:33
// zh yyyyMMMMdjmm 2017年九月月7日 下午1:33
// zh yyyyMMMdjmm 2017年9月月7日 下午1:33
According to comments on the unicode ticket that was opened, the data is fine, but the rules of the spec have not been followed by the implementation. Comment 3 indicates clarification was added to the spec.
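As I read that clarification, the width of a pattern field is normally adjusted to match the requested skeleton, but a numeric month field in the pattern (M or MM) should be left alone when the matched skeleton id used an alphabetic month (MMM or longer). The sketch below is one possible reading of that rule, not the spec text and not Globalize's code; `adjustMonthWidth` is a hypothetical helper, it only handles the month field, and it ignores quoted literals.

```javascript
// Hypothetical helper illustrating one reading of the skeleton-matching rule.
// requestedSkeleton: what the caller asked for, e.g. 'MMMMEEEEd'
// matchedId: the availableFormats key that matched best, e.g. 'MMMEEEEd'
// pattern: the pattern stored under that key
function adjustMonthWidth(requestedSkeleton, matchedId, pattern) {
  const requested = (requestedSkeleton.match(/M+/) || [''])[0];
  const matched = (matchedId.match(/M+/) || [''])[0];
  return pattern.replace(/M+/g, (field) => {
    const idIsAlphabetic = matched.length >= 3; // MMM, MMMM, MMMMM
    const patternIsNumeric = field.length <= 2; // M, MM
    // Alphabetic id mapped to a numeric pattern field: keep it numeric.
    if (idIsAlphabetic && patternIsNumeric) return field;
    // Otherwise widen or narrow the field to the requested width.
    return requested || field;
  });
}

const jaPattern = adjustMonthWidth('MMMMEEEEd', 'MMMEEEEd', 'M月d日EEEE');
const enStylePattern = adjustMonthWidth('MMMMEEEEd', 'MMMEEEEd', 'EEEE, MMM d');
console.log(jaPattern);      // M月d日EEEE
console.log(enStylePattern); // EEEE, MMMM d
```

Under this reading the Japanese pattern keeps its numeric month, so the literal 月 appears only once, while an English-style pattern (used here purely as an illustration) still widens to the full month name.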
So what's the status of this one? Is it a CLDR issue?