globalize
General purpose API for converting regular numerals to Hebrew ones
According to Hebrew cultural conventions, numbers are in some cases represented by letters of the Hebrew alphabet; i.e., instead of Arabic digits, the Hebrew characters with the equivalent numeric value are used, for example:
- א in place of 1
- ב in place of 2
- ...
- י in place of 10
- כ in place of 20
- ...
- י"א in place of 11
- כ"א in place of 21
(However, this rule isn't that straightforward, since some combinations of Hebrew letters are not acceptable for historical and religious reasons and are substituted by their equivalents.) For detailed info, please see:
- http://en.wikipedia.org/wiki/Hebrew_numerals
- http://www.i18nguy.com/unicode/hebrew-numbers.html
- http://smontagu.org/writings/HebrewNumbers.html
The most common (but not the only) use case is the day of the month or the year in a date stamp; a Hebrew date picker, for instance, will use כ"א instead of 21 for the day of the month, and תשע"ו instead of 2015 for the year.
The proposal is to provide functionality for the algorithmic transformation of regular numbers to their Hebrew counterparts and vice versa. Please note that this functionality has no dependencies on CLDR.
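As a rough illustration of the kind of algorithmic conversion proposed here, a minimal sketch for the range 1 to 999 might look like the following. The function name `toHebrewNumeral` is hypothetical and not part of Globalize. The sketch covers one direction only, ignores thousands, and uses the ASCII double quote as the separator, as in the examples above. The special-casing of 15 and 16 is an instance of the substitution rule mentioned above.

```javascript
// Letter values: units 1-9, tens 10-90, hundreds 100-400.
var UNITS = ["", "א", "ב", "ג", "ד", "ה", "ו", "ז", "ח", "ט"];
var TENS = ["", "י", "כ", "ל", "מ", "נ", "ס", "ע", "פ", "צ"];
var HUNDREDS = ["", "ק", "ר", "ש", "ת"];

// Hypothetical helper (not a Globalize API): integer 1..999 -> Hebrew numeral.
function toHebrewNumeral(n) {
  if (!Number.isInteger(n) || n < 1 || n > 999) {
    throw new RangeError("Only integers from 1 to 999 are handled in this sketch");
  }
  var letters = "";
  var hundreds = Math.floor(n / 100);
  var rest = n % 100;

  // 500..900 are written as 400 (ת) plus the remaining hundreds.
  while (hundreds > 4) {
    letters += HUNDREDS[4];
    hundreds -= 4;
  }
  letters += HUNDREDS[hundreds];

  if (rest === 15 || rest === 16) {
    // 15 and 16 are written as 9+6 and 9+7 instead of 10+5 and 10+6.
    letters += UNITS[9] + UNITS[rest - 9];
  } else {
    letters += TENS[Math.floor(rest / 10)] + UNITS[rest % 10];
  }

  // Multi-letter numerals get a quote mark before the last letter.
  if (letters.length > 1) {
    letters = letters.slice(0, -1) + '"' + letters.slice(-1);
  }
  return letters;
}

console.log(toHebrewNumeral(21));
console.log(toHebrewNumeral(776));
```

The reverse conversion would sum the letter values; it is omitted here to keep the sketch short.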
I would consider either:
- Utilization of existing 'numberFormatter' function from 'Number module' by adding 'option' value 'Hebrew' (for converting regular [Arabic] numbers to Hebrew) and 'option' value 'Regular' (or whatever) for opposite conversion. or
- Creating brand new function for this purpose.
What do you think? Which approach would be more appropriate?
@rxaviers any comments ? This is part of the effort for supporting non Gregorian calendars in general and in date pickers in particular.
Yes, supporting RBNF is very welcome. Please, make sure that http://www.unicode.org/reports/tr35/tr35-numbers.html#Rule-Based_Number_Formatting is followed.
> I would consider either:
> - Utilization of existing 'numberFormatter' function from 'Number module' by adding 'option' value 'Hebrew' (for converting regular [Arabic] numbers to Hebrew) and 'option' value 'Regular' (or whatever) for opposite conversion. or
> - Creating brand new function for this purpose.
Re-using `numberFormatter` seems the best option to me. About the 'Hebrew' option, this is basically called a numbering system. Take a look at how the number formatter figures out which numbering system to use here: https://github.com/jquery/globalize/blob/master/src/number/numbering-system.js.
Note that for Hebrew, the default numbering system is Latin, which means Globalize would use 1, 2, 3, ... for `he`. But the traditional numbering system is Hebrew, which means Globalize would use Hebrew digits for `he-u-nu-traditional`. See https://github.com/unicode-cldr/cldr-numbers-full/blob/master/main/he/numbers.json#L12-L16.
For clarity, see the `zh-u-nu-native` example at https://github.com/jquery/globalize/blob/master/doc/api/number/number-formatter.md#instance-formatter.
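The resolution described above can be sketched roughly as follows. This is not the actual code in numbering-system.js; `resolveNumberingSystem` and the trimmed `heNumbers` object are hypothetical stand-ins, with the data values taken from the `he` numbers excerpt linked above.

```javascript
// Trimmed sample of the "numbers" data for "he" (only the fields used here).
var heNumbers = {
  defaultNumberingSystem: "latn",
  otherNumberingSystems: { native: "latn", traditional: "hebr" }
};

// Hypothetical helper: `nu` is the value of the -u-nu- locale extension, if any.
function resolveNumberingSystem(nu, numbers) {
  if (!nu) {
    // No extension given: use the locale's default numbering system.
    return numbers.defaultNumberingSystem;
  }
  // Keywords such as "native" or "traditional" are resolved per locale.
  if (Object.prototype.hasOwnProperty.call(numbers.otherNumberingSystems, nu)) {
    return numbers.otherNumberingSystems[nu];
  }
  // Otherwise assume an explicit numbering system id was given.
  return nu;
}

console.log(resolveNumberingSystem(undefined, heNumbers));     // latn
console.log(resolveNumberingSystem("traditional", heNumbers)); // hebr
```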
Thank you, Rafael, for the detailed explanation. I eventually started on the task and immediately ran into a problem on which I would like to consult you. There is a general problem that is illustrated, in particular, by the Hebrew locale. The CLDR data for "he" contains the following:
"numbers": {
12 "defaultNumberingSystem": "latn",
13 "otherNumberingSystems": {
14 "native": "latn",
15 "traditional": "hebr"
16 },
18 "symbols-numberSystem-latn": {
When I initialize Globalize( "he-u-nu-traditional" ) and then call anything that uses 'numberSymbol', an exception is thrown in number.js:
var numberSymbol = function( name, cldr ) {
return cldr.main([
"numbers/symbols-numberSystem-" + numberNumberingSystem( cldr ),
The problem is that 'numberNumberingSystem( cldr )' resolves 'traditional' to 'hebr', so the whole path ends up as the non-existent 'symbols-numberSystem-hebr', since only the Latin numbering system (symbols-numberSystem-latn) exists for Hebrew. The exception itself is thrown in cldr.js, where the corresponding bundle fails to be found. This rather general (not Hebrew-specific) problem may be resolved:
- either under the current pull request, inside Globalize\number.js, where on failure we may fall back to defaultNumberingSystem; but, in my humble opinion, this seems to be the wrong place.
- or in cldr.js, in
var resourceGet = function( data, path )
.......................
node = node[ path[ i ] ];
if ( !node ) {
return undefined;
where, if we hit the last line, the fallback to "defaultNumberingSystem" may be done by fetching it from data.main.he.numbers.defaultNumberingSystem; although such an intrusion into a general-purpose API seems inappropriate to me as well.
What do you think?
@ashensis when we initialize `Globalize( "he-u-nu-traditional" )`, it tries to use the Hebrew numbering system (defined by CLDR on line 15 of your code excerpt, `"traditional": "hebr"`). It happens that the Hebrew numbering system is algorithmic and should be provided by RBNF. Therefore:
- I believe we should have an appropriate error message in such cases, saying something like `Algorithmic numbering system ${nu} not supported`. This will require us to hard code a couple of nu strings, but that should be trivial and not too big given http://unicode.org/repos/cldr/trunk/common/bcp47/number.xml.
- I don't think we should fall back to the default numbering system, because I think the error message better represents what the user has requested (a different nu than the default one, but it's not supported).
- I understood the scope of this issue is to provide RBNF support, please correct me if I'm wrong.
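A minimal sketch of the error suggested in the first point, assuming a hard-coded list: the names `ALGORITHMIC_NU` and `assertSupportedNumberingSystem` are hypothetical, and the list is only an illustrative subset of the algorithmic numbering systems, not a complete one.

```javascript
// Illustrative subset only; a real list would be derived from CLDR.
var ALGORITHMIC_NU = ["armn", "geor", "grek", "hebr", "roman", "romanlow"];

// Hypothetical guard: reject numbering systems that would require RBNF.
function assertSupportedNumberingSystem(nu) {
  if (ALGORITHMIC_NU.indexOf(nu) !== -1) {
    throw new Error("Algorithmic numbering system " + nu + " not supported");
  }
  return nu;
}

try {
  assertSupportedNumberingSystem("hebr");
} catch (error) {
  console.log(error.message); // Algorithmic numbering system hebr not supported
}
```

Failing loudly here, rather than silently falling back to the default, keeps the output consistent with what the caller asked for.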
Yes, Rafael, the scope of this issue is to provide RBNF support, and I am close to finalizing the code. However, the problem with the numbering system has some bearing on it and, to my humble understanding, it is critical. Let me clarify the issue.
1. It is legal to initialize Globalize( "he-u-nu-traditional").
2. To my understanding, the CLDR part of initialization (or its excerpt) provided by https://github.com/unicode-cldr/cldr-numbers-full/blob/master/main/XX... can't be arbitrarily altered; i.e., it should be used just as it is stored in the above-mentioned location. In other words, if, for Hebrew, CLDR contains only the Latin numbering system "symbols-numberSystem-latn", one can't change it to "symbols-numberSystem-hebr" in one's own code when initializing Globalize.
3. If my understanding in #2 is correct, we have a real problem and it isn't sufficient to throw an error: how is one supposed to initialize Globalize the way described in #1? The sheer discrepancy between the CLDR entries "traditional": "hebr" and "symbols-numberSystem-latn" will preclude such usage unless the problem is dealt with in code.
Maybe my assumption in #2 isn't correct, or am I missing some point?
I am not sure if it was clear enough, so wanted to emphasize it explicitly. The problem @ashensis described above is not functionally blocking for RBNF implementation. The only thing that this implementation needs from Globalize( "he-u-nu-traditional") initialization is "he" + "traditional". This information is obviously available.
The problem is mostly internal to Globalize( "he-u-nu-traditional") and the only correlation it has with RBNF implementation is associated with testing. For testing Hebrew RBNF we need to call Globalize( "he-u-nu-traditional"). This call currently fails for the reasons @ashensis mentioned above.
Thanks for the clarification. I think I understand the problem better now, but what I've said is still valid, which I'll try to make more detailed/clear below.
I assume both `he` and `he-u-nu-traditional` are being initialized similarly to the code below.
var Globalize = require('globalize');
// Load likelySubtags content, which is the only CLDR requirement for core (e.g., creating instances).
Globalize.load(require('cldr-data/supplemental/likelySubtags'));
// Load numberingSystem and number data, which are CLDR requirement for number.
Globalize.load(require('cldr-data/supplemental/numberingSystems'));
Globalize.load(require('cldr-data/main/he/numbers'));
var he = Globalize('he');
// > Globalize {
// cldr:
// { ee: EventEmitter { _events: [Object] },
// off: [Function],
// on: [Function],
// once: [Function],
// attributes:
// { bundle: 'he',
// minlanguageId: 'he',
// maxLanguageId: 'he-Hebr-IL',
// language: 'he',
// script: 'Hebr',
// territory: 'IL',
// region: 'IL',
// variant: undefined },
// locale: 'he',
// supplemental: { [Function] weekData: [Object], timeData: [Object] } } }
var heTraditional = Globalize('he-u-nu-traditional');
// > Globalize {
// cldr:
// { ee: EventEmitter { _events: [Object] },
// off: [Function],
// on: [Function],
// once: [Function],
// attributes:
// { bundle: 'he',
// minlanguageId: 'he',
// maxLanguageId: 'he-Hebr-IL',
// language: 'he',
// script: 'Hebr',
// territory: 'IL',
// region: 'IL',
// variant: undefined,
// 'u-nu': 'traditio' },
// locale: 'he-u-nu-traditional',
// supplemental: { [Function] weekData: [Object], timeData: [Object] } } }
The `he` instance uses the default Latin numbering system, which works just fine using today's Globalize code. The `he-u-nu-traditional` instance uses the Hebrew numbering system and it's going to use the new RBNF implementation. But it happens that part of this new implementation obviously needs access to the Hebrew numeric symbols (retrieved internally by the `numberSymbol` function). Although this information exists for the Latin numbering system (i.e., `main/he/numbers/symbols-numberSystem-latn` is available), it doesn't exist for the Hebrew one (i.e., there's no `main/he/numbers/symbols-numberSystem-hebr`). Should the implementation fall back and use the symbols from Latin? No.
In UTS#35 Part 3 (Numbers), section 2.3 Number Symbols, it says:
Number symbols can only be defined for numbering systems of the "numeric" type, since any special symbols required for an algorithmic numbering system should be specified by the RBNF formatting rules used for that numbering system.
I assume it's clear that the answer to your question lies in the RBNF implementation. Please just let me know if you have any questions.
Having said that, as far as I know CLDR doesn't provide RBNF data in the JSON format and that is going to be a problem.
> Having said that, as far as I know CLDR doesn't provide RBNF data in the JSON format and that is going to be a problem.
@JCEmmons, @srl295 gentlemen, do you have any information about it please? Thanks
We have had a request for it from another party, and it's on my "to-do" list. You can track via http://unicode.org/cldr/trac/ticket/8772. I'm sure it will take some additional work to the JSON conversion utilities, but I'm hoping I can get this done for CLDR 29.
Thank you!
Forgive me for bugging you more on this subject, but I have to get my head around what you said: UTS#35 Part 3. Numbers, section 2.3 Number Symbols, Number symbols can only be defined for numbering systems of the "numeric" type, since any special etc.
This sounds highly reasonable, but still, to my humble understanding, the current Globalize code will have to undergo some changes in order to cope with this problem, even after RBNF JSON becomes available. To be more precise, in the case of algorithmic numbering, 'symbols-numberSystem' shouldn't be looked up at all, since it isn't supposed to be used. Thus the current code has to be altered in this respect. Is my understanding correct?
No problem, feel free to add as many comments as you want. :)
Yes, for algorithmic numbering systems, that data shouldn't be looked up at all; and yes, it's possible some code in the number module may need to change.
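One way such a change could look, as a sketch only: check the numbering system's type before building the symbols path. The helper `symbolsFor` is hypothetical, and the two inline objects are trimmed samples assumed to mirror the shape of the CLDR JSON (supplemental numbering systems and the `he` numbers data).

```javascript
// Trimmed sample assumed to mirror supplemental/numberingSystems.
var numberingSystems = {
  latn: { _digits: "0123456789", _type: "numeric" },
  hebr: { _rules: "hebrew", _type: "algorithmic" }
};

// Trimmed sample of the "he" numbers data: only Latin symbols exist.
var heNumbers = {
  "symbols-numberSystem-latn": { decimal: ".", group: "," }
};

// Hypothetical helper: return the symbols for a numeric system, or null
// for an algorithmic one, whose symbols come from its RBNF rules.
function symbolsFor(nu, systems, numbers) {
  var system = systems[nu];
  if (!system) {
    throw new Error("Unknown numbering system " + nu);
  }
  if (system._type === "algorithmic") {
    return null; // never touch symbols-numberSystem-<nu> here
  }
  return numbers["symbols-numberSystem-" + nu];
}

console.log(symbolsFor("latn", numberingSystems, heNumbers).decimal); // .
console.log(symbolsFor("hebr", numberingSystems, heNumbers));         // null
```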
RBNF in the JSON format: https://github.com/unicode-cldr/cldr-rbnf :tada: