Unicode multiplication sign not a binary operator
The symbol × is not spaced as a binary operator: ${a × b} \neq {a \mathbin{×} b}$. The workaround I'm using is to write a \mathbin{×} b in place of a × b.
This isn't a huge problem, but it's not ideal either. I've run across other symbols that should clearly be mathbin or mathrel but whose Unicode characters MathJax does not class correctly.
In the meantime, can someone help with these questions:
- How to tell MathJax to always use a certain math class for a symbol?
- Is there a way to determine which class MathJax is using for a given symbol?
MathJax doesn't go out of its way to handle Unicode input, so this is one of those cases where things don't work as one would like. For characters that don't have a specific meaning in MathJax (as ^ and _ do), two places determine which TeX class is used: the RANGES list at
https://github.com/mathjax/MathJax-src/blob/8565f9da973238e4c9571a86a4bcb281b1d98d9b/ts/core/MmlTree/OperatorDictionary.ts#L83
and the operator dictionary (OPTABLE) starting at
https://github.com/mathjax/MathJax-src/blob/8565f9da973238e4c9571a86a4bcb281b1d98d9b/ts/core/MmlTree/OperatorDictionary.ts#L171
The RANGES list breaks the Unicode codepoints into groups of characters, and assigns each group a TeX class and the MathML element type to use for its characters. This is a rough grouping rather than a perfect mapping of characters to classes, and is meant to get most things into an acceptable form. It allows, for example, Latin and Greek letters to be put into mi elements, while arrows are put into mo elements. Because these are broad groups, not every character in a range is necessarily typed correctly.
In your case, the times symbol is U+00D7, and that ends up falling within
[0x00C0, 0x024F, TEXCLASS.ORD, 'mi'], // Latin-1 Supplement, Latin Extended-A, Latin Extended-B
which is mostly ranges of Latin letters, but does include a few symbols, like U+00D7 and U+00F7, which are not separated out. This means these two will get class ORD and be placed in an mi, as though they were letters like all the rest of that block. That is why you are getting the wrong spacing.
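To make the lookup concrete, here is a simplified model of a range table in plain JavaScript. It is not MathJax's actual code: only the Latin row is copied from above, the other two rows and the fallback are made up for illustration, and modelGetRange and MODEL_RANGES are hypothetical names. String labels stand in for the TEXCLASS constants.

```javascript
// Simplified model of a range table: [first, last, TeX class, MathML node type].
// Only the Latin row comes from the discussion above; the others are
// hypothetical rows added for illustration.
const MODEL_RANGES = [
  [0x0030, 0x0039, 'ORD', 'mn'],   // hypothetical row: ASCII digits
  [0x00C0, 0x024F, 'ORD', 'mi'],   // Latin-1 Supplement, Latin Extended-A/B
  [0x2190, 0x21FF, 'REL', 'mo'],   // hypothetical row: arrows
];

// Return the first row whose codepoint interval contains the character.
function modelGetRange(text) {
  const n = text.codePointAt(0);
  for (const range of MODEL_RANGES) {
    if (n >= range[0] && n <= range[1]) return range;
  }
  return [n, n, 'ORD', 'mi'];      // fallback chosen for this sketch only
}

const timesRow = modelGetRange('\u00D7');
console.log(timesRow[2], timesRow[3]);   // prints: ORD mi
```

In this model the multiplication sign is indistinguishable from a letter such as À, which is the same situation described above for the real table.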
It would be possible to refine the RANGES list to handle these two characters better, but there is another approach that can be used, which relies on the operator dictionary. The operator dictionary is how MathML decides what the spacing should be around the various operators in mo elements. MathJax augments this to include what TeX class to use for each entry in the table. It turns out that there are entries for both U+00D7 and U+00F7, but because the operator table only applies to mo elements, and the RANGES table puts these two characters into mi instead, the table never gets used for them.
So an alternative to adjusting the RANGES table is to have the getRange() function first check whether the character being looked up is in the OPTABLE. If it is, getRange() returns the proper TeX class and indicates that an mo is needed; otherwise it looks through the RANGES table for the value to use. That way, anything for which MathJax already has better data via the OPTABLE will automatically be placed in an mo, even if RANGES specifies some other node type.
A MathJax configuration for v3 that does that is the following:
    MathJax = {
      startup: {
        ready() {
          const OperatorDictionary = MathJax._.core.MmlTree.OperatorDictionary;
          const {getRange, OPTABLE} = OperatorDictionary;
          OperatorDictionary.getRange = function (text) {
            const def = OPTABLE.infix[text] || OPTABLE.prefix[text] || OPTABLE.postfix[text];
            return (def ? [0, 0, def[2], 'mo'] : getRange(text));
          }
          MathJax.startup.defaultReady();
        }
      }
    }
This will be added in v4 (I am making a PR for it), but the code above will not work with v4. For v4 (now out in beta), you could use
    MathJax = {
      startup: {
        ready() {
          const {RANGES} = MathJax._.core.MmlTree.OperatorDictionary;
          const {TEXCLASS} = MathJax._.core.MmlTree.MmlNode;
          RANGES.splice(
            2, 1,
            [0x00C0, 0x00D6, TEXCLASS.ORD, 'mi'],
            [0x00D7, 0x00D7, TEXCLASS.BIN, 'mo'],
            [0x00D8, 0x024F, TEXCLASS.ORD, 'mi']
          );
          MathJax.startup.defaultReady();
        }
      }
    }
This special-cases U+00D7 as an mo, which in turn causes the operator dictionary values to be used. U+00F7 could be handled similarly, if needed.
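Handling U+00F7 as well means splitting the Latin row a second time, so the splice indices have to be worked out again. One way to avoid doing that by hand is a small helper that finds whichever row contains a codepoint and carves it out. The sketch below is an illustration only: carveOut is a hypothetical helper, not part of MathJax, the array is a stand-in for RANGES, the strings 'ORD' and 'BIN' stand in for TEXCLASS.ORD and TEXCLASS.BIN, and treating the division sign as BIN is an assumption.

```javascript
// Hypothetical helper: carve a single codepoint out of whichever row contains it.
// Rows have the shape [first, last, texClass, nodeType].
function carveOut(ranges, codepoint, texClass, nodeType) {
  const i = ranges.findIndex(([lo, hi]) => codepoint >= lo && codepoint <= hi);
  if (i < 0) return false;                 // no row covers this codepoint
  const [lo, hi, cls, node] = ranges[i];
  const pieces = [];
  if (lo < codepoint) pieces.push([lo, codepoint - 1, cls, node]);
  pieces.push([codepoint, codepoint, texClass, nodeType]);
  if (codepoint < hi) pieces.push([codepoint + 1, hi, cls, node]);
  ranges.splice(i, 1, ...pieces);          // replace the row in place
  return true;
}

// Stand-in for the Latin row of the real table.
const rows = [[0x00C0, 0x024F, 'ORD', 'mi']];
carveOut(rows, 0x00D7, 'BIN', 'mo');       // multiplication sign
carveOut(rows, 0x00F7, 'BIN', 'mo');       // division sign (BIN is an assumption)
console.log(rows.map(([lo, hi, cls, node]) =>
  `${lo.toString(16)}-${hi.toString(16)} ${cls} ${node}`));
```

The stand-in array ends up with five rows: c0-d6, d7-d7, d8-f6, f7-f7 and f8-24f. In a real v4 configuration the same calls would presumably be made on RANGES with TEXCLASS.BIN inside ready(), before MathJax.startup.defaultReady(), as in the configuration above. I have not tested that against the beta.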
Awesome, thanks for the explanation. And thanks for including the v4 workaround as well! (I just started using it a couple of days ago.)