unicode-math
setminus and smallsetminus
unicode-math defines the following mappings:
0x2216 -> \smallsetminus
0x29F5 -> \setminus
This appears a bit strange to me. Unicode defines 2216 as "set minus", so shouldn't 2216 map to \setminus? This is pointed out in Will's TUGboat article, but I do not understand the rationale behind the choice made by unicode-math.
That might well be one of the errors in the STIX table that Hans recently mentioned. U+29F5 is called the reverse solidus operator in Unicode. Many fonts (but not Cambria Math) use a more slanted version for U+2216, that could be the reason why the STIX team chose U+29F5. Windows 7's math input panel recognizes a set difference as U+2216, too. So both the standard and the reference implementation say that you're right, and if Will has no objections, I'll rename the definition.
I do have objections!
The unicode-math definitions come from the STIX table and match up with the STIX fonts (see below), which are the reference implementation for unicode math fonts. I've already noted in the documentation the inconsistency with 2216 ≠ set minus (according to the traditional TeX terminology) but there's nothing we can do about it unless STIX changes their mind.
If the other unicode math implementations (being only Cambria Math + MS Word at present) do not change then we're in a bit of a pickle and we might need to add an option to the package for this case. (set-minus=TeX or set-minus=Unicode, I guess.) Given that we cannot change the names due to compatibility with TeX, do you think we need this option?
\documentclass{article}
\usepackage{ifxetex}
\ifxetex
  \usepackage{unicode-math}
  \setmathfont{XITS Math}
\else
  \usepackage{amssymb}
\fi
\begin{document}
$a\setminus b$ $a\smallsetminus b$
\end{document}
I don't think we should introduce an option for this (and, by extension, for every questionable character). Probably we should implement ConTeXt-like "typescripts" or at least individual packages for specific fonts: stix.sty then should contain a fix that sets \setminus = 29F5. \setminus = 2216 should still be the default because it is prescribed by Unicode and MathML and all other fonts follow the standards here. Since we need lots of font-specific patches anyway (think DisplayOperatorMinHeight of Cambria), it is probably not a huge burden to write a specialized package for each font.
Yeah, I guess in the future I'm expecting that we'll have in-built defaults for loading certain fonts. (I promised Karl I'd do this for fontspec by TL2011; hopefully that will still be possible even though I haven't looked at the code for months…) I don't think we'll have too many of these questionable characters — I don't mind having the switch in unicode-math rather than in the font loading.
I'm sort of burnt out trying to communicate with whoever is responsible for this stuff. If we can come up with a solid argument that the STIX design is flat-out wrong, then we should get it signed by Hans et al. and forward it on to everyone concerned.
ConTeXt patches a few fonts before loading them. See $TEXMF/tex/context/fonts. Most of the files are simply creating virtual Unicode math fonts; only a few of them also patch the font.
In the long term, it is better to fix such "bugs" upstream, so that the fonts are usable in other programs (like Word) as well. Perhaps it might be easier to ask Khaled to patch XITS?
@khaledhosny please have a look at this. (But ultimately it has to be fixed at the STIX side)
I agree that font bugs should be fixed upstream, but it can take a long time until e.g. Microsoft fixes Cambria Math, so I expect that we have to do a lot of patching anyway provided the fixes are simple enough. I think that for now, I'll set \setminus = U+2216 (Unicode is always right) and write a small xits-math package to swap them back. OK?
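For illustration only, a font-specific fix of this kind could be as small as the following sketch. The file name xits-math.sty is hypothetical (the thread only floats the idea of such a package), and wrapping the redefinitions in \AtBeginDocument is an assumption, made so that they run after unicode-math has set up its own symbol commands.

```latex
% xits-math.sty (hypothetical name): swap the two glyphs back for fonts
% that follow the STIX glyph assignment.
% Requires XeLaTeX or LuaLaTeX with unicode-math.
\NeedsTeXFormat{LaTeX2e}
\ProvidesPackage{xits-math}
\RequirePackage{unicode-math}
\AtBeginDocument{%
  % \setminus -> U+29F5 REVERSE SOLIDUS OPERATOR, spaced as a binary operator
  \renewcommand*\setminus{\mathbin{^^^^29f5}}%
  % \smallsetminus -> U+2216 SET MINUS, spaced as a binary operator
  \renewcommand*\smallsetminus{\mathbin{^^^^2216}}%
}
```

The ^^^^ notation simply inputs the character with the given (lowercase) hexadecimal code point, so the sketch does not depend on any particular command name for either character.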
Please note this isn't really a bug in STIX, it's behaviour by design. I'll try and contact Barbara and see what her thoughts are.
"Fixing" XITS is not hard, just give me a concrete bug report (once it is identified as bug) and I'll fix it.
It is not a bug, just a design issue:
- Unicode unmistakably states that U+2216 is the character that denotes set difference
- U+29F5 is the reverse solidus operator
- U+2216 is older and available in more fonts than U+29F5
- there is no small set minus character in Unicode
- in traditional TeX, \setminus is more vertical than \smallsetminus
- in all fonts that have U+29F5, its glyph is either identical to (Cambria) or more vertical than (your fonts, Apple’s fonts) the glyph for U+2216
- I’d prefer to have \setminus = U+2216 because Unicode and MathML say so and I think standards compliance is more important than sticking to LaTeX’s somewhat idiosyncratic conventions
- but then LaTeX users could be surprised because this would essentially swap the glyphs for \setminus and \smallsetminus
- overall I think Unicode and MathML should always overrule every other consideration unless there are very good reasons against
- since there is no Unicode character “small set minus” or “small reverse solidus operator”, I don’t really know what to do with the existing glyph for U+2216—maybe it should even be moved to the PUA?
BTW, the STIXVar.otf font has a glyph for U+2216 that is identical to U+29F5, so I guess they were aware of the issue. Dropping a line to Barbara Beeton asking about the idea behind the current glyph selection wouldn't hurt.
We've just bumped up against this incompatibility during development of the STIX Two fonts: https://github.com/stipub/stixfonts/issues/179. (I'm not sure what took us so long.)
Would anyone from here like to comment on that proposal?
Another unexpected interaction I found (clearly a font issue) is with the New CM fonts, where both characters are actually the small set minus, so I have to redefine \setminus back to the normal definition of \mathbin{\backslash}.
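A minimal sketch of that workaround follows. It assumes the redefinition is placed in \AtBeginDocument so that it takes effect after unicode-math has defined its symbols; the font file name NewCMMath-Book.otf is likewise an assumption about how the New CM math font is installed and may need adjusting.

```latex
% Compile with xelatex or lualatex.
\documentclass{article}
\usepackage{unicode-math}
% Assumption: file name of the New CM math font on this system.
\setmathfont{NewCMMath-Book.otf}
% Fall back to the backslash glyph with binary-operator spacing.
\AtBeginDocument{\renewcommand*\setminus{\mathbin{\backslash}}}
\begin{document}
$A \setminus B$ versus $A \smallsetminus B$
\end{document}
```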
It is not a bug, just a design issue:
- Unicode unmistakably states that U+2216 is the character that denotes set difference
- U+29F5 is the reverse solidus operator
- U+2216 is older and available in more fonts than U+29F5
- there is no small set minus character in Unicode
Yes and no. Unicode doesn't specify a "small" set minus character, instead it follows TeX (in a way) by standardizing the reverse solidus operator (that TeX was using as a set minus character) plus a set minus operator that may or may not be "small". Consequently, HTML &setminus refers to U+29F5 and ∖ refers to U+2216.
So, arguably, U+2216 is the "small" set minus, or "semantic" set minus.
- in traditional TeX, \setminus is more vertical than \smallsetminus
In fact, in plain TeX it is the reverse solidus operator with additional spacing! Just as the Unicode name suggests.
- in all fonts that have U+29F5, its glyph is either identical to (Cambria) or more vertical than (your fonts, Apple’s fonts) the glyph for U+2216
- I’d prefer to have \setminus = U+2216 because Unicode and MathML say so and I think standards compliance is more important than sticking to LaTeX’s somewhat idiosyncratic conventions
- but then LaTeX users could be surprised because this would essentially swap the glyphs for \setminus and \smallsetminus
Indeed, the LaTeX names \setminus and \smallsetminus are... unfortunate. But documents shouldn't break that badly when switching to unicode-math, IMHO. However, an option could be provided to
- have \setminus emit REVERSE SOLIDUS OPERATOR (U+29F5) and \smallsetminus emit SET MINUS (U+2216) "as is",
- have \setminus emit U+2216 and add an additional command for REVERSE SOLIDUS OPERATOR (what to do with \smallsetminus?),
- have both emit REVERSE SOLIDUS OPERATOR, or
- have both emit SET MINUS.
The option key could be setminus and the values latex, unicode, backslash and setminus.
- overall I think Unicode and MathML should always overrule every other consideration unless there are very good reasons against
- since there is no Unicode character “small set minus” or “small reverse solidus operator”, I don’t really know what to do with the existing glyph for U+2216—maybe it should even be moved to the PUA?
The existing glyph is the small set minus in virtually any font, as can be observed when comparing it to $\mathbin{\backslash}$ (i.e., the REVERSE SOLIDUS). It's not necessarily really "small" but definitely differently slanted.
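That comparison can be run with a small test document along the following lines. This is only a sketch: XITS Math is used because it appears earlier in the thread, and any other OpenType math font can be substituted to check its glyphs.

```latex
% Compile with xelatex or lualatex.
\documentclass{article}
\usepackage{unicode-math}
\setmathfont{XITS Math}% swap in any OpenType math font to test
\begin{document}
\begin{tabular}{ll}
  U+2216 SET MINUS                & $A ^^^^2216 B$ \\
  U+29F5 REVERSE SOLIDUS OPERATOR & $A ^^^^29f5 B$ \\
  \verb|\mathbin{\backslash}|     & $A \mathbin{\backslash} B$ \\
\end{tabular}
\end{document}
```

Inputting the characters by code point rather than by command name keeps the test independent of how \setminus and \smallsetminus happen to be mapped in the installed unicode-math version.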
I think the best way forward would be to keep the unicode-math behavior as-is, adding the above options. Further, the GUST team should probably add a REVERSE SOLIDUS OPERATOR glyph identical to the REVERSE SOLIDUS. That way, missing characters would be gone and all possible behaviors would be covered.
Thanks for the decades of commentary here. I'm sorry that I dug my heels in so much about this -- in hindsight, it would have been better to make the change originally proposed to correspond to Unicode as much as possible.
My solution to this issue is going to be exactly that -- use "2216 for \setminus, fake \smallsetminus, and anyone who needs something different can use some one-line redefinitions (such as \renewcommand\setminus{\reversesolidus}).
I know it's generally preferred to use GitHub reactions to express gratitude, but they easily get lost so I want to take the short time to say thank you for coming around with a (simple) solution—as well as all the other maintenance work.
I'm looking forward to deleting a lot of code from my personal class to make my documents compile w/o missing symbols 0:-)
Thanks again and have a great day!