_* parse bug

Open pzinn opened this issue 2 years ago • 8 comments

\:_* produces the bizarre error message: KaTeX parse error: Got group of unknown type: 'internal'. The same happens with \, or \!. In contrast, LaTeX doesn't mind.

pzinn avatar Mar 09 '22 11:03 pzinn

Well, the _ symbol indicates a subscript, of course. The KaTeX "supsub" builder is currently seeing an un-expanded macro as its base. Hence the error message. Digging deeper, \; relies on a \tmspace macro, which expands to a \TextOrMath{} macro to determine what value to use for the spacing.

A quick fix would be to wrap the macro for \; in braces. I.e. rewrite: \\tmspace+{5mu}{.2777em} to read {\\tmspace+{5mu}{.2777em}}. And similar wrapping for the other spacing macros.
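
To make the proposed brace wrapping concrete, here is a minimal sketch. The helper name wrapInBraces is hypothetical, and only the \; body is quoted in this thread; the bodies shown for \, \: and \! are assumed to follow the same \tmspace pattern and should be checked against the KaTeX source before use.

```javascript
// Hypothetical helper: wrap a macro body in braces so that it expands
// inside its own group.
function wrapInBraces(body) {
  return "{" + body + "}";
}

// Spacing macro bodies. Only the \; entry is quoted in this thread;
// the other three are assumptions that follow the same pattern.
const spacingMacros = {
  "\\,": "\\tmspace+{3mu}{.1667em}",
  "\\:": "\\tmspace+{4mu}{.2222em}",
  "\\;": "\\tmspace+{5mu}{.2777em}",
  "\\!": "\\tmspace-{3mu}{.1667em}",
};

// Build a new table in which every body is wrapped in braces.
const wrappedMacros = Object.fromEntries(
  Object.entries(spacingMacros).map(([name, body]) => [name, wrapInBraces(body)])
);

console.log(wrappedMacros["\\;"]); // {\tmspace+{5mu}{.2777em}}
```

A table like this could presumably be passed through KaTeX's macros rendering option to try the idea without patching the library, though that is untested here.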

@edemaine Do you see any drawback to that approach?

ronkok avatar Mar 09 '22 19:03 ronkok

I agree that'd be the quick fix, but I'm a little worried because amsmath.sty doesn't do that:

\renewcommand{\,}{\tmspace+\thinmuskip{.1667em}}
\renewcommand{\!}{\tmspace-\thinmuskip{.1667em}}
\renewcommand{\:}{\tmspace+\medmuskip{.2222em}}

I think a more proper solution would be to allow internal nodes, which I assume is what \TextOrMath produces, as the base for a subscript. Or is this difficult for some reason? (What does _ need to know about its base?)

Even better would be to make \TextOrMath not produce its own parse nodes. This would rely on \ifmmode support via #3385. This got reviewed but is awaiting @ylemkimon's revision I think.

edemaine avatar Mar 09 '22 19:03 edemaine

The error occurs when supsub calls a buildGroup() on the base, because there is no group builder for an internal ParseType, aka an unexpanded macro. I tried to revise buildGroup in a quick but adequate way, but failed. I think you are correct; we need something more fundamental that ensures that macros are expanded before the supsub builder gets to see them.
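
The failure mode described here can be modeled with a toy dispatch table. This is a sketch under stated assumptions, not KaTeX's actual code: toyGroupBuilders, toyBuildGroup, and the node shapes are illustrative names, and the output markup is made up.

```javascript
// Toy registry of builders keyed by parse-node type. There is deliberately
// no entry for "internal", mirroring the situation described above.
const toyGroupBuilders = {
  mathord: (node) => `<mi>${node.text}</mi>`,
  supsub: (node) =>
    `${toyBuildGroup(node.base)}<sub>${toyBuildGroup(node.sub)}</sub>`,
};

// Look up the builder for a node and fail loudly when none is registered.
function toyBuildGroup(node) {
  const builder = toyGroupBuilders[node.type];
  if (!builder) {
    throw new Error(`Got group of unknown type: '${node.type}'`);
  }
  return builder(node);
}

// A subscript whose base is an ordinary symbol builds fine.
const okNode = {
  type: "supsub",
  base: { type: "mathord", text: "a" },
  sub: { type: "mathord", text: "x" },
};
console.log(toyBuildGroup(okNode)); // <mi>a</mi><sub><mi>x</mi></sub>

// A subscript whose base is an internal node hits the missing builder.
const badNode = {
  type: "supsub",
  base: { type: "internal" },
  sub: { type: "mathord", text: "*" },
};
let toyErrorMessage = null;
try {
  toyBuildGroup(badNode);
} catch (err) {
  toyErrorMessage = err.message;
}
console.log(toyErrorMessage); // Got group of unknown type: 'internal'
```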

So the two options on the table are to wait for #3385 or apply a (temporary?) quick fix with braces.

ronkok avatar Mar 09 '22 20:03 ronkok

Actually, I misspoke. I forgot that we already made \TextOrMath work correctly as a macro. \ifmmode isn't relevant.

The issue seems to be the \relax in the definition of \tmspace. \kern3mu\relax_* fails, while \kern3mu_* and \TextOrMath{\kern3mu}{\kern.1667em}_* work fine.

So now I think this issue is a side effect of #3384 (see also #2138 which introduced internal nodes). I wonder if \relax is supposed to build a "null" group, which gets a subscript attached... or if the internal node should get discarded before subscript building.

Answer: it's the latter. For example, in LaTeX, {a \over b}\relax_x renders as [image], while {a \over b}{}_x renders as [image].

edemaine avatar Mar 09 '22 20:03 edemaine

I think at some level the issue is in the logic of Parser's parseAtom. It treats \relax as the beginning of a new atom with its own subscripts etc., but it shouldn't. A simple workaround would be to detect \relax in parseAtom and skip it there, but it'd be nicer to skip all internal nodes (as created e.g. by \def). I'm not quite sure how to code the latter, because we won't know about it until a second parseAtom call... @ylemkimon do you have ideas?
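
One way to picture "skip all internal nodes" is the toy model below. It is only a sketch and does not reflect how KaTeX's Parser is structured: attachScripts, the flat item list, and the node shapes are all hypothetical. It works on pieces that have already been parsed, which sidesteps the "second parseAtom call" difficulty rather than solving it.

```javascript
// Toy model: `items` is a flat list of already-parsed pieces. An item of type
// "internal" stands for a node such as the one \relax leaves behind; "sub" and
// "sup" items carry their argument in `arg`.
function attachScripts(items, skipInternal) {
  const out = [];
  for (const item of items) {
    if (item.type !== "sub" && item.type !== "sup") {
      out.push(item);
      continue;
    }
    // Pick the base: the most recent node, optionally looking past
    // any trailing internal nodes.
    let b = out.length - 1;
    if (skipInternal) {
      while (b >= 0 && out[b].type === "internal") {
        b--;
      }
    }
    const base = b >= 0 ? out[b] : null;
    if (base !== null && base.type === "internal") {
      // Mimics the reported failure when an internal node becomes the base.
      throw new Error("Got group of unknown type: 'internal'");
    }
    const node = { type: "supsub", base, [item.type]: item.arg };
    if (base === null) {
      out.push(node); // nothing to attach to: script on an empty base
    } else {
      out[b] = node; // replace the base with the scripted node
    }
  }
  return out;
}

// Models {a \over b}\relax_x from the earlier comment.
const relaxItems = [
  { type: "atom", text: "{a \\over b}" },
  { type: "internal" },
  { type: "sub", arg: "x" },
];

const skipped = attachScripts(relaxItems, true);
console.log(skipped[0].base.text); // {a \over b}
```

With skipInternal set to false the same input throws, which corresponds to the behavior reported in this issue.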

edemaine avatar Mar 09 '22 20:03 edemaine

Having a similar error with $\implies^*$. Temporary workaround is \text{$\implies$}^*.

rdong8 avatar Nov 16 '23 17:11 rdong8

Thanks for the additional report! In all cases, a workaround is to add {} before the sub/superscript, as in \implies{}^*.
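
Applied to the inputs reported in this thread, the {} workaround would look as follows. Note that, going by the earlier {a \over b}{}_x comparison, the script then attaches to the empty group rather than to the preceding atom, so placement may differ slightly from what LaTeX produces for the original input.

```latex
\implies{}^*
\:{}_*
\,{}_*
\!{}_*
```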

edemaine avatar Nov 16 '23 18:11 edemaine

any progress on this?

pzinn avatar Feb 01 '24 20:02 pzinn