Clearer syntax for displaying text/vars in math

Open PgBiel opened this issue 2 years ago • 45 comments

Motivation

Currently, there are a few notable inconsistencies in how math mode handles the distinction between fonts for variables/operators and for text. In particular, the syntax $ "double quotes" $ is used both to introduce multi-character variables (such as in $ "speed" = 5 $), and also to introduce text (such as in $ 6 > 5 "iff" 5 > 4 $). However, the text produced from the double-quote syntax will necessarily use the math font, not the normal text font used throughout the document. Not even manually inserting #text[...] fixes that (the inserted text still uses the math font); the only way to manually insert the normal text font in math mode is to use #text(font: "The text font here")[...], which is very verbose.
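For concreteness, here is a minimal sketch of the three forms mentioned above, as they could be written at the time of this issue. The font name is a placeholder assumption, not something taken from this thread; substitute whatever font the document body actually uses.

```typst
// Double quotes used as a multi-character variable name (set in the math font).
$ "speed" = 5 $

// Double quotes used as ordinary prose inside an equation (also the math font).
$ 6 > 5 "iff" 5 > 4 $

// The verbose workaround: name the body font explicitly.
// "Libertinus Serif" is a placeholder; use the document's real text font.
$ 6 > 5 #text(font: "Libertinus Serif")[iff] 5 > 4 $
```

The first two lines are indistinguishable in syntax even though they mean different things, which is the ambiguity this RFC is trying to remove.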

Additionally, the double quotes syntax behaves a little weirdly right now: "a" is displayed italicized, while "aa" is upright (see #274).

Therefore, the purpose of this RFC is to hear opinions regarding what the best syntax would be to introduce normal text inside math mode (using the document's or context's text font), and to make it separate from math font text (variable/operator names and stuff), thus potentially fixing many other minor issues in the process.

We already had some discussion regarding this over Discord, and I'll try to show some of the suggestions sent there here.

Proposed alternatives

1. Use text() for normal text font

This possibility suggests that all math mode text should default to the math font, including text inserted with double quotes, as it is now. However, a math.text function would be added, letting you use $ 6 > 5 text("iff") 5 > 4 $ to enforce the document text font.

  • Pros:
    • Syntax would be familiar to LaTeX users (where it's \text{})
    • Pretty clear that it's the text font (it's called "text" after all)
    • Fully backwards-compatible (everything written in the old syntax would remain exactly the same)
  • Cons:
    • Still a bit verbose (and the "..." syntax has been quite convenient)
    • Could be annoying nonetheless for those who don't come from LaTeX
    • Wouldn't fix "a" not being upright (might be something that could be fixed in other ways though)
    • Would be confusing, since #text() would revert to math font (since that would be the default text element, not the math-specialized one, due to how the internals work), and overall the logic would look backwards (a user could ask: why does the default text element produce math text in a math environment, while math.text produces normal text?).
    • User could be confused regarding why " " and text(" ") are different.

2. Use var() for math font

This possibility suggests that all math font text should, instead, be wrapped in a special element called var, and all double-quoted text in math mode would default to the document/context text font, unless wrapped in var. Sample usage would be $ var("speed") = 5 $ (note that var(speed) wouldn't work, as it would try to read speed as a variable or function). Single-letter variables (e.g. $ e $) would be implicitly wrapped in var by default (and operators such as sin or op("...") would also be in math font by default). In contrast, writing $ "speed" = 5 $ would display "speed" in the document text font, not in the math font.

  • Pros:
    • Would be explicit regarding what uses math font and what doesn't (var() is pretty clearly a variable, and " " is pretty clearly text)
    • Would make it easier to control the math font, without having to set the font for the whole equation (something like #show var: set text(font: "New Math Font") could be made to work).
    • Would make math mode less "magic": instead of having all content be magically interpreted and represented as math when inserted in math context (which also leads to issues like #274), only content explicitly wrapped in var would look like math.
    • This would also lead to cleaner internal code (non-var things would not require much special treatment in equations), and could make it eventually possible to merge text layouting and math layouting procedures.
    • Could potentially allow math font in text without having to insert things into equations (or $ $).
  • Cons:
    • var("something") seems a bit annoying to type compared to just "text" for text. (See the next alternative for a possible solution here.)
    • Could be potentially confusing to use at first, due to the new or altered syntax that users (new or not) would have to learn. Should be solvable with proper docs and communication.
    • Could inherit the redundancy with upright() for multi-character variables (though that's just the status quo).
    • Could also be seen as redundant with op() (but it does have spacing implications, while var would just set the font). Shouldn't be a problem, as it would be a similar distinction between upright() and op().
    • Wouldn't be fully backwards-compatible (anything using double quotes would change to use the document text font).

3. Add @variablename shorthand syntax for var("variablename") (preferred, but debatable)

This possibility is a complement to the previous one: not only do we use var("...") for math font and just "..." for text font, but we also introduce shorthand syntax for simple uses of var("..."), a.k.a. syntax which is just sugar for (is replaced with) var("..."), to make it less verbose. Initially, we proposed the @ syntax (up for debate). That is, one would be able to write something like $ @speed = 5 $ (for example), which would be replaced with $ var("speed") = 5 $, and would thus have speed be written in math font (in contrast with $ "speed" = 5 $ for text font). For more complex uses, one will still be able to use $ var(#[very complex text]) = 5 $.

  • Pros:
    • All the pros for var("...") (as it's an extension to that proposal).
    • Shorter syntax for simpler use-cases (@speed for a multi-character variable).
  • Cons:
    • All the cons for var("...") (except for verbosity).
    • Could be confusing with the currently existing @ref syntax, but it's probably fine as one isn't expected to cite things in math mode anyway (and, if one really wants to do so, just insert it through #[...]).
    • Not yet clear what would happen if the user tried to do @"a b c" or @[a b c] or something like that. (Up for debate.)

Next steps

Please leave your opinions below. In particular, we need some thoughts regarding:

  1. Which alternative above to pick;
  2. Any additional pros and cons which weren't considered above;
  3. If we pick the preferred one above (the shorthand syntax), then which syntax should we use (@variable or something else);
  4. If we do use @syntax, then whether we should allow things like @[aaa bbb] or @("aaa bbb") or @"aaa bbb", or what should happen in case the user specifies one of those (throw some error, display with spaces, just literally insert the @ there [could be annoying to parse], ...).

Thanks for reading!

PgBiel avatar May 06 '23 22:05 PgBiel

I really like the idea of var(), especially if it makes the internal code simpler. Although I'm wondering if it would be possible to use single quotes for the shorthand syntax, like $ 'speed' = 5 $. That way text and math font would be similar, but distinct.

Otherwise, the @ shorthand looks good.

AndyBarcia avatar May 07 '23 07:05 AndyBarcia

@AndyBarcia Thank you for the feedback! We actually considered single quotes at first, but we realized that this would quickly conflict with a possibly large number of equations, as the prime (') is used heavily in math, e.g. for derivatives, alternate variable names, etc. (It could also be a bit confusing regarding which is which, but that's a smaller issue.)

So we settled on the @ for now, but we'll gladly accept more opinions regarding this.

Thanks again!

PgBiel avatar May 07 '23 09:05 PgBiel

I prefer the var("name") solution because it's consistent and easy to understand. I think we can implement it first, and decide if we want the @ solution later.

var("something") seems a bit annoying to type compared to just "text" for text. (See the next alternative for a possible solution here.)

This will not be a big problem. A variable often occurs multiple times, and people can do let name = var("name") to save key strokes.
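A rough sketch of that binding trick in today's Typst. Note that var is not a built-in: the definition below is a hypothetical stand-in that simply wraps math.upright, so it only imitates the ergonomics, not the font behaviour proposed in this RFC.

```typst
// Hypothetical stand-in for the proposed `var`; NOT part of Typst.
#let var(name) = math.upright(name)

// Bind the variable once...
#let speed = var("speed")

// ...then refer to it by name; multi-letter identifiers in math mode
// are looked up as variables.
$ speed = 5 $
$ 2 speed = 10 $
```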

peng1999 avatar May 07 '23 14:05 peng1999

Thanks for pouring thoughts into this! Based on my experience with LaTeX, having to use \text{} to wrap texts in math expressions has been a huge hassle. Therefore, I'd prefer keeping the clean "..." syntax for text font.

Meanwhile, I don't have an opinion on variable names. Therefore, for the sake of avoiding ambiguity, I'd prefer the second proposal (i.e. var() for variables) and its enhancement (i.e. the @.. shorthand).

btw, for convenience, let's number the proposals? Or name them with unrelated nouns ("proposal Tiger", "proposal Lion"...) if you want to be playful :)

Leedehai avatar May 07 '23 15:05 Leedehai

I prefer the var("name") solution because it's consistent and easy to understand. I think we can implement it first, and decide if we want the @ solution later.

I agree var() itself should be implemented before the shorthand syntax, though I think it should be released in typst together with the syntax (if we add it). I think it's important to make it clear what we want out of this from day one.

This will not be a big problem. A variable often occurs multiple times, and people can do let name = var("name") to save key strokes.

While I guess that's possible, I think it's a good idea to have some syntax to make it easier for new users, and @ is largely unused in math. Just like how people are often annoyed by \text{}, they'd probably be annoyed by having to do \var{} (or var("...")) too. At least that's what I thought of.

Thanks for the feedback!

PgBiel avatar May 07 '23 17:05 PgBiel

Thanks for pouring thoughts into this! Based on my experience with LaTeX, having to use \text{} to wrap texts in math expressions has been a huge hassle. Therefore, I'd prefer keeping the clean "..." syntax for text font.

Gotcha! I agree, it has always been annoying for me in LaTeX as well. We should definitely prioritize simplicity here, IMHO.

Meanwhile, I don't have an opinion on variable names. Therefore, for the sake of avoiding ambiguity, I'd prefer the second proposal (i.e. var() for variables) and its enhancement (i.e. the @.. shorthand).

Got it!

btw, for convenience, let's number the proposals? Or name them with unrelated nouns ("proposal Tiger", "proposal Lion"...) if you want to be playful :)

Done, thanks for the suggestion!

PgBiel avatar May 07 '23 17:05 PgBiel

  • I like option 3 (option 2 + shortcut) most :)
  • I also like the @ symbol for the shorthand. The shorthand is nice to have but in my opinion not 100% necessary. At least in all the stuff I write (math, physics, cs related) I don't use multi letter variable names that often.
  • I am still confused about what the difference would be between var() and op().
  • I don't worry much about backwards compatibility for now. We're still in beta. Time to make things right.
  • I think anything written with the shorthand should include no spaces and no numbers etc. So no @[aaa bbb] or @("aaa bbb") or @"aaa bbb". Just @([A-Za-z]){1,} (regex)
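To illustrate the restricted shorthand, here is a small sketch that treats the desugaring as a plain string rewrite using Typst's own regex and str.replace. This is only an illustration of the rule @([A-Za-z]){1,} → var("..."); the real shorthand would live in the parser, and the names shorthand, source and desugared are made up for this example.

```typst
// Letters only, one or more, as suggested above.
#let shorthand = regex("@([A-Za-z]+)")

#let source = "@speed = 5"

// Rewrite every match to the long form; `m.captures.at(0)` is the name.
#let desugared = source.replace(
  shorthand,
  m => "var(\"" + m.captures.at(0) + "\")",
)

// Intended result: var("speed") = 5
#raw(desugared)
```

Under this rule, inputs such as @"aaa bbb" or @[aaa bbb] simply do not match, so what to do with them (error or literal @) remains the open question from the issue.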

StrangeGirlMurph avatar May 07 '23 18:05 StrangeGirlMurph

I like both the var() and the shorthand @... I rarely see the use of multi-character variables in text. I would say there is maybe one equation that has them for every dozen equations without them. Therefore I would be against proposal 1. Text mixins are much more common and having body text font set with just ".." is a good shorthand.

owiecc avatar May 08 '23 06:05 owiecc

A further consideration: Currently, because math variables and numbers are also text, the text function can be used to configure font, size, color, etc. of math text. We use show math.equation: set text(font: "New Computer Modern Math") and that is what lets math + any nested text in math use the math font.

How would this work under these proposals? Is it set math.equation(font: "..")? Or set math.var(font: "..")? What about op or numbers? And how would you turn a variable red?
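For reference, the current mechanism described in this comment looks roughly like the sketch below. It uses only existing Typst features; the equations themselves are arbitrary examples.

```typst
// One show-set rule on equations changes the font of variables, numbers
// and any text nested in math.
#show math.equation: set text(font: "New Computer Modern Math")

$ E = m c^2 $

// Colouring a single variable today: wrap it in an ordinary text call.
$ #text(fill: red)[$x$] + y = 1 $
```

Any of the proposals would need an equally short answer for both cases.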

laurmaedje avatar May 08 '23 09:05 laurmaedje

How would this work under these proposals? Is it set math.equation(font: "..")? Or set math.var(font: "..")? What about op or numbers? And how would you turn a variable red?

Those are some important questions indeed. To be honest, what I thought of was mostly to use something like the existing show math.var: set text(font: "Something") for this purpose. We could "reset" the styles when entering math.var in e.g. the Finalize step (to make overriding possible), but we could also make that optional (tbh, in most situations where my text was e.g. bold, I wanted my math to be bold too, and stuff like that, so it's not that useless of a thing, I think - but "resetting" would be a default closer to LaTeX's).

Of course, this could make show math.equation: set text(font: "...") not work for this specific purpose, as math.var would reset the styles. But, still, one could always just #[ #show math.var/* ... */;$ equation here $ ] to restrict a show applying on all vars to a single equation. (This could certainly become better in the future if we get in selectors or similar.)
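The scoping pattern itself already works today. Since math.var does not exist, the sketch below uses math.equation as a stand-in selector to show how a show rule inside a content block is limited to that block.

```typst
#[
  // Applies only to equations inside this content block.
  #show math.equation: set text(fill: blue)
  $ a^2 + b^2 = c^2 $
]

// Outside the block the rule no longer applies.
$ a^2 + b^2 = c^2 $
```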

Well, the other option is to duplicate all style options on math.var, but I'm not sure if that's at all optimal. Thoughts?

PgBiel avatar May 13 '23 01:05 PgBiel

And I think this kind of discussion should be the focus of an eventual PR that implements this kind of thing. Following a bit of @peng1999 's suggestion, the parser work for a shorthand @ syntax could be implemented through a separate PR, either in parallel (by someone else) or after the first PR, to avoid worrying too much about that at first.

PgBiel avatar May 13 '23 01:05 PgBiel

I like the var(..) proposal but what would happen in the following case :

let name = @"name"
$name = 5$

would @ be a reference there or the shorthand for var ?

astrale-sharp avatar Jun 15 '23 11:06 astrale-sharp

I like the var(..) proposal but what would happen in the following case :


let name = @"name"

$name = 5$

would @ be a reference there or the shorthand for var ?

The @ shorthand syntax, much like the math fraction syntax (a / b), would only work inside math mode, so your example would likely result in a syntax error (as it does today).

PgBiel avatar Jun 15 '23 12:06 PgBiel

Single quotes have been suggested, but I haven't seen any mention of backticks, which currently don't do anything in math mode and also very rarely appear in equations.

yaksher avatar Oct 16 '23 22:10 yaksher

If anything, I think backticks should be a shortcut for math.mono in math.

MDLC01 avatar Oct 17 '23 08:10 MDLC01

If anything, I think backticks should be a shortcut for math.mono in math.

The typewriter symbols are extremely rarely used in math

Enivex avatar Oct 17 '23 20:10 Enivex

I have to disagree here. In computer science, they are often used in mathematical formulae when referencing a variable or a function that is defined in code.

Although I agree my original message was maybe a bit too categorical.[^1] A better wording would be that, to me, backticks are really strongly associated with code / monospace fonts.

[^1]: Not sure this is the right word, English is not my native language. What I mean is I made it seem like using backticks that way was the obvious only choice.

MDLC01 avatar Oct 17 '23 20:10 MDLC01

In my experience with CS theory, when mixed into math expressions (e.g. as function names), those function names are more commonly rendered in small caps or sans serif font than in monospace, which is only really used for entire code blocks. Additionally, I do think the ability to have multi letter math vs text in math mode is very frequently useful in math and CS theory alike and so even if the syntax is somewhat more natural to map to monospace, I don’t think it’s worth doing so at the expense of something more useful.

yaksher avatar Oct 18 '23 18:10 yaksher

I agree with most every point here! How charming ^^

My main additions are outlined in the almost rant-like block of text below:

I low-key despise the @ syntax. It seems very offhand and I don't think enough options have been discussed, it also introduces a new sigil in maths, which I find less agreeable, especially with the sigil already having intuitive meaning elsewhere.

I also imagined the ` syntax myself, and it seems to me like a better alternative. However, if we already have this symmetry between our two options, I believe one should desugar into text and the other into math.text, where math.text is a newtype-esque function mirroring text but with its own overrides (like the typeface), so in practice it will function closer to a set rule. This opens up the consideration as to which syntax matches to which text style, which I would propose doing in the opposite way from that suggested above, with backticks inserting non-maths text. This option is mostly advantageous in that it preserves existing code and users' experience, and I personally find it more logical (text by default in equations is maths text, only "odd" backtick text is back to default text).

Additionally, this option is in fact not incompatible with var and I find it would complement it; all that is needed to teach is that var and op are functions that help in styling variables and operators, and nothing more. One should then always use the one semantically matching their use-case (if any at all) — for example, op("iff") and op("sin"), but var("speed"), or in fact, if you so wish, var(`speed`) for the document font. This also allows var to always use italics, because for non-italic maths text you just use ""; this is especially relevant as I have seen people ask about italic multi-letter variables numerous times.

I'll conclude this point with a hopefully short guide-level summary:

  • Maths mode has its own text function, math.text. If you need to change maths text properties, set it instead of text.
  • The only way to enter verbatim text in maths mode is within double quotes (such as "Hello, world!"). Any valid Typst identifier will be interpreted as such, and most punctuation is converted to operators as they're commonly used in maths (this is actually what you usually want).
  • If you want to enter non-maths text in maths mode (for example, for margin notes), you may escape into code mode with #, or use the shorthand syntax `text here`, which is equivalent to #text("text here"). You may escape backticks with \` within such strings.
  • To typeset operator-like entities, like sin, use the math.op function. The function accepts any content, including maths text, so you may use $op("sin")$.
  • To typeset variable-like entities, like speed, you may use the math.var function. It functions as a way to override certain features of text; by default, it only makes text italic, but you may also make it use a smallcaps font, for example. Thus, use var("speed") for an italicised variable.
  • Typst automatically defines a variable for each one-letter identifier, expanding to its own representation as a var. For example, $x$ == $var("x")$. These cannot be overridden from outside–

Which makes me think: should they be? Sometimes it's useful, and it would certainly be very consistent, if we actually change the implementation to use variables. Also, this was not short. :sweat_smile:

emilyyyylime avatar Oct 25 '23 03:10 emilyyyylime

Hello, thanks for your input!

I low-key despise the @ syntax. It seems very offhand and I don't think enough options have been discussed, it also introduces a new sigil in maths, which I find less agreeable, especially with the sigil already having intuitive meaning elsewhere.

I believe criticism regarding the @ syntax is fair, and I think having special syntax for var needs more discussion. Overall, I think the main idea behind having special syntax was to have math text be as easy to type as " ... " in math mode. However, it is actually debatable whether that is actually necessary, as having variables with multiple letters is pretty rare anyways (see here for previous discussion in this regard: https://github.com/typst/typst/pull/1779#issuecomment-1654117885).

So, I think there is no problem in implementing the proposed split (between text and var) without defining special syntax for var, in principle, for an MVP of this feature. That doesn't, however, completely eliminate having dedicated syntax from the equation (see what I did there?), as it could have other uses, e.g. making regex easier to specify (see https://github.com/typst/typst/pull/1779#issuecomment-1657270502). But that'd only be decided by tracking actual usage of var (after its actual implementation).

I also imagined the ` syntax myself, and it seems to me like a better alternative.

I think it is potentially confusing as it's used for raw blocks in both code and text mode (there's a bit of a 'majority' here). But worth keeping an eye on anyways.

However, if we already have this symmetry between our two options, I believe one should desugar into text and the other into math.text, where math.text is a newtype-esque function mirroring text but with its own overrides (like the typeface), so in practice it will function closer to a set rule.

That is definitely an interesting perspective, but I think this would probably just remove the purpose of var then, as variables can be replicated with italic already in terms of style. The point of var is precisely to be the "basic building block" for written things in math mode, so it'd already take that spot of what you call math.text. Hope that clears it up.

(With custom user elements in the future, having more semantic things without depending on the standard library will be much more viable too, so we don't have to worry too much about that :slightly_smiling_face:)

Which makes me think: should they be? Sometimes it's useful, and it would certainly be very consistent, if we actually change the implementation to use variables.

That'd be the basic idea - var would render as an italic letter when the text inside it has just one letter. That's sorta how it works currently anyways (any text content with just one letter is made italic when displayed in a math mode context), except that, with var, only math text would have that effect (rendering arbitrary content wouldn't be affected by those shenanigans anymore), which is a very important part of the reason for var to exist in the first place.

But that particular point could benefit from further discussion.

PgBiel avatar Oct 25 '23 04:10 PgBiel

I really like the idea of either single quotes or backticks as special var() syntax, but agree that single quotes are problematic for primes (where x prime is written as x') and backticks are problematic since they should match normal markup in creating raw text.

Could a compromise be to use double backticks as the multi-letter var() syntax?
(and maybe use upright text for single letters in them?)

Examples: (equations taken from a physics textbook)

$ "document-font text"  ``math-font text``  `mono-font text` $

$ ``Stress`` / ``Strain`` = ``Elastic modulus`` $

$ v_``rms`` = sqrt((v^2)_``avg``) = sqrt((3 k T)/m) $

$ K_``Carnot`` = T_``C`` / (T_``H`` - T_``C``) $  // if ``X`` uses upright text for single letters

Also note: Looking through my physics textbook for these examples, it seems the primary usage of text in an upright math font is for clarifiers in the subscript of a variable. E.g. "K_Carnot", "v_rms", "E_Total". Any special syntax should optimize for this case.

  • Pros:
    • Easy to read in-line
    • Allows text with spaces and other characters
    • Easier than typing var("text")
    • Could form the start of a good way to solve the regex show issues in #1779:
      • Math text: show ``V``: $arrow(``V``)$
      • Raw text: show `variable`: `longer_variable`
  • Cons:
    • More characters than "@"
    • They might be confused with single backticks, but the syntax is currently not in use!
    • ~Creates a different meaning for double backticks in math mode vs. normal markup~
      • ~They're currently used for inline raw text that includes backticks so you don't have to escape those backticks~
      • ~Maybe double backticks in markup are rare enough that this is ok?~
      • ~Or maybe double backticks in normal markup could become syntax to use the math font and only single and triple backticks would be used for raw text~

EDIT: I didn't double-check how double backticks work in normal markup and assumed they work the same as in markdown. But in Typst they currently just create "empty" raw text in normal math mode so text and ``text`` are equivalent. The second does not render in a mono-font. Is this free real estate?

https://typst.app/docs/reference/text/raw/ [The raw() function] also has dedicated syntax. You can enclose text in 1 or 3+ backticks (`) to make it raw. Two backticks produce empty raw text.

LectronPusher avatar Nov 03 '23 05:11 LectronPusher

$ "document-font text"  ``math-font text``  `mono-font text` $

@LectronPusher For a few days now I have been thinking about a solution for this. At least from a personal point of view as a user, your solution seems quite simple and practical to me if it were possible to implement it.

Atreyu-94 avatar Nov 10 '23 19:11 Atreyu-94

I suppose that ship has long sailed but there is another alternative:

  • Use the hash syntax for all typst variables and functions, single letter or not
  • Quotes serve to display text (using the document font etc)
  • Normal text is interpreted as math variables, single letter or not
  • In the rare case where you want to display a math variable that has spaces in its name, use a var function

I know it's probably not going to happen, but the difference in math vs document syntax wrt typst variables and functions is really jarring to me, and this solution solves the two issues at once.

LoicGrobol avatar Mar 16 '24 15:03 LoicGrobol

  • Use the hash syntax for all typst variables and functions, single letter or not

Categorically disagree with this. As someone who used LaTeX before coming to Typst, not needing to do this is one of Typst's biggest advantages over LaTeX and it speeds up writing math a lot. Multi-letter variable names are much, much rarer than functions/variables in math (because the latter is the majority of what you write).

yaksher avatar Mar 16 '24 21:03 yaksher

Using hashtag always also has the problem that we cannot distinguish math function calls from normal function calls anymore. As such, the argument list would always be in code mode, which would be very inconvenient.

laurmaedje avatar Mar 16 '24 21:03 laurmaedje

@yaksher Yeah, conversely, as someone who still uses LaTeX as well as Myst and quarto, I find it really confusing that function calls and regular text look the same: when you mix text and code, it should be very obvious which is which. The single letter/multiple letter distinction doesn't help, as it's very idiosyncratic. Code is read much more often than it's written, so I consider having to type a single character a very small price to pay for the increase in readability.

But as I said, I guess this ship has already sailed. Either I'll get used to it or I'll just stop trying typst, nbd.

LoicGrobol avatar Mar 16 '24 21:03 LoicGrobol

It’s clear what’s code and what’s text? The rule isn’t opaque. But also, “code is read more often than it’s written” is just false with a typesetting language like Typst. Sure, somewhat true with complicated libraries, but those are written in code mode, not math mode.

yaksher avatar Mar 16 '24 21:03 yaksher

It looks like @speed is out of favor already but here's another argument against it:

@speed not only looks like a reference (so creates a risk of confusion), it's also stealing that syntax. References in math are useful e.g. as annotation on an "imply" symbol as in x=1 ===>^@eq1 y=2. I think this currently requires ===>^#text[@eq1] plus a bunch of styling to get the reference to look right? It would be great to have a plain @eq1 work in math mode and shown as (1) by default instead of Equation 1.

I like var() a lot. Here's another idea for a shorthand:

  • A single backquote ` prefix would indicate a variable so we can write $ `speed = 5 $, while `A` (with closing backquote) would be shown as monospace math. The parsing rule could be something like this: if a backquote is followed by an "end of word" backquote on the same line it means monospace math, otherwise it's a variable.

This would allow the following:

$ `speed = 5 $

$ `Stress / `Strain = var("Elastic modulus") $  // two-word variable requires var()

$ v_`rms = sqrt((v^2)_`avg) = sqrt((3 k T)/m) $

$ K_`Carnot = T_`C / (T_`H - T_`C) $

(The ` prefix has precedent in Lisp at least)

knuesel avatar Apr 06 '24 15:04 knuesel

Using ` both as a prefix and for enclosing things sounds like a recipe for disaster

Enivex avatar Apr 06 '24 15:04 Enivex

I think you're right :)

knuesel avatar Apr 07 '24 18:04 knuesel