(Opinions needed) Likely upcoming changes to math mode precedence

We are planning to change operator precedence in math mode. Specifically, with this change, something like f_i(x) or f_pi(x) would no longer put the parenthesized part into the subscript. In exchange, f_abs(x) would need to be written as f_(abs(x)), because there is an inherent syntactical ambiguity.

A range of other options have been explored. I summarized all of them and explained my own thoughts in more detail in this blog post: The Math Mode Problem | Laurenz's Blog.

If you are using Typst’s math mode, I encourage you to read the blog post and, if you have an opinion one way or the other, to comment on this thread. This way, we can get a better idea of what Typst users think about this and make a more informed decision. Thank you!

Here are a few links to further places where this is discussed: GitHub PR, Discord Forge Thread, Reddit.

Here’s a quick illustration of what’s going on here.

Code
#set raw(lang: "typm")

#grid(columns: 2, gutter: 2em)[
  = Typst #sys.version

  #table(
    columns: 3,
    align: (start, center, center).map(a => a + bottom),
    stroke: none,
    table.header()[*Write*][*Get*][*Feel*],
    table.hline(stroke: 0.5pt),
    ..(
      (`f_1(x)`, [😃]),
      (`f_i(x)`, [👻]),
      (`f_i (x)`, [🤨]),
      (`f^pi(x)`, [😐]),
      (`f^pi (x)`, [😐]),
      (`e^abs(x)`, [😃]),
      (`e^(abs(x))`, [😀]),
    )
      .map(((expr, feel)) => (expr, eval(expr.text, mode: "math"), feel))
      .flatten(),
  )
][
  = Next version #text(0.5em)[(likely)]

  #table(
    columns: 3,
    align: (start, center, center).map(a => a + bottom),
    stroke: none,
    table.header()[*Write*][*Get*][*Feel*],
    table.hline(stroke: 0.5pt),
    ..(
      (`f_1(x)`, [😃]),
      (`f_i(x)`, $f_i (x)$, [😃]),
      (``, []),
      (`f^pi(x)`, $f^pi (x)$, [😐]),
      (`f^(pi(x))`, $f^(pi(x))$, [🙂]),
      (`e^abs(x)`, $e^abs (x)$, [🤨]),
      (`e^(abs(x))`, [😀]),
    )
      .map(row => if row.len() == 2 {
        let (expr, feel) = row
        (expr, eval(expr.text, mode: "math"), feel)
      } else {
        let (expr, result, feel) = row
        (expr, result, feel)
      })
      .flatten(),
  )
]

Thank you for this post! I did not follow the whole history on the math forge, but I believe your post summarised the current issue quite well.

tl;dr: I agree with option B.

Speaking from a mathematician’s POV, I think you’re quite right to point out that the ambiguity between a symbol and a function (Typst or mathematical) can never be completely resolved. Whether pi in pi(x) is pi x or pi: x |-> pi(x) is completely contextual.

Regardless of LaTeX, I think most users expect f_i(x) to render as f_(i) (x). I have had to repeatedly correct my equations to fix this. Even though it’s a breaking change, it is to be expected. I would rather this be changed now than later.

IMO, this is the right way to go especially if Typst can warn users about script ambiguity with functions.

(I do generally think you are tending towards the best option)

If I understand correctly, in the same vein the (likely) future behaviour will be that

  • e^abs(x) produces e^(abs) (x) and a warning
  • e^sin(x) produces e^(sin) (x) and no warning

The difference is that abs is a function and sin is a math.op. Is this understanding correct? I wonder if there’s anything that can reasonably be done to improve that case, since both results are undesirable.
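For reference, here is a small sketch of the spellings that should not depend on how a bare e^abs(x) or e^sin(x) ends up being parsed. It only uses explicit grouping, so it says nothing about what the next version will actually do with the implicit forms or which of them will warn.

```typ
// Whole expression in the superscript: the parentheses after `^`
// group everything, so precedence rules do not matter here.
$ e^(abs(x)) $
$ e^(sin(x)) $

// Only the operator name in the superscript, argument on the baseline.
$ e^(sin) (x) $
```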

Regardless of LaTeX

By the way, actually neither option A nor B is exactly the same as LaTeX.
The behavior of LaTeX depends on the macro packages used, and they can vary from one another.

Code
\documentclass{article}
\usepackage{physics}  % provides `\abs`

\begin{document}

\begin{align}
  f_i(x) \\
  f^\pi(x) \\
  e^\abs{x} \\
  % e^\sin(x) gives an error: Missing { inserted.
  e^{\sin{x}}
\end{align}

\end{document}

A text operator could feasibly also warn if there is no real use for this, but I’m not sure whether that’s the case. For something like f_min(x), it might make sense to have min in the subscript.

That’s … kind of bad.

I strongly agree with your option B.

In my field, formulas with indices, including superscripts, subscripts, and multi-indices, are far more common than functions used as exponents or written as subscripts. The current difference between f_1(x) and f_i(x) seems strange to me.

Oh, that’s even better than I first thought. So I can write the Levi-Civita tensor as ϵ_ijkl instead of the current ϵ_(i j k l).

I think there’s a misunderstanding here. e^abs(x) would render as shown in the image because it displays (in monospace) the name of the abs function (and this would result in a warning). Meanwhile, ϵ_ijkl would continue to give the error “unknown variable: ijkl”.

Ah, thank you for clearing this up.

I’m sure there are lots of social scientists around, but I want to add one perspective from that space. We often work with students (and sometimes coauthors) with limited mathematical and/or programming background. In those cases, being predictable is one of the best tools for teaching a new language, whether it’s R or LaTeX or Typst or otherwise.

I’ve poked around quite a bit with the existing syntax and have enjoyed it, but I think the new, potential option B syntax is more predictable and thus easier to use/teach in the long term. The current syntax is fairly intuitive once you know how it works. However, I think the original and potential B syntax is better. It’s much clearer that a (sub/super)script applies to the next group, whether that’s implicitly the next character or explicitly a group of them. That’s just easier to teach and debug, which makes expanding the userbase to semi-technical people a much friendlier endeavor.

Am I correct in understanding that this is in service of preserving the implicit form of sub- and super-scripts? E.g., f_i(x) is ambiguous because it is unclear if this should instead be f_(i(x)) or f_(i)(x)?

This seems to confuse lexical and mathematical ambiguity, to me. I think all uses of sub- and super-scripts should require explicit delimiters. I.e., f_i(x) should throw an error because this is ambiguous lexically, even if it is much less ambiguous in the mathematical sense (where we all have been conditioned to expect this to be interpreted as f_(i)(x)). Using parentheses to explicitly delimit an implied call to #sub[] seems more consistent, no?

I will also add (I am slowly reading the blog post between meetings – apologies if I have missed something) that e^abs(x) is not unambiguously e^(abs(x)) because (e^abs)(x) is a perfectly cromulent way to lexically define a function which computes first the abs of x and then the exponential. Mathematically, these coincide, but lexically these are distinct. Author intention can be determined by explicit delimiters.

@Christopher_Marcotte note that abs is a function, so by itself (e^abs) it will produce an ugly monospace identifier of itself: e^abs

it’s very unlikely that a user wants this, so it makes sense to “unambiguously” interpret a function identifier followed by parentheses as a function call (which produces e^|x|).

the way to typeset e^"abs" (x) is with e^"abs" (x) (and without the space after this update)

I think all uses of sub- and super-scripts should require explicit delimiters.

Hi Christopher! I don’t quite agree with you. Your proposal might be more suitable for developing programs or formalizing proofs (e.g., https://lean-lang.org), but not for writing documents. For Typst, I think it’s acceptable to adhere to a handy convention even if it’s kind of ambiguous. It’s just like in everyday math, where lim_(n→+∞) implies n ∈ ℕ, while lim_(x→+∞) implies x ∈ ℝ.

Ah, thank you for clarifying. I misunderstood how this worked in the example above.

From reading the blog post, I did not really understand the sin case, and thus not the arguments based on it. Why should sin not be a function? What makes it different from abs?
Also: what is so problematic about letting users specify whether something they defined is a function or a symbol? (Even though I believe symbol would be a good default.)

When you write $abs(x)$, it is displayed as |x|. That is because math.abs is a Typst function that implements this behavior. When you write $sin(x)$ instead, you get sin(x), because sin is a text operator that is displayed, and the (x) is displayed separately from it. From a Typst execution standpoint, these are very different things, even though they are written the same way. And since shenanigans like let sin = math.abs are possible, you can’t differentiate the two before it’s too late.
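A minimal sketch of that difference, assuming the reassignment is written in a document as a plain #let binding (the surrounding layout is only for illustration):

```typ
// `abs` is a function: it receives `x` and draws the bars itself.
$ abs(x) $

// `sin` is a text operator: "sin" is typeset, then "(x)" separately.
$ sin(x) $

// Shadow `sin` with a function, as in the reassignment mentioned above.
// From here on, the same source text `sin(x)` is a function call.
#let sin = math.abs
$ sin(x) $
```

The first and third equations should both come out as |x|, even though the third is spelled exactly like the second. Whether a name is a function is only known once the binding is looked up during evaluation.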
