Prism language definition for Typst

Mc-Zen · October 8, 2024, 7:06am

Out of need and lack of availability (see here), I’ve created a Prism language definition for Typst.

Mc-Zen/prism-typst: Syntax-highlighting Typst code with Prism (github.com)

It is far from perfect or complete, so you are invited to make PRs and to add test cases! I won’t be able to maintain this on a large scale, so any help is appreciated :)

Mc-Zen · October 8, 2024, 7:10am

You can view a demo here:

Typst/Prism (mc-zen.github.io)

It shows the capabilities as well as some of the limits.

Myriad-Dreamin · October 10, 2024, 7:24pm

I wrote a TextMate grammar for Typst. It is another regex-based grammar. There are also some test cases for the grammar that you might be interested in: tinymist/syntaxes/textmate/tests/unit at main · Myriad-Dreamin/tinymist · GitHub. The test cases were accumulated during development and daily usage over the last year.

Mc-Zen · October 10, 2024, 7:52pm

Wow, crazy, this is a lot of tests!

I wonder: can the grammar handle arbitrary nesting of code/markup mode? (It is used in the tinymist extension, right? If so, I guess it handles it fine.) In Prism I think this is impossible, because recursive regexes are not supported, and there is also a known issue about this for other Prism definitions.

Myriad-Dreamin · October 11, 2024, 7:25am

TextMate is limited as well, but we have handled arbitrary nesting of code/markup/math mode. The grammar shows that we can parse all Typst syntax by sacrificing some corner cases. Among them, the most cursed thing is not nested modes but accurately switching mode from an expression back to markup. For example, it cannot handle the following case:

#for idx in range(10) {} {}
// this is code brace ^^ ^^ that is plain text

In the example, whether a { is parsed as a code brace depends on whether it comes right after the iterable expression of the for loop (here range(10)). Deciding this would require unbounded backtracking if we parsed the expression with regexes.

But as I said above, we can sacrifice some accuracy: either parse some text (at the boundary of markup text and code) as code, or skip some code (at the boundary) and treat it as text. If we assume that #for is not mixed into the markup content, we can simply handle a for expression with:

for ([^\n\{\[]*|BRACE_EXPR|BRACKET_EXPR)*

It starts handling a for expression when it sees a for keyword, consumes any content up to a newline, and matches braces and brackets in pairs within that content.
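The pattern above can be tried out with a minimal sketch in Python, rather than in TextMate or Prism syntax. Since plain regexes cannot recurse, BRACE_EXPR and BRACKET_EXPR are unrolled here to a fixed nesting depth. The helper names `nested` and `match_for`, the depth of 3, and the use of `+` instead of `*` inside the group are assumptions made for illustration; none of this is taken from the tinymist grammar itself.

```python
import re
from typing import Optional


def nested(open_ch: str, close_ch: str, depth: int) -> str:
    """Regex source for a balanced open/close pair, nested at most `depth` levels.

    Hypothetical helper: plain regexes cannot recurse, so the nesting is
    unrolled by hand up to a fixed depth.
    """
    o, c = re.escape(open_ch), re.escape(close_ch)
    pattern = f"{o}[^{o}{c}]*{c}"  # depth 1: no inner delimiters at all
    for _ in range(depth - 1):
        # One more level: plain characters or a complete shallower pair.
        pattern = f"{o}(?:[^{o}{c}]|{pattern})*{c}"
    return pattern


# Depth 3 is an arbitrary choice for this sketch.
BRACE_EXPR = nested("{", "}", 3)
BRACKET_EXPR = nested("[", "]", 3)

# Mirrors: for ([^\n\{\[]*|BRACE_EXPR|BRACKET_EXPR)*
# `+` replaces the inner `*` so that each repetition consumes input.
FOR_EXPR = re.compile(
    r"for\b(?:[^\n{\[]+|" + BRACE_EXPR + "|" + BRACKET_EXPR + ")*"
)


def match_for(source: str) -> Optional[str]:
    """Return the text treated as one `for` expression, or None."""
    m = FOR_EXPR.search(source)
    return m.group(0) if m else None


if __name__ == "__main__":
    # Both brace pairs are swallowed as code, although Typst treats the
    # second pair as plain text: this is the sacrificed corner case.
    print(repr(match_for("#for idx in range(10) {} {}\nmore text")))
    # A body spanning several lines is still matched as one unit.
    print(repr(match_for("#for x in (1, 2) {\n  [#x]\n}\nmore text")))
```

In this sketch, a body nested deeper than the chosen depth would no longer be matched as a balanced pair, which is the price of unrolling instead of recursing.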

Myriad-Dreamin · October 11, 2024, 7:28am

Yes, it has been used for a long time.

Mc-Zen · October 11, 2024, 9:38am

Oh yeah, I noticed the same problem when working on the Prism definition.

Mc-Zen · October 11, 2024, 9:39am

Anyway, you’ve done excellent work here! I’ve never noticed any issues with the highlighting actually.

Janosh · February 9, 2025, 7:49pm

@Myriad-Dreamin thanks for building a TextMate grammar for Typst syntax highlighting! Is there a demo somewhere of how to use it on a website showing Typst code? Asking for Scientific Diagrams

Myriad-Dreamin · February 11, 2025, 4:00am

Hi, you could check out shiki to use the TextMate grammar on web pages.

For Typst syntax highlighting on websites, you could also check:

official, full-fledged highlighting (sorry, I remember it was in Typst’s GitHub org, but I cannot find it right now)
full-fledged highlighting: highlighter.js
simple regex-based syntax highlighting: Prism, i.e. this post.