It is far from perfect or complete, so you are invited to make PRs and to add test cases! I won’t be able to maintain this on a large scale, so any help is appreciated :)
I wonder: can the grammar handle arbitrary nesting of code/markup mode? (It is used in the tinymist extension, right? If so, I guess it handles it fine.) I think this is impossible in Prism, because recursive regexes are not supported, and there is also a known issue for other Prism definitions.
TextMate is limited as well, but we have handled arbitrary nesting of code/markup/math mode. The existence of the grammar shows that we can parse all Typst syntax if we sacrifice some corner cases. Among them, the most cursed thing is not nested modes but accurately switching from an expression back to markup. For example, it cannot handle the following case:
#for idx in range(10) {} {}
// this is code brace ^^ ^^ that is plain text
In the example, whether a { is parsed as a code brace is determined by whether it comes right after the range expression of the for. Deciding this requires unbounded backtracking if we parse the expression with regexes.
But as I said above, we can sacrifice some accuracy: either parse some text (at the boundary of markup text and code) as code, or skip some code (at the boundary) and treat it as text. If we assume that #for is not mixed into the markup content, we can handle a for expression simply with:
for ([^\n\{\[]*|BRACE_EXPR|BRACKET_EXPR)*
It starts handling a for expression when it sees a for keyword, then consumes any content up to a newline, matching braces and brackets in pairs within that content.
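To make the trade-off concrete, here is a minimal Python sketch of what that rule does. It is not the actual grammar: the function names `match_balanced` and `match_for_expr` are hypothetical, it uses a hand-written scanner instead of TextMate regexes, and it ignores strings and comments for brevity.

```python
def match_balanced(src: str, start: int) -> int:
    """Return the index just past the brace/bracket group opening at src[start].

    Stands in for BRACE_EXPR / BRACKET_EXPR. If the group is never closed,
    the end of the input is returned.
    """
    pairs = {"{": "}", "[": "]"}
    stack = [pairs[src[start]]]
    i = start + 1
    while i < len(src) and stack:
        ch = src[i]
        if ch in pairs:
            stack.append(pairs[ch])   # nested opener
        elif ch == stack[-1]:
            stack.pop()               # matching closer
        i += 1
    return i


def match_for_expr(src: str, start: int) -> int:
    """Return the end of the span treated as code for a `for` at src[start].

    Mirrors `for ([^\\n\\{\\[]*|BRACE_EXPR|BRACKET_EXPR)*`: consume anything up
    to a newline, but jump over balanced braces and brackets as whole units.
    """
    assert src.startswith("for", start)
    i = start + len("for")
    while i < len(src):
        ch = src[i]
        if ch == "\n":
            break                     # a newline outside any group ends the rule
        if ch in "{[":
            i = match_balanced(src, i)
        else:
            i += 1
    return i


src = "#for idx in range(10) {} {}\nnext line"
end = match_for_expr(src, 1)
print(repr(src[1:end]))
```

On the example from above, the whole rest of the line is consumed, so the second {} is treated as code even though it is plain text. That is exactly the kind of accuracy being sacrificed at the boundary.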