Given a Typst file as a string and an index into that string, is there a good way to programmatically (in any language) identify whether that position is in markup, code, or math mode?
let s = "a $ #{ b + 1 } c $"
mode(s, 0) // "a" --> "markup"
mode(s, 7) // "b" --> "code"
mode(s, 15) // "c" --> "math"
Possible solutions
Write a custom parser that only handles transitions between modes
Use an existing grammar, such as tree-sitter-typst
Fork the Typst parser
???
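To make the first option concrete, here is a minimal sketch of a transition-only scanner. The name `naiveMode` is made up for illustration, and it assumes that only `$ ... $` and `#{ ... }` change the mode. It ignores strings, comments, raw blocks, escapes such as `\$`, content blocks in `[...]`, and brace-less code like `#let`, so it will give wrong answers on real documents.

```typescript
type Mode = "markup" | "code" | "math";

// Hypothetical helper: tracks a stack of modes while scanning up to `index`.
// Only "$" (math toggle), "#{" (enter code) and braces inside code are handled.
function naiveMode(src: string, index: number): Mode {
  const stack: Mode[] = ["markup"];
  for (let i = 0; i < index && i < src.length; i++) {
    const top = stack[stack.length - 1];
    const ch = src[i];
    if (top === "code") {
      // Inside code, only track brace nesting.
      if (ch === "{") stack.push("code");
      else if (ch === "}") stack.pop();
    } else if (ch === "#" && src[i + 1] === "{") {
      // "#{" opens a code block from markup or math.
      stack.push("code");
      i++; // skip the "{" that was just consumed
    } else if (ch === "$") {
      // "$" closes math if we are in it, otherwise opens it.
      if (top === "math") stack.pop();
      else stack.push("math");
    }
  }
  return stack[stack.length - 1] ?? "markup";
}

const s = "a $ #{ b + 1 } c $";
console.log(naiveMode(s, 0), naiveMode(s, 7), naiveMode(s, 15));
```

It handles the example above, but every ignored construct is a case where the stack drifts out of sync with the real syntax.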
For reference, this is how tinymist does it (I presume):
math,
comment,
markup,
code,
string: stringContent,
raw,
rest,
} = edit.newText;
const newText = kind === "by-mode" ? rest || "" : "";
const res = await vscode.commands.executeCommand<
[{ mode: "math" | "markup" | "code" | "comment" | "string" | "raw" }]
>("tinymist.interactCodeContext", {
textDocument: {
uri: activeDocument.uri.toString(),
},
query: [
{
kind: "modeAt",
position: {
line: selectionStart.line,
Alternatively, you can also take a look at "Is there a prism language definition for Typst?", but there are issues with nested scopes.
tinymist is a great lead, thank you! It looks like it locates the leaf node of the Typst syntax tree at the given position, checks whether that node's kind has a known mode, and otherwise traverses the parent nodes until it finds one that does.
I'm still curious if there are other implementations out there, but this one seems well done, widely used, and likely to be kept up to date.
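For anyone who wants to see the leaf-then-ancestors lookup in isolation, here is a rough sketch over a hand-built tree. `SyntaxNode`, `MODE_OF_KIND`, the node kind names, and the helper `n` are all hypothetical and much simpler than the real Typst syntax tree; this is not tinymist's code, only the shape of the approach as I understand it.

```typescript
type Mode = "markup" | "code" | "math";

// Hypothetical, simplified syntax node (offsets are [start, end)).
interface SyntaxNode {
  kind: string;
  start: number;
  end: number;
  children: SyntaxNode[];
  parent?: SyntaxNode;
}

// Assumed mapping from node kinds to the mode they establish.
const MODE_OF_KIND = new Map<string, Mode>([
  ["Markup", "markup"],
  ["Equation", "math"],
  ["CodeBlock", "code"],
]);

// Small builder that also wires up parent links.
function n(kind: string, start: number, end: number, children: SyntaxNode[] = []): SyntaxNode {
  const node: SyntaxNode = { kind, start, end, children };
  for (const child of children) child.parent = node;
  return node;
}

// Descend to the deepest node whose range contains `offset`.
function leafAt(node: SyntaxNode, offset: number): SyntaxNode {
  for (const child of node.children) {
    if (child.start <= offset && offset < child.end) return leafAt(child, offset);
  }
  return node;
}

// Walk from the leaf towards the root until a kind with a known mode appears.
function modeAt(root: SyntaxNode, offset: number): Mode {
  let node: SyntaxNode | undefined = leafAt(root, offset);
  while (node !== undefined) {
    const mode = MODE_OF_KIND.get(node.kind);
    if (mode !== undefined) return mode;
    node = node.parent;
  }
  return "markup"; // assumed default: a document starts in markup
}

// Hand-built tree for "a $ #{ b + 1 } c $" (whitespace and operators omitted).
const exampleTree = n("Markup", 0, 18, [
  n("Text", 0, 1),
  n("Equation", 2, 18, [
    n("CodeBlock", 5, 14, [n("Ident", 7, 8)]),
    n("Text", 15, 16),
  ]),
]);

console.log(modeAt(exampleTree, 0), modeAt(exampleTree, 7), modeAt(exampleTree, 15));
```

The hard part is producing the tree in the first place, which is what a real parser provides; the lookup itself stays this small.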
I think a full parse is necessary to identify the current mode with certainty. No shortcuts here.