How to batch compile many small documents into SVGs?

I want to compile 400+ documents into SVGs. Each document is very short. Most of them only have three lines, like the one below. Is there a proper way to do so?

#set page(width: auto, height: auto, margin: 0pt, fill: none)
#let Re = math.op("Re")
$Re u$

(Background: Guide: Render typst math in MkDocs)

My attempts

At present, I pass the document as stdin to the typst executable, and retrieve the SVG from stdout. This approach takes about 3 minutes to compile 400 documents.

from subprocess import run

for doc in documents:
    svg_bytes = run(
        ["typst", "compile", "-", "-", "--format", "svg"],
        input=doc.encode(),
        check=True,
        capture_output=True,
    ).stdout

Putting the call in a ThreadPoolExecutor cuts the time, but the total is still on the order of minutes, and it complicates Ctrl-C handling.
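
For reference, here is a minimal sketch of the thread-pool variant with basic Ctrl-C handling. The helper names `compile_one` and `compile_all` and the `max_workers` default are made up for illustration; the `typst` invocation is the same as in the loop above. It assumes Python 3.9+ for `cancel_futures`.

```python
from concurrent.futures import ThreadPoolExecutor
from subprocess import run


def compile_one(doc: str) -> bytes:
    """Compile one document with the typst CLI (same call as in the loop above)."""
    return run(
        ["typst", "compile", "-", "-", "--format", "svg"],
        input=doc.encode(),
        check=True,
        capture_output=True,
    ).stdout


def compile_all(documents, compile_fn=compile_one, max_workers=8):
    """Compile documents in parallel and return the results in input order."""
    pool = ThreadPoolExecutor(max_workers=max_workers)
    try:
        futures = [pool.submit(compile_fn, doc) for doc in documents]
        results = [f.result() for f in futures]
    except BaseException:
        # Ctrl-C or a failed compile: drop jobs that have not started yet.
        # Jobs that are already running are left to finish on their own.
        pool.shutdown(wait=False, cancel_futures=True)
        raise
    pool.shutdown()
    return results
```

This only makes interruption tidier; it still starts one typst process per document, so the per-process startup cost remains.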

I’ve also tried typst-py. It’s only a few seconds faster than the first approach for the 400 documents.

import typst

for doc in documents:
    svg_bytes = typst.compile(doc.encode(), format="svg")

However, if I join all my short documents into a single long one, compiling it (with either approach) takes less than a second.
Therefore, I believe these 400+ documents could be compiled in one or two seconds.

Is there a proper way to do so? Joining the documents naively will cause them to interfere with each other (which I don’t want), while #[doc-1] #pagebreak() #[doc-2] #pagebreak() … looks fragile…

Why? It looks perfect to me: it means “put each document in its own scope and separate documents by pagebreaks”. Isn’t that literally what you want?

Or maybe you have queries in your documents that could make them interfere with each other? In that case indeed it’s fragile and I’m not sure what you could do about it.

The drawback lies in error handling. If one of the documents has mismatched [], it’s hard to tell which one is to blame.

Edit: Introspection is indeed a problem, but my small documents are not that complicated.

You could add a comment with the document name before each document, and search for the nearest previous comment of this form when you get an error? Definitely a hack but the best thing I can think of…
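
A rough sketch of that hack in Python, in case it helps. The marker format and the helper names `join_documents` and `blame` are invented for illustration. It assumes the line number comes from the location typst reports for the error, and it uses `#pagebreak(weak: true)` so that a page break is skipped on an already empty page. I have not checked how this interacts with every `#set page(...)` rule, and it does not help with introspection across documents.

```python
from typing import Optional

MARKER = "// doc: "  # hypothetical marker; a Typst line comment naming the document


def join_documents(docs: dict) -> str:
    """Wrap each document in its own #[...] scope, preceded by a marker comment."""
    parts = [f"{MARKER}{name}\n#[\n{body}\n]" for name, body in docs.items()]
    return "\n#pagebreak(weak: true)\n".join(parts)


def blame(joined: str, line_no: int) -> Optional[str]:
    """Return the document named by the nearest marker at or before a 1-based line."""
    for line in reversed(joined.splitlines()[:line_no]):
        if line.startswith(MARKER):
            return line[len(MARKER):]
    return None
```

Usage would be along the lines of `blame(joined, 123)` with the line number taken from the error message. With mismatched brackets the reported location can be later than the real culprit, so treat the answer as a starting point for the search.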

Hmm, I’m even considering giving up this restriction. Without #[], typst can give better error messages.
The downside is that I can’t cache it anymore.

Is it possible with the typst python package to use incremental compilation? Instantiate the compiler and world, and compile different documents? I don’t see a way to do that, based on a very quick look.

I don’t think so. There’s a typst.Compiler(input, sys_inputs) API for exporting to different formats and querying, but that’s not what we want.

Wait, it might be possible if I pass a file path instead of bytes to typst.Compiler… See “`Compiler` class doesn't track file changes”, Issue #114, messense/typst-py on GitHub.

Update: I ran a basic test. Yes, it works, and it’s 500× faster (less than half a second).

from pathlib import Path
from time import perf_counter

import typst

f = Path("a.typ")
f.write_text("")  # The file should exist when initializing the compiler

c = typst.Compiler("a.typ")

start = perf_counter()
for n in range(100):
    f.write_text(f"{n}", encoding="utf-8")
    result = c.compile(format="svg")
    print(result)
    print(f"{n + 1:3d}: {perf_counter() - start}")

Relying on the file system to pass data may not be very reliable, but it seems that the OS has successfully handled the race condition.

Besides, it could be faster if typst-py provided an API that skips file I/O or supports unnamed temp files (although this step is no longer a bottleneck on my system).
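
Putting the test above into a reusable shape, here is a sketch of how the batch loop could look. The function name `compile_batch` and the `make_compiler` parameter are hypothetical; the only typst-py calls used are the ones from the test, `typst.Compiler(path)` and `compile(format="svg")`. It assumes the compiler keeps picking up changes to the scratch file, as it did in my test, and it catches a generic `Exception` because I have not checked which exception type typst-py raises.

```python
import tempfile
from pathlib import Path


def compile_batch(documents, make_compiler=None):
    """Compile each document to SVG through one long-lived compiler.

    Returns one entry per document: the compile result, or the exception
    raised while compiling that document.
    """
    if make_compiler is None:
        import typst  # typst-py, imported lazily

        make_compiler = typst.Compiler

    results = []
    with tempfile.TemporaryDirectory() as tmp:
        scratch = Path(tmp) / "doc.typ"
        # The file should exist when initializing the compiler.
        scratch.write_text("", encoding="utf-8")
        compiler = make_compiler(str(scratch))
        for doc in documents:
            # Overwrite the scratch file, then recompile with the same compiler.
            scratch.write_text(doc, encoding="utf-8")
            try:
                results.append(compiler.compile(format="svg"))
            except Exception as exc:  # exact type depends on typst-py
                results.append(exc)
    return results
```

Because every document is compiled on its own, an error points at exactly one document, and the documents cannot interfere with each other.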
