Testing a parametrisable Typst document

Dear Community,

I maintain a rather large parametrisable Typst document that originally came from a LaTeX base. That was a pain in the ass, as LaTeX does not seem to be meant for something like this.
We moved our workflow over to Typst this year, and using it is great fun now.

Since a compile run in Typst does not take a minute, as it did in LaTeX, I had the idea to write a test script in Python.

Main Specs:

  • Builds a grid of parameter combinations using itertools
  • Filters out forbidden combinations
  • Runs all parameter sets (partly in parallel) using concurrent.futures
  • Uses the Typst CLI
  • Injects parameters using --input and a special parameter file
  • The compiled result (PDF) is not checked for content; it is deleted after the run
  • The compile result (CLI return code) is checked:
    • if it equals 0 (success): discard the result and count it as a success
    • if it is non-zero (error): record the return code, the stdout and stderr text, and the injected parameter set for later analysis and bug fixing
  • Runs a few statistics over the tests
  • After the test run has finished or been aborted, stores the statistics and all recorded failed runs, with their parameter sets, in a log file (JSON format)
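
For readers who want to try the same approach, the list above can be sketched roughly as follows. This is a minimal sketch and not the original script: the parameter names in PARAMS, the rule in is_allowed, and the file name main.typ are made up for illustration. It assumes the typst binary is on the PATH and that the document reads its parameters from sys.inputs.

```python
import itertools
import os
import subprocess
import tempfile
from concurrent.futures import ThreadPoolExecutor

# Hypothetical parameter space; the real document has ~30 parameters.
PARAMS = {
    "lang": ["en", "de"],
    "draft": ["true", "false"],
    "layout": ["a4", "letter"],
}


def is_allowed(combo):
    # Hypothetical rule: draft mode is not supported for the letter layout.
    return not (combo["draft"] == "true" and combo["layout"] == "letter")


def build_grid(params, allowed=is_allowed):
    """Yield every allowed combination as a dict."""
    keys = list(params)
    for values in itertools.product(*(params[k] for k in keys)):
        combo = dict(zip(keys, values))
        if allowed(combo):
            yield combo


def compile_once(combo, source="main.typ", run=subprocess.run):
    """Compile one combination; return None on success, a failure record otherwise."""
    # The PDF goes into a temporary directory, so it is erased after the run.
    with tempfile.TemporaryDirectory() as tmp:
        cmd = ["typst", "compile", source, os.path.join(tmp, "out.pdf")]
        for key, value in combo.items():
            cmd += ["--input", f"{key}={value}"]
        proc = run(cmd, capture_output=True, text=True)
    if proc.returncode == 0:
        return None
    return {
        "params": combo,
        "returncode": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }


def run_all(combos, compile_fn=compile_once, workers=4):
    """Run all combinations in a thread pool and collect simple statistics."""
    combos = list(combos)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(compile_fn, combos))
    failures = [r for r in results if r is not None]
    return {"total": len(combos), "failed": len(failures), "failures": failures}
```

The report returned by run_all(build_grid(PARAMS)) contains only plain dicts, lists, strings and numbers, so it can be written to the log file with json.dump. Threads are enough here because the actual work happens in the typst subprocesses.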

I run this test before every release of the template document, and I have already found a lot of failures in certain sets, or in configurations of the document that I was not actively working on.

However, you might quickly realise that the number of sets grows rapidly with the number of parameters (there are ~30 of them). In the beginning, I had millions of sets to test, and this would have taken months. I have now split the tests into two separate runs of ~30k to 50k compiles each in order to reduce the compile count and therefore the time. Now it takes a couple of hours (it can run overnight), at the cost that not all possible combinations are tested, but I can live with that.

I wonder if there is a better way to do this.
Does anyone have a good idea, or is this already as good as it gets?

I wondered whether using the watch infrastructure of the CLI compiler could offer any speed advantage.
That way, I could not inject my parameters using --input, right? I would need to edit my .typ file programmatically, wait for the compile run and its error code and text (is there any with watch?), and then run the next set.

That sounds fun.

If you have such a big configuration space, then adopt more software development practices to handle it: unit tests, integration tests. Try to decouple different aspects, so that it makes sense to test things independently of each other. Testing every possible combination is thorough, but combinatorics dictates that it quickly becomes impossible.
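
To make the combinatorics concrete, here is a small sketch of what testing decoupled aspects independently could look like. The groups, parameter names and defaults are invented for illustration. Each group is varied exhaustively while every other parameter stays at its default, so the number of cases is the sum of the group sizes rather than their product.

```python
import itertools
from math import prod

# Hypothetical defaults and parameter groups, for illustration only.
DEFAULTS = {"lang": "en", "font": "serif", "draft": "false", "layout": "a4"}
GROUPS = {
    "text": {"lang": ["en", "de", "fr"], "font": ["serif", "sans"]},
    "page": {"draft": ["true", "false"], "layout": ["a4", "letter"]},
}


def grouped_cases(defaults, groups):
    """Vary one group at a time; all other parameters keep their defaults."""
    for name, params in groups.items():
        keys = list(params)
        for values in itertools.product(*(params[k] for k in keys)):
            yield name, {**defaults, **dict(zip(keys, values))}


def full_grid_size(groups):
    """Number of cases if every parameter were combined with every other."""
    return prod(len(v) for params in groups.values() for v in params.values())
```

With these made-up groups, grouped_cases yields 6 + 4 = 10 cases, whereas the full grid would have 3 * 2 * 2 * 2 = 24. The gap widens quickly with ~30 parameters. The price is that interactions between groups are no longer covered, so this only makes sense where the aspects really are independent.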

Here’s an idea, don’t know if it will work:

Using typst watch should be an improvement since you get to use incremental compilation this way.

  • typst watches input files, so you don’t necessarily need to edit the .typ file. Pass a sys.inputs value that tells the document to read its parameters from a JSON or TOML file; when this file changes, typst watch will also recompile the document
  • Take care that typst won’t try to read a half-written configuration file. Writing a new temporary file, fdatasync-ing it, and then moving it over the input file is a common approach to a kind of atomic file update.
  • I’d restart typst watch regularly if it runs for too long, to save memory.
  • I don’t know how you get the success status from typst watch. If this is not possible, a custom-built program that uses typst libraries might be able to do it.
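
The atomic update from the second bullet could look roughly like this in Python. It is a sketch under a few assumptions: the function name and the JSON format are my choice, os.fsync is used instead of fdatasync because os.fdatasync is not available on every platform, and the temporary file lives in the same directory as the target so that the final rename stays on one filesystem.

```python
import json
import os
import tempfile


def write_params_atomically(params, target):
    """Write params as JSON so that a reader never sees a half-written file."""
    directory = os.path.dirname(os.path.abspath(target))
    # Create the temporary file next to the target (same filesystem).
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(params, handle)
            handle.flush()
            os.fsync(handle.fileno())  # push the data to disk before renaming
        # os.replace overwrites the target in a single step.
        os.replace(tmp_path, target)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The test script would call write_params_atomically(combo, "params.json") for each parameter set, where params.json is a hypothetical file name that the document reads.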

If typst watch cannot compile a document then it does not generate an output file. So if the output is deleted before changing input parameters, the output file itself can be used as a “success” indicator.

This assumes that a bad set of input parameters causes the compilation to fail.
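
A rough sketch of that idea, with made-up function names: delete the old output, change the parameters, then poll for the output file until a timeout expires. It assumes that a non-empty file means a successful compile, and it cannot tell a slow compile from a failed one.

```python
import os
import time


def remove_stale_output(pdf_path):
    """Delete the previous output so that its reappearance signals success."""
    try:
        os.remove(pdf_path)
    except FileNotFoundError:
        pass


def wait_for_output(pdf_path, timeout=10.0, poll=0.05):
    """Return True once a non-empty output file exists, False after the timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(pdf_path) and os.path.getsize(pdf_path) > 0:
            return True
        time.sleep(poll)
    return False
```

The sequence per parameter set would be remove_stale_output, then the parameter update, then wait_for_output. The timeout value is a guess and would need tuning against the real compile times.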

Detecting a failure might be more difficult. A simple timeout between a change and the output appearing could be used, but it could waste time or report false failures when a slow compile exceeds the timeout. If the stdout of the call to typst watch can be monitored, that would likely give the best results.
For instance, here’s the stdout for a file designed not to compile:

watching temp.typ
writing to temp.pdf

[15:36:20] compiled with errors

error: assertion failed: These aren't equal
  ┌─ \\?\C:\Users\xxx\temp.typ:1:1
  │
1 │ #assert(1 == 2, message: "These aren't equal")
  │  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
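
If the output of typst watch is captured line by line (for example from a subprocess pipe), the status line in the sample above can be picked out with a small classifier. This is only a sketch: it relies on the "[hh:mm:ss] compiled with errors" format shown above, and it treats any other "[hh:mm:ss] compiled ..." line as a non-error. That second part is an assumption and should be checked against the real output, as should whether the text arrives on stdout or stderr.

```python
import re

# Matches status lines such as "[15:36:20] compiled with errors".
STATUS_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\] compiled\b(.*)$")


def classify(line):
    """Return "error", "ok", or None if the line is not a status line."""
    match = STATUS_RE.match(line.strip())
    if match is None:
        return None
    # Assumption: only the "with errors" variant marks a failed compile.
    return "error" if "with errors" in match.group(2) else "ok"
```

The error text that follows a failing status line (the "error: ..." block) could then be collected until the next status line and stored with the parameter set.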