Why are table rows not represented as arrays?

I have a very basic question today about the table element: why are we expected to give a single list of cells to represent a two-dimensional table, instead of passing a list of arrays, each representing one row?

For example, in the table guide, instead of

#table(
  columns: 2,
  [*Amount*], [*Ingredient*],
  [360g], [Baking flour],
  [250g], [Butter (room temp.)],
  [150g], [Brown sugar],
  [100g], [Cane sugar],
  [100g], [70% cocoa chocolate],
  [100g], [35-40% cocoa chocolate],
  [2], [Eggs],
  [Pinch], [Salt],
  [Drizzle], [Vanilla extract],
)

I would naturally have expected something like:

#table(
  ([*Amount*], [*Ingredient*]),
  ([360g], [Baking flour]),
  ([250g], [Butter (room temp.)]),
  ([150g], [Brown sugar]),
  ([100g], [Cane sugar]),
  ([100g], [70% cocoa chocolate]),
  ([100g], [35-40% cocoa chocolate]),
  ([2], [Eggs]),
  ([Pinch], [Salt]),
  ([Drizzle], [Vanilla extract]),
)

(Or maybe #row(foo, bar) instead of just (foo, bar).)

This is closer to the way we think about tables, it looks (from an uninformed distance) easier to script with, and it would also be noticeably more robust to the deletion or addition of a cell somewhere: instead of having the whole table reflow in a weird way if you add or delete a single cell, you could get a partial row or an error message.
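To make the reflow problem concrete, here is a small sketch of the recipe table with one cell deleted. The `cells` variable is just an illustrative name, and I use plain strings for the cells to keep it short.

```typ
// The same recipe table with one cell deleted (the amount for butter).
#let cells = (
  "Amount", "Ingredient",
  "360g", "Baking flour",
  "Butter (room temp.)", // "250g" was removed from the start of this row
  "150g", "Brown sugar",
  "100g", "Cane sugar",
)

// With a flat cell list, every cell after the deletion shifts one slot left,
// so the third row now pairs the butter label with the brown sugar amount.
#table(columns: 2, ..cells)
```

With explicit rows, only the butter row would be affected.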

(In general I would expect the table to be structured as a sequence of horizontal elements, which are not necessarily only rows, for example there could be horizontal lines or formatting data in the middle.)

The current unstructured choice looks odd at first glance, but it is not explained/justified/discussed at all in the table documentation. Is there a deep reason why we want table in this unstructured format? Is this just an error of youth that is too hard to change now? Do people recommend using another table library in practice?

For reference, the design of the table interface is also discussed here: A simpler way to input tables and grids content · Issue #4071 · typst/typst · GitHub

There is some discussion around adding table.row. I think it could be helpful in the long run to have a good human-friendly interface, e.g. table.header, table.row, table.column, table.footer, etc., but it does necessitate a lot of work.

I don’t think it’s necessarily a mistake. A lot of Typst design decisions are intentionally made in order to enable the most freedom. Typst is still in its development phase, and it makes sense to have the most generic interface possible. If the interface were limited to a row-based input, it would be much harder to code complicated tables.

That’s exactly what was avoided, making an assumption! Instead, table provides a pretty good customization potential.

I think Typst as it is was never meant to be used by end users. The current state of Typst is perfect for package developers, but not for people who just want to write papers for example.

Most of tablex’s features are ported back into Typst at this point, so I don’t think it makes sense to use another package, except for specific use cases like importing from an Excel file.

To be completely fair, I don’t think I’ve ever written a good-looking LaTeX table without copying examples from other papers and code from various websites. If I had spent that time going through the docs, I would likely have spent hours just trying to find something. The Typst docs are pretty concise and describe pretty much everything available without having to go through yet another random CTAN package for that one feature you want, but I digress.

I agree with @quachpas. You can do a lot of stuff with the current implementation, and I think that is what makes it so great to use. I had to get used to its top-left to bottom-right filling concept, but it’s a good, flexible structure.

One thing I’d like to see is more of a formatter-based approach, where the formatter takes care of laying out the table correctly in the source code (for example, for a 3-column table, inserting a linebreak after every third cell). typstyle does this, with some limitations.

#table(
  columns: 3,
  [asdf],[hello],[pepperoni],[asdf],[hello],[pepperoni],[asdf],[hello],[pepperoni],
)

// more readable structure:
#table(
  columns: 3,
  [asdf],[hello],[pepperoni],
  [asdf],[hello],[pepperoni],
  [asdf],[hello],[pepperoni],
)

Improvements could be made to how colspan and rowspan are handled. I like the LaTeX approach, where you have to insert empty cells (which get overridden by the colspan and rowspan functions): it allows for much more readable source code!
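For comparison, this is roughly how a column-spanning cell looks in Typst today (assuming a version with table.cell, i.e. 0.11 or later). No filler cells are written, so the first source line has fewer entries than the table has columns:

```typ
#table(
  columns: 3,
  // This cell covers columns 1 and 2, so only one more cell fits in the row.
  table.cell(colspan: 2)[spans two columns], [c],
  [a], [b], [c],
)
```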

In my opinion, forcing users to input rows instead of cells into tables would be really annoying, especially when considering that you might want to change the number of columns later. In your proposed “row-wise” insertion system, you’d need to go through every row and add/remove entries. To be fair, maybe you’d usually need to do so anyway with the current cell-wise insertion, but that’s not always true.

There are also already ways of conditionally styling cells based on the row/column, so that further makes row-wise insertion redundant.
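As a minimal sketch of what I mean: the fill and align parameters of table accept a function of the cell's column x and row y. The `stripes` helper name and the specific colors are just my choices for illustration.

```typ
// Hypothetical helper: shade the header row and stripe the odd rows.
#let stripes(x, y) = {
  if y == 0 {
    gray.lighten(40%)
  } else if calc.odd(y) {
    gray.lighten(85%)
  }
  // Even body rows fall through and get no fill (none).
}

#table(
  columns: 2,
  fill: stripes,
  // Right-align the first column, left-align the rest.
  align: (x, y) => if x == 0 { right } else { left },
  [*Amount*], [*Ingredient*],
  [360g], [Baking flour],
  [250g], [Butter (room temp.)],
  [150g], [Brown sugar],
)
```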

Lastly, I also just think that if you want to think of insertion as row-wise, you can just style your code to match that way of thinking. I think your first example already “feels” like row-wise insertion (even though it’s not), because each row of code corresponds to a row in the table.

If you really wanted to have row-wise insertion, why not do so through a custom function? Something like

#let mytable(..args) = {
  let p = args.pos()   // table rows
  let n = args.named() // styling etc
  let cols = n.at("columns", default: 1)
  if type(cols) == array {
    cols = cols.len()
  }
  assert(
    p.all(i => i.len() == cols),
    message: "all rows must match column size"
  )
  table(
    ..n,
    ..p.flatten()
  )
}

#mytable(
  columns: 2,
  ([*Amount*], [*Ingredient*]),
  ([360g], [Baking flour]),
  ([250g], [Butter (room temp.)]),
  ([150g], [Brown sugar]),
  ([100g], [Cane sugar]),
  ([100g], [70% cocoa chocolate]),
  ([100g], [35-40% cocoa chocolate]),
  ([2], [Eggs]),
  ([Pinch], [Salt]),
  ([Drizzle], [Vanilla extract]),
)

does just that.

ETA: This code does not produce the expected behavior if some of your cells span multiple rows or columns, because the length check counts every cell as exactly one column.

I don’t understand the replies saying that the current approach is more flexible. It’s not! You can collapse the rows of the table to get a flat list of cells as we have today, so everything you can do with the current approach could be done with a more structured input – just erase the structure.

Having rows appear as explicit structure does not prevent customization, such as colspan to have certain cells span several columns. (It would make it harder to try to fit a single cell across two rows, but I doubt this is supported today.)

But it does make the document more robust (and probably easier to style) by forcing users to be explicit about where the end of each row is in the list. This means that local changes and intermediate states (for example: we just increased the number of columns, but haven’t filled all the missing columns yet) would behave in a more predictable way, closer to what people would expect, and be easier to debug.

Sure, but my naive question is why this is not the default – whether it’s an intentional choice, an old mistake, something.

In programming languages, it has been standard practice for decades now to represent 2-dimensional arrays as arrays of arrays: we write a[i][j] to access the cell at position (i, j), etc. Older programming languages would represent them as flat arrays, to be accessed as a[i*width + j], and people realized that this is less ergonomic because it does not match the data structure, introduces various opportunities for annoying mistakes, etc. Typst has an array type, and generally relies on its richer data structures for domain modeling, so it would naively seem very natural to organize table cells into an array of arrays. I wonder why this choice was not made (and think that it would be worth documenting in the table documentation).
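Here is a small sketch of the two representations in Typst itself, assuming a version that has array.chunks (0.11 or later, as far as I know). The variable names are only for illustration.

```typ
#let width = 2
#let flat = ("360g", "Baking flour", "250g", "Butter", "150g", "Brown sugar")

// Flat -> nested: group every `width` cells into one row.
#let rows = flat.chunks(width)

// Nested indexing versus manual flat indexing for row 1, column 0.
#let i = 1
#let j = 0
#assert.eq(rows.at(i).at(j), flat.at(i * width + j))

// Nested -> flat: erase the row structure again.
#assert.eq(rows.flatten(), flat)
```

Both directions are one call, but only the nested form records where each row ends.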

For me it is! Additionally, I’m not forced to use arrays representing rows, but can simply work with the cells themselves. The explicit structure can still be created in the source, which is what the guide does: it separates the rows with linebreaks.

I don’t know why the devs did it the way they did. It could simply be to keep it consistent with the Typst style…

But I think it’s a very good boilerplate, which allows expanding the functionality with packages. In theory you could probably also build a function which supports LaTeX tables.

If this structure really bothers you (for now, as Typst is still in development), the approach from @aarnent is pretty nice!

It is supported with the rowspan option. There’s an example here. So maybe that’s one reason to pass all cells as a simple list… I’m not sure rowspan would be more difficult with cells passed as an array of rows, but it would at least be a bit weirder.

Maybe a bigger reason is simply verbosity. Typst aims for a lightweight markup language, and tables are already a bit verbose (compare e.g. with simple Markdown tables.) Having to type

#table(
  ([*Amount*], [*Ingredient*]),
  ([360g], [Baking flour]),
  ([250g], [Butter (room temp.)]),
  ([150g], [Brown sugar]),
  ([100g], [Cane sugar]),
  ([100g], [70% cocoa chocolate]),
  ([100g], [35-40% cocoa chocolate]),
  ([2], [Eggs]),
  ([Pinch], [Salt]),
  ([Drizzle], [Vanilla extract]),
)

instead of

#table(
  columns: 2,
  [*Amount*], [*Ingredient*],
  [360g], [Baking flour],
  [250g], [Butter (room temp.)],
  [150g], [Brown sugar],
  [100g], [Cane sugar],
  [100g], [70% cocoa chocolate],
  [100g], [35-40% cocoa chocolate],
  [2], [Eggs],
  [Pinch], [Salt],
  [Drizzle], [Vanilla extract],
)

is significantly worse.

Another related point is that a flat list of cells plays well with the f[arg1][...][argn] syntax. You can write

#table(columns: 2)[Cell 1][Cell 2][Cell 3][Cell 4]

or the multiline version:

#table(columns: 2)[
  Cell 1
][
  Cell 2
][
  Cell 3
][
  Cell 4
]

which might make sense when cells contain a lot of markup…

Maybe I’m misunderstanding but having a cell take up multiple rows is possible:

#table(
  columns: 2,
  table.cell(rowspan: 2, [Foo]), [Bar],
  [Baz]
)

And to throw in my two cents, I have been in the situation where I added a column or two to an existing table and then had to go back and fix my table. It’s tedious and I know it will happen again in the future. But even so, I don’t think the way it’s currently done is so bad. I would also find requiring explicit arrays (..., ...) annoying because of all the parentheses to type.
Both could work technically, and both create annoyances.

What I had in mind regarding “across two rows” is a cell that starts at the end of one row, and continues at the beginning of the next row – different from rowspan where the cell is in one contiguous column. But in any case, arbitrary combinations could be supported as long as the layout logic is willing to do the work.

I went back to this reference (thanks!) and I found that the original poster had articulated rather clearly the main problem with the current approach: it is more fragile than having explicit markers for the end of line/row. They wrote:

No explicit row separator: Without a specific row separator, cells are placed based on the previously defined number of columns, making it difficult to accurately differentiate between rows. This issue becomes particularly problematic when cells are removed or replaced, often resulting in a disorganized table structure during editing. As a result, quick compilation loses its effectiveness because cells shift to different positions, causing confusion during the editing process.

It’s common, but I wouldn’t say standard. You can equally well find fully 2D arrays, indexed as a[i, j], which is not just syntax sugar, that is, a[i] would not be a row. (In fact, arrays of arrays are a bit problematic, since each row isn’t inherently equal-length, unlike an actual 2D array.)

Drawing inspiration from those languages, one could imagine something like this syntax (note semicolons):

#table(
  [*Amount*], [*Ingredient*];
  [360g], [Baking flour];
  [250g], [Butter (room temp.)];
  [150g], [Brown sugar];
  [100g], [Cane sugar];
  [100g], [70% cocoa chocolate];
  [100g], [35-40% cocoa chocolate];
  [2], [Eggs];
  [Pinch], [Salt];
  [Drizzle], [Vanilla extract];
)

A small meta point: my original question is of the form “API design choice X is weird, do you know if this is an intentional design decision or just a historic choice, would it be worth mentioning this in the documentation?”, and many of the answers are of the form “actually I like X just fine” or “I don’t have the answer to your question but I would guess that the justification for X is Y”, none of which are answers to my question.

There is nothing deeply wrong with these answers, and I appreciate that they come from people with Typst expertise who can share nice insights, but this is also not a very productive way to ask about Typst design decisions. I guess that maybe the forum is not the best place for this question and I should have asked on GitHub directly?

This is actually fairly nice, and I would welcome a small library that offers this facility. But I think that ideally I would want the following features:

  • columns does not use 1 as a default value, but the width of the widest row
  • if columns is not passed, the inferred default is used
  • if a row has missing columns, instead of an error there is a warning, and that row is padded with empty cells
  • colspan is supported (which probably requires computing the length of rows manually)

(Note: this could be a #row(...) function instead of a base array, either way would be fine.)
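To make the wish list concrete, here is a rough sketch of what I have in mind. Everything here is hypothetical: `rowtable`, `cell-width` and `row-width` are names I made up, not part of Typst. As far as I know there is no way to emit a warning from user code, so short rows are padded silently, and rowspan is not handled at all.

```typ
// Hypothetical helper: how many columns one cell occupies.
#let cell-width(c) = {
  // A `table.cell` may carry a `colspan`; anything else occupies one column.
  if type(c) == content and c.func() == table.cell {
    c.at("colspan", default: 1)
  } else {
    1
  }
}

// Hypothetical helper: total width of a row, counting colspans.
#let row-width(row) = row.map(cell-width).sum(default: 0)

#let rowtable(..args) = {
  let rows = args.pos()    // one array of cells per row
  let named = args.named() // styling etc., passed through to `table`
  // Infer the column count from the widest row unless `columns` is given.
  let widest = calc.max(1, ..rows.map(row-width))
  let cols = named.at("columns", default: widest)
  let count = if type(cols) == int {
    cols
  } else if type(cols) == array {
    cols.len()
  } else {
    1
  }
  named.insert("columns", cols)
  // Pad short rows with empty cells instead of failing.
  let padded = rows.map(row => {
    row + ([],) * calc.max(0, count - row-width(row))
  })
  table(..named, ..padded.flatten())
}

#rowtable(
  ([*Amount*], [*Ingredient*], [*Notes*]),
  ([360g], [Baking flour]), // short row: padded with one empty cell
  (table.cell(colspan: 2)[Butter, 250g], [room temp.]),
)
```

Rows that are too long are not detected here; they would still spill into the next row.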

Is it something you would possibly be interested in contributing? (I might work on this myself otherwise.)

Regarding the meta point, I understand what you mean, but also consider that this forum (especially the Questions category) is more community-driven: most answers will not come from the Typst developer team, but from other end users. GitHub is a better place for discussing things like syntax/feature requests, and you are much more likely to have your post seen by a developer.

re: package authorship, please PM me if this is something you are interested in collaborating on

This design choice goes all the way back to 2021, specifically to the commit that introduced grid layout. The same commit introduced the auto keyword. I can’t remember at all what the design considerations were at the time.
