Can we (re)define symbols as functions?

I am experimenting with Typst for typesetting a kind of CJK-Braille. In this dialect there are generally no spaces between characters or words. A sample of the “text” is below:

⠾⠃⠚⠔⠏⠲⠋⠓⠓⠛⠄⠤⠚⠔⠏⠲⠎⠑⠁⠉⠻⠐⠤⠚⠒⠂⠋⠓⠓⠛⠄⠺⠕⠄⠍⠫⠄⠓⠻⠄⠥⠂⠎⠒⠎⠸⠈⠾⠡⠈⠢⠀⠼⠁⠊⠉⠁⠀⠝⠲⠄⠤⠾⠃⠭⠚⠁⠇⠽⠠⠞⠽⠎⠧⠈⠅⠃⠘⠺⠃⠂⠅⠃⠓⠰⠎⠧⠈⠇⠕⠄⠉⠧⠂⠭⠩⠉⠊⠁⠿⠀⠉⠣⠂⠓⠰⠎⠧⠈⠇⠕⠄⠞⠗⠞⠎⠭⠟⠓⠃⠂⠤⠾⠃⠎⠲⠓⠡⠂⠎⠑⠁⠇⠽⠠⠀⠣⠚⠑⠂⠓⠰⠄⠜⠘⠣⠋⠦⠎⠃⠎⠴⠎⠫⠄⠜⠀⠞⠛⠁⠚⠔⠓⠩⠂⠇⠾⠄⠋⠓⠓⠛⠄⠾⠩⠄⠭⠣⠄⠞⠗⠚⠡

Every 2-3 Braille symbols form a CJK character, and several CJK characters form a meaningful word. As an example, ⠋⠓ converts to 飛 and ⠓⠛⠄ is 行. Together 飛行 means to fly.

The line-breaking algorithm wreaks havoc on contiguous Braille, cutting across characters and words at arbitrary places. The odds of ⠋⠓⠓⠛⠄ being cut somewhere inside are good.

I know that if I were to wrap these into #box[⠋⠓⠓⠛⠄], or perhaps #box[#box[⠋⠓]#box[⠓⠛⠄]], line-breaking would not touch the innards. However, the number of #box calls makes the source quite difficult to read (not that Braille was easy to read by sight in the first place).

Is there a way to define symbols as functions? That is, say, [xyz] serving the same purpose as #box[xyz]? Then a sentence would look like [⠾⠃][⠚⠔⠏⠲][⠋⠓⠓⠛⠄][⠤⠚⠔⠏][⠲⠎⠑⠁][⠉⠻⠐][⠤⠚⠒⠂], which is somewhat readable. I am open to any bracket pair other than [] () {}, which already carry meaning in Typst (I can map it to some keystroke combination).

Have you tried using a Zero Width Space at the word edges? That is a space that isn’t displayed as one, but it tells Typst where the word boundaries are so that it can break lines there. It can be placed like a#(sym.zws)b, or you can define a variable for simpler use, like #let wb = sym.zws, and then use it as a#(wb)b.

Of course, it is probably possible to do what you are asking for, but that would require a lot of hacking, would be much more error-prone, and would be much less convenient to use.

Thank you for the idea. I played around with #wb() but it’s as verbose as #box[].

I had an awful idea that turned out kinda neat :laughing:

#let braille(..content) = {
  // Wrap each positional argument in its own box so that
  // no line break can fall inside it.
  for piece in content.pos() {
    box(piece)
  }
}

This can then be called with words/characters “visually boxed”, albeit with , as an additional separator.

#braille([⠾⠃], [⠚⠔⠏⠲], [⠋⠓⠓⠛⠄], [⠤⠚⠔⠏], [⠲⠎⠑⠁], [⠉⠻⠐], [⠤⠚⠒⠂])

It looks reasonable enough, and actually looked really good once the function was renamed to the truly awful __(). Something about the dots having a completely different geometry from everything else makes it reasonably readable.

#__([⠾⠃], [⠚⠔⠏⠲], [⠋⠓⠓⠛⠄], [⠤⠚⠔⠏], [⠲⠎⠑⠁], [⠉⠻⠐], [⠤⠚⠒⠂])

…actually, there is an even cleaner representation. We can just split on a separator before wrapping each into a box.

#__[⠾⠃|⠚⠔⠏⠲|⠋⠓⠓⠛⠄|⠤⠚⠔⠏|⠲⠎⠑⠁|⠉⠻⠐|⠤⠚⠒⠂]
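
The definition of the splitting version is not shown above. A minimal sketch of how it might look, assuming the body is a single run of plain text (so that it exposes a text field) and that | never occurs in the Braille itself:

```typst
// Hypothetical definition, not the poster's actual code.
#let __(body) = {
  // Assumption: `body` is plain text only; markup inside it would
  // turn it into a sequence without a `text` field.
  assert(body.has("text"), message: "expected a plain-text body")
  // Split on the separator, box each piece, and join the boxes back together.
  body.text.split("|").map(box).join()
}

#__[⠾⠃|⠚⠔⠏⠲|⠋⠓⠓⠛⠄|⠤⠚⠔⠏|⠲⠎⠑⠁|⠉⠻⠐|⠤⠚⠒⠂]
```

Because the boxes are placed directly next to each other, line breaks may still need an explicit opportunity between them (for instance a zero width space as the join separator); whether that is needed is something to check against the actual output.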

Your solution using | as a separator is probably the neatest for entering the data.

This, however, stood out to me: the remark about needing , as an additional separator.

You can pass multiple pieces of content to a function with the following syntax, which does not need ,.

// This
#braille[⠾⠃][⠚⠔⠏⠲][⠋⠓⠓⠛⠄][⠤⠚⠔⠏][⠲⠎⠑⠁][⠉⠻⠐][⠤⠚⠒⠂]
// is equivalent to
#braille([⠾⠃], [⠚⠔⠏⠲], [⠋⠓⠓⠛⠄], [⠤⠚⠔⠏], [⠲⠎⠑⠁], [⠉⠻⠐], [⠤⠚⠒⠂])