Best Practices Gathering: Global Settings for Multilingual Fonts

Typst Docs: You should LOCALLY change the text language just for those parts that contain passages in a language different from the main language.

  • Typst is great for writing documents in English or other Latin-script languages. But when it comes to non-Latin scripts, e.g. Greek, Arabic and CJK, things become complex.
  • Compared to local settings, a global setting is a bit more convenient.
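
A minimal sketch of the two approaches, assuming the fonts named later in this post (Libertinus Serif, Noto Serif CJK SC) are installed. The sample sentences are made-up placeholders:

```typst
// Global: set once near the top; applies to everything that follows.
#set text(lang: "en", font: ("Libertinus Serif", "Noto Serif CJK SC"))

This paragraph uses the main language.

// Local: override the language only for the foreign passage.
#text(lang: "zh", region: "CN")[你好,世界。]

Back to the main language.
```

With the local form, every foreign passage needs its own wrapper. The global font list leaves mixed scripts to glyph fallback, but language-dependent behaviour such as hyphenation and smart quotes then follows the main language only.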

All World Languages in One Visualization

Languages that use the Latin alphabet:

  • Germanic and Romance languages: English, German, Spanish, French, Italian, Portuguese, Romanian.
  • Slavic languages: Polish, Czech, Slovak, Croatian, Slovene.
  • Asian-Pacific languages: Indonesian, Malay, Tagalog, Vietnamese, Hawaiian, Samoan, Fijian.
  • American-Australian languages: Quechua, Nahuatl, Guarani, Warlpiri, Arrernte.
  • African and other languages: Swahili, Hausa, Yoruba, Igbo, Turkish and Somali.

English and Latin

Greece

Look for a best practice

Arab

Look for a best practice

China

Because of its long history, many varieties of Chinese are spoken and written around the world. But the most widely used written forms nowadays are Simplified and Traditional.

Simplified (Typst score: 68.3)

Chinese is written left to right with monospaced (full-width) characters, so punctuation marks are the only thing you need to care about. I’m NOT a Chinese legislator or an ISO expert, but here are some formal documents I found:

  1. <ISO 7098:2015 Chinese Romanization>
    • International Standard, reviewed in 2021
    • characters
    • Pinyin
    • punctuation
      • marks similar to those in Latin are transcribed as their Latin counterparts
      • specific marks are transcribed as follows: 0x3002 → 0x2e, 0x3001 → 0x2c, 0x2022 → 0x20, 0x2026+0x2026 → 0x2026 (。 → . ; 、 → , ; · → space ; …… → …)
  2. <PRC National General Language Characters Law>
  3. <GB/T 15834-2011 General Rules for Punctuation>
    • Recommended (non-mandatory) national standard, 2012
    • punctuation: 29 marks
      • 0.5em, single, 3: -·/
      • 1em, single, 21: 。,、:;?!’‘“”([❲【《<>》】❳])—~underPoint underLine「『』」
      • 1em, double, 4(1): ??!!?!!?
      • 2em, triple, 2(1): ???!!!
      • 2em, single, 2: ——……
      • 4em, double, 1: …………
      • numbering order: 一、(一)1.(1)circleOne or a.
  4. <CY/T 154-2017 Rules for editing Chinese publications interpolated with English>
    • Recommended (non-mandatory) press and publication industry standard, 2018
    • punctuation
      • Chinese sentence: ends with Chinese marks, e.g. 。?!
      • English sub-sentence: put inside Chinese quotation marks
      • marks inside an English sentence: use English marks
      • /: add spaces if needed
      • font size: Chinese small 5 with English 9 pt, Chinese 5 with English 10.5 pt
  5. Translation of <ISO 7098:1982 Draft>
    • Ministry web page, 2009, perhaps outdated
    • punctuation
      • similar marks are transcribed as their Roman counterparts
      • dot below (under circle point) → italic
      • wavy underline → italic

As you can see, there is no mandatory law on Simplified Chinese (SC) punctuation marks. And the GB and CY docs each contain some typesetting errors, i.e. they each violated themself.

Merriam-Webster joke: Is themself a word?

What’s the status of Typst SC typesetting?

  • ×0.99: arbitrary font and glyph fallback if no font is set
  • -2: no dot below (under circle point) and no wavy underline
  • -5: wrong widths for these marks: -·/❲?!❳
  • -2: smartquote width or height is wrong
  • $"score" = 100*0.99*(29-9)/29 = 68.3$
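
The arithmetic behind the score can be reproduced in Typst itself. This is only a sketch: the variable names are hypothetical, and the numbers are simply those from the list above (29 marks in GB/T 15834-2011, penalties of 2, 5 and 2, and the 0.99 fallback factor).

```typst
// Hypothetical names; values copied from the penalty list.
#let total-marks = 29        // marks defined by GB/T 15834-2011
#let penalties = (2, 5, 2)   // missing emphasis marks, wrong widths, smartquote
#let fallback-factor = 0.99  // arbitrary fallback when no font is set

// Share of marks handled correctly, scaled to 100 and by the fallback factor.
#let score = 100 * fallback-factor * (total-marks - penalties.sum()) / total-marks

Score: #calc.round(score, digits: 1) // → 68.3
```

Keeping the penalties in an array makes it easy to re-score after a Typst release fixes one of the items.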

Best Practice:

#set text(10pt)

Libertinus _Serif_ *En:* a,b.c?d!e'f'g"h"i\`j\`k\~l\@m\#n\$o%p^q&J\*K-L\_M=N+O\\P|Q(R[S{T\<U\>V}W]X)Y/Z;

无豆腐_宋体_*简中:*〇-一·二/三。两,仨、四;五?六!七“八‘九’十”廿(拾[玖❲捌【柒《陆「伍『肆』叁」贰》壹】零❳字]词)句—段~节??章!!篇!?书?!册???典!!!籍——古……诗…………文“出游从容,‘鱼之乐’也。”惠子曰:“How to tell? "After all, you are not 'a fish'".”。```pl print "花径不曾缘客扫"  #春风不度 ``` $e^(i pi) + 1 = 0 "玉门关"$

#block(spacing:0.6em, line(length:100%))

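// Main text: Latin glyphs from Libertinus Serif, everything else from the CJK font.
#set text(lang:"zh", region:"CN", font:(
  (name:"Libertinus Serif", covers:"latin-in-cjk"),
  "Noto Serif CJK SC"))
// Smart quotes: take them from the Latin font.
#show smartquote: set text(font:"Libertinus Serif")
// Raw (code) text: same idea with monospaced fonts.
#show raw: set text(font:(
  (name:"DejaVu Sans Mono", covers:"latin-in-cjk"),
  "Noto Sans Mono CJK SC"))
// Math: CJK font as fallback for text inside equations.
#show math.equation: set text(font:(
  "New Computer Modern", "Noto Serif CJK SC"))
// Emphasis: skew the glyphs instead of relying on an italic face.
#show emph: it => box(skew(ax:-12deg, it))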

Libertinus _Serif_ *En:* a,b.c?d!e'f'g"h"i\`j\`k\~l\@m\#n\$o%p^q&J\*K-L\_M=N+O\\P|Q(R[S{T\<U\>V}W]X)Y/Z;

无豆腐_宋体_*简中:*〇-一·二/三。两,仨、四;五?六!七“八‘九’十”廿(拾[玖❲捌【柒《陆「伍『肆』叁」贰》壹】零❳字]词)句—段~节??章!!篇!?书?!册???典!!!籍——古……诗…………文“出游从容,‘鱼之乐’也。”惠子曰:“How to tell? "After all, you are not 'a fish'".”。```pl print "花径不曾缘客扫"  #春风不度 ``` $e^(i pi) + 1 = 0 "玉门关"$

Traditional

Look for a best practice

Japan

Look for a best practice

Korea

Look for a best practice

For Chinese, I’ve made a similar attempt here:

https://ydx-2147483647.github.io/typst-set-font/

Edit: See also https://gap.zhtyp.art


Thanks for the GitHub repo! Most of it matches what I have; we share the same concerns.

By the way, do you have any ideas on how

  • to find mandatory (enforced) formal documents about Chinese typesetting?
  • to render emph(text) with underPoint or underWave?