How can one make the Libertinus Serif Th ligature searchable in PDF output?

Alternate title: How can one disable a particular ligature in a multi-ligature feature?

There is a particular ligature, Th, that I need in reproducing a text. This can be enabled in Libertinus Serif with the following:

#show "Th": [T#sym.zwj;h]

(It can also be enabled by turning on discretionary ligatures, but that brings in other problems; read on.)

Unfortunately, the ligature Typst generates this way is not mapped to “Th”, so it is not searchable as the letter sequence “Th” in the PDF. It copies as “T<200d>h”, where “<200d>” is the Unicode zero-width joiner inserted by the show rule.

How can the Th ligature as generated by this mechanism be made searchable? Failing that, can the ligature be created by another mechanism, as in adding an OTF feature?

(In ConTeXt I use the following code to enable the ligature as a custom OpenType feature:

\startluacode
  fonts.handlers.otf.addfeature {
  name = "thlig",
  type = "ligature",
  data = {
    ['T_h'] = { "T", "h" },
  }
}
\stopluacode

Similar capabilities are available in other TeX regimes that support LuaTeX. The ability to define custom OTF features was rejected in Typst issue 2489.)

This was addressed under issue 479 when the Th ligature was supported in LinLibertine as a standard ligature, but Libertinus Serif since v6.7 places the Th ligature under dlig, which brings in other (unwanted) ligatures.

As a work-around, I have considered enabling dlig and using show rules to decompose the unwanted ck, cz, tt, and tz ligatures, but so far have not been able to make that work. My effort to date on that front:

#set page(width: 5cm)
#set text(
  font: "Libertinus Serif",
  hyphenate: true,
  lang: "en",
)

#let testline = [Th battery battery battery ch ck tt.]

#[
  == Baseline settings
  #testline

  _Th, ch, ck, and tt are found in PDF search and battery
  is hyphenated._
]

#[
  #show "Th": [T#sym.zwj;h]
  == Th forced
  #testline

  _Th ligature is not found, ch, ck, and tt are found and
  battery is hyphenated._
]

#[
  #set text(discretionary-ligatures: true)
  == Dlig enabled
  #testline

  _Th, ch, ck, and tt ligatures are found in PDF search
  and battery is hyphenated._
]

#[
  #set text(discretionary-ligatures: true)
  #show "tt": {"t"+sym.zwnj+"t"}
  #show "ch": [c#sym.zwnj;h]
  #show "ck": {"c"+sym.zwnj+"k"}
  == Dlig with disabled ch ck tt
  #testline

  _Th ligature is found in PDF search, but ch, ck, and tt
  are not found and battery is not hyphenated._
]

Perhaps there is another way to disable these.

From a quick test, I think this works:

#show "Th": set text(discretionary-ligatures: true)
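For context, here is a minimal sketch of how that show-set rule might sit in a full document. It assumes Libertinus Serif is available and is a version that puts Th under dlig; the sample sentence is made up for illustration. The idea is that discretionary ligatures are switched on only for text matching "Th", so no zero-width joiner is inserted and the other dlig pairs should keep their default shaping.

```typ
#set text(font: "Libertinus Serif", lang: "en", hyphenate: true)

// Turn on discretionary ligatures only for the matched "Th" text.
// No zero-width joiner is added, so the underlying characters stay "T" + "h".
#show "Th": set text(discretionary-ligatures: true)

// Hypothetical test line: only "Th" should ligate; ck, tt and tz are left alone.
Th battery battery battery ch ck tt tz.
```

Checking the result with a PDF search for “Th”, “ck” and “tt”, as in the test document above, should confirm whether this holds for a given font version.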

Perfect! Better than what I was thinking of, because it has fewer side effects as well. (It should only affect text in fonts with that particular ligature in the dlig set, whereas my first attempt spilled over into other fonts.)

Thank you.