I am trying to produce a list of the terms mentioned in a text. Each term is marked as #x[term], and the list should preserve the formatting of the term (regular, italic, bold, …). Terms may be single words or phrases. I have found three ways to do this, each with slightly different results. The Typst code below demonstrates the three mechanisms I have put together.
My questions:
- Is there a better way to dynamically build an array or dictionary than using state?
- Is there a better way to extract the string value than the content-to-string function used here?
- Are there more efficient ways to produce any of these variants?
#let collectedwords = state("unique_name", ())
// Recursively flatten content into a lowercase string, used as the sort key.
#let content-to-string(content) = {
  if content.has("text") {
    lower(content.text)
  } else if content.has("children") {
    content.children.map(content-to-string).join("")
  } else if content.has("body") {
    content-to-string(content.body)
  } else if content == [ ] {
    " "
  }
}
#let x(word) = {
  word
  collectedwords.update(
    wd => {
      wd.push((content-to-string(word), word))
      return wd // return updated state
    }
  )
}
#let showwords1() = context { // Array method
  let words = ()
  for (word) in collectedwords.get() {
    words.push(word)
  }
  return words.slice(2,)
    .sorted(key: k => k.at(0))
    .map(((a,b)) => (b))
    .join(", ", last: ", and ")
}
#let showwords2() = context { // Dedup array method
  let words = ()
  for (word) in collectedwords.get() {
    words.push(word)
  }
  return words.slice(2,)
    .dedup(key: ((a,b)) => a)
    .sorted(key: k => k.at(0))
    .map(((a,b)) => (b))
    .join(", ", last: ", and ")
}
#let showwords3() = context { // Dictionary method
  let words = ()
  for (word) in collectedwords.get() {
    words.push(word)
  }
  return words
    .slice(2,)
    .sorted(key: k => k.at(0))
    .to-dict().values()
    .join(", ", last: ", and ")
}
= Mechanism to list marked words in a text
The marked words in order of occurrence in the text: #x[def], #x[_def_], #x[ghi], #x[ghi], #x[_jkl_], #x[मुदिता], #x[abc], and #x[Abc]. (In use they would be scattered throughout a larger text.) These routines each produce a sorted list of the words. Each variant has different characteristics.
Sorted lists of marked words:
- Array method: #showwords1().
- Dedup array method: #showwords2().
- Dictionary method: #showwords3().
Differences:
- The array method preserves the order in which spelling duplicates (#underline[Abc] and #underline[abc], here) are found.
- The dedup method shows the first occurrence of each unique spelling.
- The dictionary method shows the last occurrence of each unique spelling.
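
The first-versus-last difference between the dedup and dictionary methods can be reproduced without the state, as far as I can tell. Below is a minimal sketch on made-up sample data; the name pairs and its contents are hypothetical and only mimic the (key, content) tuples that x pushes into the state.

```typ
// Hypothetical sample data: (lowercased key, formatted content) pairs,
// in the same shape that x pushes into the state.
#let pairs = (("abc", [abc]), ("def", [_def_]), ("abc", [*Abc*]))

// dedup keeps the first item seen for each key, so key "abc" yields [abc].
#pairs.dedup(key: ((k, v)) => k).map(((k, v)) => v).join(", ")

// to-dict keeps the last value stored under each key, so key "abc" yields [*Abc*].
#pairs.to-dict().values().join(", ")
```

If this reading is right, the choice between the two variants comes down to whether the first or the last formatting of a repeated term should appear in the list.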