Using a hash map would be the best, but it seems like there is none available right. (Would be asympt O(n) construction)
The algorithmically the next thing maybe should be a tree or something based on binary search. In a previous message on the forum I wrote this binary search function, and we can use it to create a grouping function. (Should be O(n log n) construction)
Note that Typst itself has an open issue for adding an array grouping method. Here’s what I wrote…
#let a = ((1, 1), (1, 2), (1, 1))
/// Get index of partition point
/// - func (function): Return true if less or equal, false if greater than partition point
// based on rust libcore partition point
#let partition-point(range, func) = {
let size = range.len()
if size == 0 {
panic("Empty range in partition-point")
}
let base = 0
while size > 1 {
let half = calc.div-euclid(size, 2)
let mid = base + half
let lte = func(range.at(mid))
base = if lte { mid } else { base }
size -= half
}
let lte = func(range.at(base))
base + int(lte)
}
// Array of data and a function that maps each element to its representative for comparison, the rep must be <= comparable.
#let group(data, pred) = {
// make a grouping, an array of (rep: _, group: (_, _, ...))
let res = ()
for elt in data {
let rep = pred(elt)
let pp = if res.len() > 0 { partition-point(res, elt => elt.rep < rep) }
if pp == none or pp >= res.len() {
res.push((rep: rep, group: (elt,)))
} else if res.at(pp).rep == rep {
res.at(pp).group.push(elt)
} else {
res.insert(pp, (rep: rep, group: (elt,)))
}
}
res
}
#group(a, elt => elt)
//freq
#group(a, elt => elt).map(e => (e.rep, e.group.len()))
#group((1,2,1,3,4,3,7), elt => elt)
#group((), panic)
It seems to work (hehe). There’s also a package funarray that implements a grouping function.
Algorithmic overview. Rep. = element representative
- Hash map/dict, O(n). We could achieve a true hash map this with rep to string and rep equality (string for dict slot but no need to assume that implies rep equality)
- Binary search array, needs rep ordering and equality., O(n log n)
- linear search with position, just needs rep equality, O(n²)