Started a small(?) project - recreating a 1938 book in pdf

The book itself is the 1938, 2 edition of “ספר הבדיחה והחידוד” By Alter Droyanov. This is a collection of 3,170 European Jewish jokes from the 19th century and before.

The first page of the first chapter of the first book is -

and I’m trying to keep the same look.

As I’m not going to re-type the whole book manually, I’ve looked for a shortcut. It was found here - פרויקט בן יהודה. So after downloading the text in several formats (and trying pandoc) I’ve settled on the text format.

The text file is passed through a Python “preprocessor” that -

  • remove superfluous new lines
  • converts “-” used as hyphen with true hebrew hyphen (“סוף-סוף” changed to “סוף־סוף”)
  • corrected quotation marks (“דצ"ך” changed to „דצ”ך”"
  • then created a csv file where the first field is th joke# and the second the joke text.

That csv is fed to the following typst code:

#set text(font: "Drugulin CLM", size: 7pt, lang: "he", dir: rtl)

#import "@preview/marginalia:0.3.1" as marginalia: note
#let note = note.with(side: "left", numbering: none)

#set page(width: 13cm, height: 18.2cm,
  margin: (left: 1in, right: 0.69in, top: 0.75in, bottom: 1in),
  header: counter(footnote).update(0))
//#show: marginalia.show-frame
#show: marginalia.setup.with(clearance: 0pt, 
  outer: (far: 0mm, sep: 0mm, width: 0.69in), inner: (far: 0mm, sep: 0.1in, width: 0.9in))

#let par_split(par) = {
  let first = h(0.7em) + text(size: 1.4em, par.first())
  let par_1 = par.clusters()
  let rest = par.clusters().slice(1, par_1.len()).join()
  return (first, rest)
}
#let asterisk(.., last) = "*" * last
#set footnote(numbering: asterisk)

#let test = csv(delimiter: "\u{001f}","chapters/chapter_b01c01.csv")

#for i in range(0,test.len()) [
  #note(test.at(i).at(0))
  #let par = test.at(i).at(1)
  #let (first, rest) = par_split(par)
  #first#eval(rest, mode: "markup")\
  ]

(code is still being created, so … be gentle).

The result (of the first few jokes) is:

main.pdf (74.3 KB)

Hope this interest someone.

Problems (with the source text till now)

  • not all the Vowelization (ניקוד) that are in the book are in the source text.
  • footnotes are just a simple number.
  • (that’s what I’ve found by now)

This need to be corrected and entered in the source file and then reprocessed.

And that it. I’ve no prior Python or Typst experience (but I’m learning and claude did help with the python code in the beginning and with the regex).

Please feel free to comment and if you have an idea - you’re welcome!

3 Likes

It’s in markup mode so use typ instead of typc. See https://forum.typst.app/t/11#p-29-body-3.

Thx. Done.

1 Like

Another 2 general recommendations would be to see:

Hi
I am interested in seeing this and maybe being in direct contact.
I am helping my son (the author) produce output in both PDF and e-PUB from Hebrew text input in MD format.
Also interested in seeing your input text file.
I don’t mind sharing with you the scripts that I am using.
David

Hi. Sent you a message.
Ben