How to extract the document structure programmatically?

Another option is the recently released typlite; see this post.
