Notes

April 11, 2026

Past writing, in reverse chronological order:

Mathematical Patterns in Comedy: An Analysis of Dante's Divine Comedy and Homology Groups

It is a well-known fact that comedy has structure. What is less appreciated is that the structure of comedy is, in a precise sense, homological.

Consider Dante's Divine Comedy. The poem is partitioned into three cantiche (Inferno, Purgatorio, Paradiso), each containing 33 cantos, plus one introductory canto, for a total of 100. Each canto is written in terza rima, a rhyme scheme of interlocking tercets: ABA BCB CDC, and so on. The chain of rhymes is exact in the algebraic sense: each tercet's boundary maps onto the next, and the composition of any two consecutive boundary maps is zero. Dante, without knowing it, wrote a chain complex.

Let us make this precise.

Definition. Let $C_n$ denote the free abelian group generated by the tercets in Canto $n$. Define the boundary operator $\partial_n: C_n \to C_{n-1}$ by the rhyme map, which sends each tercet to the tercet it interlocks with in the preceding canto. Then the sequence

$$\cdots \xrightarrow{\partial_{n+1}} C_n \xrightarrow{\partial_n} C_{n-1} \xrightarrow{\partial_{n-1}} \cdots$$

is a chain complex, since the composition $\partial_{n-1} \circ \partial_n = 0$. (A rhyme that passes through two boundaries resolves to silence, which is the zero element of comedy.)
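
Recall, for readers arriving from the literature department, that the homology groups of a chain complex measure cycles modulo boundaries:

$$ H_n = \ker \partial_n / \operatorname{im}\, \partial_{n+1} $$

An element of $H_n$ is a cycle that is not a boundary: a loop in the text that no higher structure explains away.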

Theorem 1. The zeroth homology group $H_0$ of the Divine Comedy is isomorphic to $\mathbb{Z}$, reflecting the fact that the entire poem is path-connected. No matter where you begin — the dark wood, the river Styx, the face of God — there exists a path of tercets connecting you to any other point in the text.

Theorem 2. The first homology group $H_1$ is isomorphic to $\mathbb{Z}^3$, generated by the three independent cycles corresponding to the three cantiche. These cycles are non-trivial: Dante exits each cantica (Inferno ends with "the stars," Purgatorio ends with "the stars," Paradiso ends with "the stars") but the re-entry point is not the exit point, so the cycles do not bound.

This is, I want to emphasize, not a coincidence. The word "comedy" derives from the Greek komos (revel, procession) and oide (song). A procession that returns to its starting point is a cycle. A song with repeating structure is periodic. Comedy, etymologically, is the study of non-trivial cycles in the first homology group.

Corollary. Tragedy, by contrast, is acyclic. Hamlet does not return. King Lear does not return. The chain complex of a tragedy is exact, meaning all homology groups vanish. Tragedy has no topology. This is why it isn't funny.

Remark. One might object that the Divine Comedy is not, by modern standards, funny. This is correct. Dante's idea of comedy was a narrative that begins badly and ends well. The homological interpretation is consistent: a chain complex with non-trivial homology is one where something survives. In tragedy, everything maps to zero. In comedy, some cycles persist. The fact that persistence is not the same as humor is a subtlety that the Italian literary tradition has, in my opinion, handled poorly.

Open question. It remains unknown whether the second homology group $H_2$ of the Divine Comedy is trivial. A non-trivial $H_2$ would imply the existence of a "trapped volume" in the text — a region of meaning enclosed on all sides from which no information escapes. Several Dante scholars have proposed that this region is Canto XXXIII of Paradiso, in which Dante attempts to describe the face of God and fails. The failure of description is, topologically, a void. Whether this void generates $H_2$ is left as an exercise for the reader.


If you have thoughts on the higher homology groups of Italian epic poetry, or if you know a good proof that Boccaccio's Decameron is a simplicial complex, please reach out.


Metastaseis and ruled surfaces

Iannis Xenakis was a Greek-French composer who was also an architect and a mathematician. Before he wrote music full-time, he worked in Le Corbusier's studio designing buildings. In 1954 he wrote an orchestral piece called Metastaseis for 61 musicians, no two of whom play the same part. The piece is about 8 minutes long, and it sounds like nothing composed before it.

The basic idea is that a glissando --- a continuous slide from one pitch to another --- can be represented as a straight line on a graph where the x-axis is time and the y-axis is pitch. One cello sliding from C3 to G5 over four seconds is just a line segment in this plane.

A single glissando as a line on a pitch-time graph
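
If you want to play with this representation, here is a minimal sketch in Python (the MIDI note numbers and the slopes of the fan are my own choices for illustration, not Xenakis's):

```python
import numpy as np

def glissando(p_start: float, p_end: float, duration: float, n: int = 100):
    """A glissando as a line segment in the time-pitch plane."""
    t = np.linspace(0.0, duration, n)
    return t, p_start + (p_end - p_start) * t / duration

# one cello sliding from C3 (MIDI 48) to G5 (MIDI 79) over four seconds
t, pitch = glissando(48, 79, 4.0)

# 46 such segments sharing a start pitch make a fan like the opening of Metastaseis
fan = [glissando(55, 55 + s, 4.0) for s in range(-23, 23)]
```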

Xenakis realized that if you have 46 string players each performing their own glissando simultaneously, at different rates and in different directions, the collection of lines traces out a ruled surface.

A ruled surface is a surface where through every point, there passes at least one straight line that lies entirely on the surface. The cylinder is one --- you can construct it from a family of vertical lines. The cone is another. The one Xenakis cared about was the hyperbolic paraboloid, which is the saddle-shaped surface defined by

$$ z = \frac{x^2}{a^2} - \frac{y^2}{b^2} $$

A hyperbolic paraboloid is doubly ruled --- there are actually two distinct families of straight lines on it. You can see this if you parametrize it: the surface $z = xy$ contains every line of the form $(t, c, ct)$ and every line of the form $(c, t, ct)$ for constant $c$.

Two families of lines on a hyperbolic paraboloid
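
You can check the double ruling numerically. A sketch (the particular point and parameter range are arbitrary):

```python
import numpy as np

# pick any point (x0, y0, x0*y0) on the surface z = xy
x0, y0 = 1.3, -0.7
t = np.linspace(-2.0, 2.0, 101)

family1 = (x0 + t, np.full_like(t, y0), (x0 + t) * y0)  # the (t, c, ct) lines
family2 = (np.full_like(t, x0), y0 + t, x0 * (y0 + t))  # the (c, t, ct) lines

for x, y, z in (family1, family2):
    assert np.allclose(z, x * y)  # both lines lie entirely on the surface
```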

In Metastaseis, each straight line is one player's glissando. The opening section has all 46 strings start on the same pitch and fan outward --- some sliding up, some sliding down, at different speeds.

Glissandi fanning out from unison

On the score, this literally looks like the lines generating a cone or paraboloid. Bars 309--314 of the piece draw out a parabola where the glissandi function as tangent lines to the curve.

Xenakis sketched the piece as an architectural blueprint with pitch on one axis, time on the other, and straight lines everywhere. He then translated the geometric sketch into actual notation for 61 players.

A few years later, Xenakis designed the Philips Pavilion for the 1958 Brussels World's Fair while still working in Le Corbusier's studio. The building's shell is a set of hyperbolic paraboloids --- the same ruled surfaces from Metastaseis. The walls of the pavilion are the score of the composition, or the score is the walls, depending on how you want to think about it.

Interestingly, the score's middle section uses a twelve-tone row with durations based on the Fibonacci sequence. Xenakis used this to generate rhythmic proportions that feel organic and have structure without periodicity.

After Metastaseis, Xenakis composed Pithoprakta (1955--56) using actual probability distributions --- Poisson and Gaussian --- to determine when and how 50 instruments would play, an approach he called "stochastic music".


Borges already knew about entropy

In 1941, Jorge Luis Borges (the Argentine writer) wrote a short story called The Library of Babel. In this story, "The universe (which others call the Library) is composed of an indefinite, perhaps infinite number of hexagonal galleries." He introduces a library that contains every possible book. Each book has 410 pages, 40 lines per page, 80 characters per line, sampled from an alphabet of 25 symbols.

The librarians celebrate at first. The library must contain the proof of every conjecture, the biography of every person who will ever live, the cure for every disease. But they quickly realize that it also contains every false proof, every wrong biography, every cure that's actually a poison. And since there's no index, you can't actually tell a true book from a false one without already knowing the truth.

The number of books in this library is $25^{1{,}312{,}000}$. So it's not truly infinite (Borges calls it "unlimited and periodic"). It's just $|\Sigma|^n$: all strings of fixed length over a finite alphabet. A combinatorialist would call this unremarkable.

Claude Shannon would call it something worse: uninformative.

Information entropy measures average surprise per symbol. If you draw uniformly over all possible books, entropy is maximized:

$$ H = \log_2(25^{1{,}312{,}000}) = 1{,}312{,}000 \cdot \log_2 25 \approx 6{,}092{,}739 \text{ bits per book} $$

So roughly 6,092,739 bits per book is the theoretical maximum. But since the draw is uniform, every book equally likely, the library as a whole tells you nothing: you already knew, before scanning a single shelf, that every possible book is in there.
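
The arithmetic, as a quick sanity check:

```python
import math

chars_per_book = 410 * 40 * 80           # 1,312,000 characters
H = chars_per_book * math.log2(25)       # entropy of a uniformly drawn book
print(f"{H:,.0f} bits per book")         # 6,092,739 bits per book
```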

And this is also why the librarians can't search for interesting books. A meaningful book --- a correct proof of the Riemann hypothesis, say --- has low Kolmogorov complexity relative to its length: structure, redundancy, the kind of thing you can compress. But it sits on a shelf next to $25^{1{,}312{,}000} - 1$ other books, almost all of which are incompressible gibberish. The ratio of meaningful to meaningless is, however you want to define "meaningful," essentially zero.

The cost of finding a meaningful book is the same as writing it yourself: an address that singles out one book among $25^{1{,}312{,}000}$ takes as many symbols to write down as the book itself contains. The Library of Babel gives you nothing!

Borges intuited in 1941 what Shannon and Kolmogorov would formalize decades later: the complexity of a string is the length of the shortest program that produces it. A random string has complexity roughly equal to its own length --- you can't compress it, there's no structure, it doesn't mean anything in a very precise sense. The meaningful strings are the ones whose complexity is much less than their length.
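
Kolmogorov complexity is uncomputable, but any compressor gives an upper bound, so you can watch the distinction empirically. A sketch using zlib as a crude stand-in:

```python
import random
import zlib

random.seed(0)
n = 9_999
structured = b"abc" * (n // 3)                          # redundant, compressible
noise = bytes(random.randrange(256) for _ in range(n))  # incompressible gibberish

for name, s in [("structured", structured), ("noise", noise)]:
    ratio = len(zlib.compress(s, 9)) / len(s)
    print(f"{name}: compressed to {ratio:.0%} of its length")
# the structured string collapses to well under 1% of its length;
# the random one barely shrinks at all
```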


Derivative of a regex

Regular expressions show up a lot in computer science. In automata theory, they are a formal notation for describing regular languages, which are sets of strings recognized by finite automata. They give an algebraic way to specify which strings a finite-state machine should accept. In practice, they are also a convenient tool for programmers to match patterns in text.

Formally, you may define a regex by fixing a finite alphabet $\Sigma$. A regular expression over $\Sigma$ is given by the grammar

$$ r ::= \emptyset \mid \varepsilon \mid a \mid r_1 + r_2 \mid r_1 r_2 \mid r^\ast \qquad \text{where } a \in \Sigma. $$

The language denoted by a regular expression $r$ is written $L(r)$ and defined by:

  • $L(\emptyset) = \emptyset$
  • $L(\varepsilon) = \lbrace \varepsilon \rbrace$
  • $L(a) = \lbrace a \rbrace$ for each $a \in \Sigma$
  • $L(r_1 + r_2) = L(r_1) \cup L(r_2)$
  • $L(r_1r_2) = L(r_1)L(r_2)$
  • $L(r^\ast) = (L(r))^\ast$

where for languages $A, B \subseteq \Sigma^\ast$, their concatenation is $AB = \lbrace xy : x \in A,\ y \in B \rbrace$. This function $L(r)$ defines the set of strings matched by the regex $r$.
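
To make the rest of this concrete, here is one way to encode the grammar in Python (a sketch; the constructor names are my own, and Python 3.10+ is assumed for the union syntax and, later, pattern matching). Frozen dataclasses give structural equality and hashing for free, which will matter when we compare regexes:

```python
from __future__ import annotations
from dataclasses import dataclass

@dataclass(frozen=True)
class Empty:            # ∅: matches nothing
    pass

@dataclass(frozen=True)
class Eps:              # ε: matches only the empty string
    pass

@dataclass(frozen=True)
class Chr:              # a: matches the single symbol a
    sym: str

@dataclass(frozen=True)
class Alt:              # r1 + r2
    left: Regex
    right: Regex

@dataclass(frozen=True)
class Cat:              # r1 r2
    left: Regex
    right: Regex

@dataclass(frozen=True)
class Star:             # r*
    inner: Regex

Regex = Empty | Eps | Chr | Alt | Cat | Star
```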

You can decide whether a string belongs to $L(r)$ by converting $r$ into a finite automaton and running the string through it.

The standard method for this is to first convert the regex into a non-deterministic finite automaton (NFA) by using Thompson's construction algorithm. For each form of regex we build a small NFA fragment with a single start state and a single accept state.

Here are the base cases:

Thompson's construction base cases

For compound expressions (e.g. Kleene closures, concatenations), you wire sub-automata together with ε-transitions, which are "free" moves that don't consume any input:

Thompson's construction compound cases

But NFAs are hard to work with because at each step you have to track all the states the machine might currently be in, which, as you can imagine, becomes even more difficult with ε-transitions. The standard fix is to turn the NFA into a deterministic finite automaton (DFA) by treating each set of NFA states as a single DFA state.

Start with the ε-closure of the NFA's start state (that becomes the DFA's start state). For each DFA state $S$ and symbol $a \in \Sigma$, compute the next state by

$$ \delta_{\text{DFA}}(S, a) = \varepsilon\text{-closure}\left(\bigcup_{q \in S} \delta_{\text{NFA}}(q, a)\right) $$

and mark $S$ as accepting if it contains any accepting NFA state. Repeat until no new states appear.
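
As a sketch in Python (the representations and names are mine: `delta` maps a (state, symbol) pair to a set of NFA states, and `eps` maps a state to its ε-successors):

```python
def eps_closure(states, eps):
    """All states reachable from `states` by ε-transitions alone."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in eps.get(q, ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return frozenset(seen)

def subset_construction(start, accepts, delta, eps, alphabet):
    """Determinize an NFA; each DFA state is a frozenset of NFA states."""
    d0 = eps_closure({start}, eps)
    dfa_delta, seen, todo = {}, {d0}, [d0]
    while todo:
        S = todo.pop()
        for a in alphabet:
            moved = set()
            for q in S:
                moved |= delta.get((q, a), set())
            T = eps_closure(moved, eps)
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    dfa_accepts = {S for S in seen if S & accepts}
    return d0, dfa_accepts, dfa_delta
```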

And although this works, an NFA with $n$ states can produce a DFA with up to $2^n$ states, an exponential blowup.

For a long time, this was the standard way to convert a regex into a finite automaton. Eventually in 1964, Janusz Brzozowski introduced the notion of regex derivatives1, which let you construct a DFA directly from a regex.

We first say that a regex $r$ is nullable if the empty string $\varepsilon \in L(r)$. Again, we can check this recursively:

  • $\nu(\emptyset) = \emptyset$
  • $\nu(\varepsilon) = \varepsilon$
  • $\nu(a) = \emptyset$ for each $a \in \Sigma$
  • $\nu(r_1 + r_2) = \nu(r_1) + \nu(r_2)$
  • $\nu(r_1 r_2) = \nu(r_1) \cdot \nu(r_2)$
  • $\nu(r^\ast) = \varepsilon$

Here $\nu(r)$ returns $\varepsilon$ if $r$ is nullable and $\emptyset$ otherwise. Notice how we can use $\varepsilon$ and $\emptyset$ as stand-ins for true and false.
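
In code it is simpler to return a boolean directly (using the AST sketched earlier):

```python
def nullable(r: Regex) -> bool:
    """True iff ε ∈ L(r), i.e. ν(r) = ε."""
    match r:
        case Empty() | Chr(_):
            return False
        case Eps() | Star(_):
            return True
        case Alt(r1, r2):
            return nullable(r1) or nullable(r2)
        case Cat(r1, r2):
            return nullable(r1) and nullable(r2)
```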

Now, given a regex $r$ and a symbol $a \in \Sigma$, the derivative $\partial_a r$ is a new regex whose language is everything $r$ would match after consuming $a$:

$$ L(\partial_a r) = \lbrace w : aw \in L(r) \rbrace $$

We can compute this by structural recursion:

  • $\partial_a \emptyset = \emptyset$
  • $\partial_a \varepsilon = \emptyset$
  • $\partial_a b = \varepsilon$ if $a = b$, and $\emptyset$ if $a \neq b$
  • $\partial_a (r_1 + r_2) = \partial_a r_1 + \partial_a r_2$
  • $\partial_a (r_1 r_2) = (\partial_a r_1) r_2 + \nu(r_1)(\partial_a r_2)$
  • $\partial_a (r^\ast) = (\partial_a r)\,r^\ast$

The concatenation rule is especially interesting because if $r_1$ is nullable, then $a$ might be consumed by $r_2$ instead of $r_1$, so we need both branches, which is why $\nu$ shows up.2

These derivatives can be extended to strings: $\partial_\varepsilon r = r$ and $\partial_{wa} r = \partial_a(\partial_w r)$. Then a string $w$ is in $L(r)$ if and only if $\partial_w r$ is nullable.
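
Translating this into code, as a sketch building on the earlier pieces. The helpers `alt` and `cat` apply the obvious simplifications as terms are built, which keeps the derivatives from growing without bound; `matches` then implements the string extension directly:

```python
def alt(r1: Regex, r2: Regex) -> Regex:
    """Alt, simplifying ∅ + r = r + ∅ = r and r + r = r."""
    if r1 == Empty(): return r2
    if r2 == Empty(): return r1
    if r1 == r2:      return r1
    return Alt(r1, r2)

def cat(r1: Regex, r2: Regex) -> Regex:
    """Cat, simplifying ∅r = r∅ = ∅ and εr = rε = r."""
    if r1 == Empty() or r2 == Empty():
        return Empty()
    if r1 == Eps(): return r2
    if r2 == Eps(): return r1
    return Cat(r1, r2)

def deriv(r: Regex, a: str) -> Regex:
    """∂_a r: what r still matches after consuming the symbol a."""
    match r:
        case Empty() | Eps():
            return Empty()
        case Chr(b):
            return Eps() if a == b else Empty()
        case Alt(r1, r2):
            return alt(deriv(r1, a), deriv(r2, a))
        case Cat(r1, r2):
            left = cat(deriv(r1, a), r2)
            return alt(left, deriv(r2, a)) if nullable(r1) else left
        case Star(r1):
            return cat(deriv(r1, a), r)

def matches(r: Regex, w: str) -> bool:
    """w ∈ L(r) iff ∂_w r is nullable."""
    for a in w:
        r = deriv(r, a)
    return nullable(r)
```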

Brzozowski derivatives directly give us a regex's DFA. The states are regexes and the transitions are derivatives: start with $r$ as the initial state, and for each state $s$ and symbol $a$, the transition on $a$ goes to $\partial_a s$ (after simplification). A state is accepting if it's nullable.

Brzozowski proved that any regex has only finitely many distinct derivatives up to the algebraic equivalences of $+$ (associativity, commutativity, and idempotence), so this process always terminates.
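
In code, a sketch: the simplifying constructors above plus structural equality stand in for the full algebraic equivalence, which is enough for small examples like the one below.

```python
def build_dfa(r: Regex, alphabet: str):
    """DFA construction: states are regexes, transitions are derivatives."""
    states, todo, delta = {r}, [r], {}
    while todo:
        s = todo.pop()
        for a in alphabet:
            t = deriv(s, a)
            delta[(s, a)] = t
            if t not in states:
                states.add(t)
                todo.append(t)
    accepting = {s for s in states if nullable(s)}
    return r, states, accepting, delta
```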

As an example, take $r = (a+b)^\ast a$ over $\Sigma = \lbrace a,b \rbrace$ which is the language of all strings that end in $a$.

We start with $r_0 = (a+b)^\ast a$ and compute its derivatives. For $\partial_a r_0$, we use the concatenation rule:

$$ \partial_a((a+b)^\ast a) = \partial_a((a+b)^\ast) \cdot a + \nu((a+b)^\ast) \cdot \partial_a(a) $$

Since $(a+b)^\ast$ is nullable, both branches survive:

$$ \partial_a((a+b)^\ast) = (\partial_a(a+b))(a+b)^\ast = (a+b)^\ast, \qquad \partial_a(a) = \varepsilon $$

$$ \partial_a r_0 = (a+b)^\ast a + \varepsilon $$

Call this $r_1$. For $\partial_b r_0$, the same concatenation rule gives us $\partial_b(a) = \emptyset$, so the second branch dies and we're left with $(a+b)^\ast a = r_0$. Reading $b$ loops back to the same state.

Now we compute the derivatives of $r_1 = (a+b)^\ast a + \varepsilon$:

$$ \partial_a r_1 = \partial_a((a+b)^\ast a) + \partial_a(\varepsilon) = (a+b)^\ast a + \varepsilon + \emptyset = r_1 $$

$$ \partial_b r_1 = \partial_b((a+b)^\ast a) + \partial_b(\varepsilon) = (a+b)^\ast a + \emptyset = r_0 $$

Since no new states appear, we're done. The resulting DFA has two states: $r_0 = (a+b)^\ast a$ (not nullable, so rejecting) and $r_1 = (a+b)^\ast a + \varepsilon$ (nullable, so accepting).

DFA for (a+b)*a

As it turns out, this is also the minimal DFA for the regex $(a+b)^\ast a$. The derivative construction doesn't guarantee minimality in general, since identifying states only up to the algebraic equivalences can leave redundant states, but in practice it tends to produce small and often minimal automata.
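
Running the sketch from earlier on this regex reproduces the two-state machine exactly:

```python
A, B = Chr("a"), Chr("b")
r0 = Cat(Star(Alt(A, B)), A)               # (a+b)* a

start, states, accepting, delta = build_dfa(r0, "ab")
assert len(states) == 2                    # r0 and r0 + ε
assert accepting == {Alt(r0, Eps())}       # the nullable state accepts
assert matches(r0, "abba") and not matches(r0, "abab")
```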


1

Brzozowski's original 1964 paper is short and worth reading: "Derivatives of regular expressions."

2

Brzozowski derivatives have seen renewed interest in recent years, particularly in functional programming, where the recursive structure maps cleanly onto pattern matching in languages like OCaml. For a more modern take, I'd recommend reading Owens, Reppy, and Turon, "Regular-expression derivatives re-examined" (2009).


expressing mitosis in λ-calculus

the λ-calculus consists of only 3 basic building blocks: variables, functions, and applications.

$$ e ::= x \mid \lambda x.e \mid e_1e_2 $$

the Y combinator lets you perform recursion in the λ-calculus by having a function apply a copy of itself to itself

the Y combinator is defined as

$$ Y = \lambda f.(\lambda x.f(xx))(\lambda x.f(xx)) $$

it satisfies $Y f = f(Y f)$. it works because $\lambda x.f(xx)$, when applied to itself, copies itself forever, and $f$ gives the copying something to do.
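
here's the idea in python. python evaluates arguments eagerly, so the textbook Y loops forever; the η-expanded variant (usually called the Z combinator) delays the self-application until it's needed. a sketch:

```python
# Z combinator: Y adapted for eager (call-by-value) evaluation.
# the inner "lambda v" delays the self-application x(x) until it's called.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# one step of factorial, taking "the rest of the recursion" as its input
step = lambda rec: lambda n: 1 if n == 0 else n * rec(n - 1)

fact = Z(step)
assert fact(5) == 120
```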

i think the Y combinator kind of resembles biological cell division. you can think of $\lambda$ as the cell membrane, which is just a scope boundary that delimits where the genome is active; $f$ as gene expression (the proteins and RNA that read the genome and run the cell); and $x$ as the genome itself.

in this analogy, $xx$ would be the replication fork, where the genome is both the thing being read by the cell's machinery and the thing being copied for the daughter cell.

so $Y f = f(Y f)$ is a cell running its program and producing daughter cells that run the same program, which produce more daughter cells that run the same program.

the two $\lambda x.f(xx)$ in Y are asymmetric because one gets applied to the other. daughter cells are actually symmetric. the expressions are identical, though, so you can swap them and nothing changes. i don't think this matters much.

$\Omega = (\lambda x.xx)(\lambda x.xx)$ diverges because there's no base case. likewise, cells stop dividing because of telomere shortening, contact inhibition, and apoptosis. and cancer is what happens when those stop working.

p.s. i find it funny how $Y$ looks like a replication fork and $xx$ looks like the double helix...


Giant Steps and the chromatic group

Most Western harmony is built on the circle of fifths. There are 12 notes in the chromatic scale, and moving up by a perfect fifth (7 semitones) cycles through all of them before you get back to where you started. Algebraically this is $\mathbb{Z}/12\mathbb{Z}$ with generator 7 --- since $\gcd(7, 12) = 1$, repeated addition of 7 mod 12 hits every element. The circle of fifths works because 7 generates the whole group.

In 1959, John Coltrane walked into a recording session with the chord changes of Giant Steps and played them perfectly. Nobody had seen anything like them before.

The key centers in Giant Steps by Coltrane are B, G, and E$\flat$, separated by major thirds (4 semitones). Starting from B (call it 0):

$$ 0 \xrightarrow{+4} 4 \xrightarrow{+4} 8 \xrightarrow{+4} 0 \pmod{12} $$

That's the subgroup $\langle 4 \rangle = \lbrace 0, 4, 8 \rbrace \cong \mathbb{Z}/3\mathbb{Z}$ inside $\mathbb{Z}/12\mathbb{Z}$. Instead of a generator that cycles through all 12 notes, he used one that only produces 3 and just rotated between them.
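
You can watch both generators at work in a few lines of Python (a sketch; pitch classes numbered with B = 0 as above):

```python
def generated(step: int, n: int = 12) -> list[int]:
    """Orbit of 0 under repeated addition of `step` in Z/nZ."""
    orbit, x = [], 0
    while True:
        orbit.append(x)
        x = (x + step) % n
        if x == 0:
            return orbit

print(generated(7))   # all 12 pitch classes: the circle of fifths
print(generated(4))   # [0, 4, 8]: the three key centers of Giant Steps
```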

This matters because in circle-of-fifths harmony, adjacent keys share most of their notes. C major and G major differ by one note, so modulations feel smooth. But B major, G major, and E$\flat$ major are about as far apart as you can get --- they share very few notes. Every time the key shifts, you feel as if you're listening to a different song.

The chord progression:

$$ \text{B}\Delta \to \text{D}7 \to \text{G}\Delta \to \text{B}\flat 7 \to \text{E}\flat\Delta \to \text{F}\sharp 7 \to \text{B}\Delta $$

Each dominant 7th chord (D7, B$\flat$7, F$\sharp$7) resolves down a fifth to the next key center. So locally, every two-chord motion is a standard V-I resolution, which is the most basic move in jazz. But globally those resolutions are chained across the $\mathbb{Z}/3\mathbb{Z}$ cycle.

Tommy Flanagan, the pianist on the recording and one of the best in jazz, audibly stumbles through his solo. Coltrane is the only one who sounds comfortable, because he'd been woodshedding the cycle for months before the session. He had to build a completely new set of patterns just for this song.