Derivative of a regex

October 17, 2025

Regular expressions show up a lot in computer science. In automata theory, they are a formal notation for describing regular languages, which are sets of strings recognized by finite automata. They give an algebraic way to specify which strings a finite-state machine should accept. In practice, they are also a convenient tool for programmers to match patterns in text.

Formally, you may define a regex by fixing a finite alphabet $\Sigma$. A regular expression over $\Sigma$ is given by the grammar

$$ r ::= \emptyset \mid \varepsilon \mid a \mid r_1 + r_2 \mid r_1 r_2 \mid r^\ast \qquad \text{where } a \in \Sigma. $$

The language denoted by a regular expression $r$ is written $L(r)$ and defined recursively:

$$ L(\emptyset) = \emptyset, \qquad L(\varepsilon) = \lbrace \varepsilon \rbrace, \qquad L(a) = \lbrace a \rbrace, $$

$$ L(r_1 + r_2) = L(r_1) \cup L(r_2), \qquad L(r_1 r_2) = L(r_1)L(r_2), \qquad L(r^\ast) = \bigcup_{n \ge 0} L(r)^n, $$

where for languages $A, B \subseteq \Sigma^\ast$, their concatenation is $AB = \lbrace xy : x \in A,\ y \in B \rbrace$. This $L(r)$ function defines the set of strings that are accepted by a regex $r$.
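To make the semantics concrete, here's a small Python sketch. The tuple encoding and the `lang` helper are my own illustrative choices, not standard names: regexes become nested tuples, and `lang` brute-forces $L(r)$ up to a length bound by reading off the definition above.

```python
# Regexes as nested tuples -- an illustrative encoding:
#   ('empty',)       denotes the empty language ∅
#   ('eps',)         denotes ε
#   ('chr', c)       denotes a single symbol c
#   ('alt', r1, r2)  denotes r1 + r2
#   ('cat', r1, r2)  denotes r1 r2
#   ('star', r1)     denotes r1*

def lang(r, n):
    """All strings of length <= n in L(r): a brute-force reading of the
    recursive definition (fine for tiny n)."""
    tag = r[0]
    if tag == 'empty':
        return set()
    if tag == 'eps':
        return {''}
    if tag == 'chr':
        return {r[1]} if n >= 1 else set()
    if tag == 'alt':
        return lang(r[1], n) | lang(r[2], n)
    if tag == 'cat':
        return {x + y for x in lang(r[1], n)
                      for y in lang(r[2], n) if len(x + y) <= n}
    if tag == 'star':
        out = {''}  # L(r1)^0, then keep concatenating until a fixpoint
        while True:
            step = out | {x + y for x in lang(r[1], n)
                                for y in out if len(x + y) <= n}
            if step == out:
                return out
            out = step

# (a+b)* a -- all strings over {a, b} that end in a
r = ('cat', ('star', ('alt', ('chr', 'a'), ('chr', 'b'))), ('chr', 'a'))
```

For instance, `lang(r, 3)` yields exactly the strings of length at most 3 ending in `a`: `{'a', 'aa', 'ba', 'aaa', 'aba', 'baa', 'bba'}`.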

You can systematically decide whether a string belongs to $L(r)$ by converting $r$ into a finite automaton and running the string through it.

The standard method for this is to first convert the regex into a non-deterministic finite automaton (NFA) using Thompson's construction algorithm. For each form of regex we build a small NFA fragment with a single start state and a single accept state.

Here are the base cases:

Thompson's construction base cases

For compound expressions (Kleene closure, concatenation, alternation), you wire the sub-automata together with ε-transitions, which are "free" moves that don't consume any input:

Thompson's construction compound cases
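The wiring above can be sketched in a few dozen lines of Python. This is a hedged sketch, not a canonical implementation: regexes use an illustrative tuple encoding (`('empty',)`, `('eps',)`, `('chr', c)`, `('alt', r1, r2)`, `('cat', r1, r2)`, `('star', r1)`), states are fresh integers, and `None` stands in for the ε-symbol.

```python
def thompson(regex):
    """Build one NFA fragment per subexpression, each with a single start
    and accept state, glued together with ε-edges (symbol None).
    Returns (start, accept, trans), trans: (state, sym) -> set of states."""
    trans = {}
    counter = [0]

    def fresh():
        counter[0] += 1
        return counter[0]

    def edge(s, sym, t):
        trans.setdefault((s, sym), set()).add(t)

    def build(r):
        s, f = fresh(), fresh()
        tag = r[0]
        if tag == 'eps':
            edge(s, None, f)
        elif tag == 'chr':
            edge(s, r[1], f)
        elif tag == 'alt':                    # branch into both fragments
            s1, f1 = build(r[1]); s2, f2 = build(r[2])
            edge(s, None, s1); edge(s, None, s2)
            edge(f1, None, f); edge(f2, None, f)
        elif tag == 'cat':                    # chain the fragments
            s1, f1 = build(r[1]); s2, f2 = build(r[2])
            edge(s, None, s1); edge(f1, None, s2); edge(f2, None, f)
        elif tag == 'star':                   # loop back, plus a skip edge
            s1, f1 = build(r[1])
            edge(s, None, s1); edge(f1, None, s1)
            edge(s, None, f); edge(f1, None, f)
        # ('empty',): no edges at all, so the accept state is unreachable
        return s, f

    s, f = build(regex)
    return s, f, trans

def eps_closure(states, trans):
    """All states reachable from `states` via ε-edges alone."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for t in trans.get((q, None), ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def nfa_accepts(start, accept, trans, w):
    """Simulate the NFA by tracking the whole set of possible states."""
    cur = eps_closure({start}, trans)
    for c in w:
        moved = set()
        for q in cur:
            moved |= trans.get((q, c), set())
        cur = eps_closure(moved, trans)
    return accept in cur
```

`nfa_accepts` already hints at the pain point discussed next: even to run the NFA once, you have to carry a whole set of states and re-close it under ε-edges after every symbol.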

But NFAs are awkward to execute directly: at each step you have to track every state the machine might currently be in, and ε-transitions make that bookkeeping even worse. The standard fix is to turn the NFA into a deterministic finite automaton (DFA) by treating each set of NFA states as a single DFA state.

Start with the ε-closure of the NFA's start state (that becomes the DFA's start state). For each DFA state $S$ and symbol $a \in \Sigma$, compute the next state by

$$ \delta_{\text{DFA}}(S, a) = \varepsilon\text{-closure}\left(\bigcup_{q \in S} \delta_{\text{NFA}}(q, a)\right) $$

and mark $S$ as accepting if it contains any accepting NFA state. Repeat until no new states appear.
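The loop above is the subset construction, sketched here in Python. The NFA input format is an assumption of mine: a dict mapping `(state, symbol)` to a set of states, with `None` as the ε-symbol, and the example NFA for $(a+b)^\ast a$ is hand-written.

```python
def eps_closure(states, delta):
    """All NFA states reachable via ε-edges (symbol None)."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for t in delta.get((q, None), ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return frozenset(seen)

def subset_construction(start, accepting, delta, alphabet):
    """Determinize: each DFA state is a frozenset of NFA states."""
    d_start = eps_closure({start}, delta)
    d_trans, d_accepting = {}, set()
    seen, todo = {d_start}, [d_start]
    while todo:
        S = todo.pop()
        if S & accepting:          # contains an accepting NFA state
            d_accepting.add(S)
        for a in alphabet:
            T = eps_closure({t for q in S
                               for t in delta.get((q, a), ())}, delta)
            d_trans[(S, a)] = T
            if T not in seen:      # repeat until no new states appear
                seen.add(T)
                todo.append(T)
    return d_start, d_trans, d_accepting

# Hand-written NFA for (a+b)* a: loop in state 0, guess the final a.
delta = {(0, 'a'): {0, 1}, (0, 'b'): {0}}
start, trans, accepting = subset_construction(0, {1}, delta, 'ab')
```

On this two-state NFA the determinized machine also has just two states, `{0}` and `{0, 1}`, but in the worst case the frontier of frozensets can grow exponentially.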

And although this works, an NFA with $n$ states can blow up into a DFA with up to $2^n$ states.

For a long time, this was the standard way to convert a regex into a finite automaton. Then, in 1964, Janusz Brzozowski introduced the notion of regex derivatives1, which let you construct a DFA directly from a regex.

We first say that a regex $r$ is nullable if the empty string $\varepsilon \in L(r)$. We can check this recursively:

$$ \nu(\emptyset) = \emptyset, \qquad \nu(\varepsilon) = \varepsilon, \qquad \nu(a) = \emptyset, \qquad \nu(r^\ast) = \varepsilon, $$

$$ \nu(r_1 + r_2) = \nu(r_1) + \nu(r_2), \qquad \nu(r_1 r_2) = \nu(r_1)\,\nu(r_2) $$

Here $\nu(r)$ returns $\varepsilon$ if $r$ is nullable and $\emptyset$ otherwise. Notice how we can use $\varepsilon$ and $\emptyset$ as stand-ins for true and false: $+$ acts as "or" and concatenation as "and".
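In code, ν is one line per case. This sketch returns a boolean rather than ε/∅, and assumes the same illustrative tuple encoding of regexes (`('empty',)`, `('eps',)`, `('chr', c)`, `('alt', r1, r2)`, `('cat', r1, r2)`, `('star', r1)`):

```python
def nullable(r):
    """True iff ε ∈ L(r) -- a boolean reading of ν."""
    tag = r[0]
    if tag in ('empty', 'chr'):
        return False
    if tag in ('eps', 'star'):
        return True
    if tag == 'alt':        # ε/∅ addition behaves like "or"
        return nullable(r[1]) or nullable(r[2])
    if tag == 'cat':        # ε/∅ concatenation behaves like "and"
        return nullable(r[1]) and nullable(r[2])
```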

Now you can say that given a regex $r$ and a symbol $a \in \Sigma$, the derivative $\partial_a r$ is a new regex whose language is everything $r$ would match after consuming $a$:

$$ L(\partial_a r) = \lbrace w : aw \in L(r) \rbrace $$

We can compute this by structural recursion:

$$ \partial_a \emptyset = \emptyset, \qquad \partial_a \varepsilon = \emptyset, \qquad \partial_a b = \begin{cases} \varepsilon & \text{if } b = a \\ \emptyset & \text{otherwise} \end{cases} $$

$$ \partial_a (r_1 + r_2) = \partial_a r_1 + \partial_a r_2, \qquad \partial_a (r_1 r_2) = (\partial_a r_1)\, r_2 + \nu(r_1)\, \partial_a r_2, \qquad \partial_a (r^\ast) = (\partial_a r)\, r^\ast $$

The concatenation rule is especially interesting because if $r_1$ is nullable, then $a$ might be consumed by $r_2$ instead of $r_1$, so we need both branches, which is why $\nu$ shows up.2

These derivatives can be extended to strings: $\partial_\varepsilon r = r$ and $\partial_{wa} r = \partial_a(\partial_w r)$. Then a string $w$ is in $L(r)$ if and only if $\partial_w r$ is nullable.
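The rules translate directly into a recursive function, and iterating it over a string then checking nullability at the end gives a complete matcher. A sketch, again over the assumed tuple encoding (and repeating `nullable` so the block is self-contained):

```python
def nullable(r):
    """True iff ε ∈ L(r)."""
    tag = r[0]
    if tag in ('empty', 'chr'):
        return False
    if tag in ('eps', 'star'):
        return True
    return (nullable(r[1]) or nullable(r[2]) if tag == 'alt'
            else nullable(r[1]) and nullable(r[2]))   # 'cat'

def deriv(a, r):
    """The Brzozowski derivative ∂_a r, by structural recursion."""
    tag = r[0]
    if tag in ('empty', 'eps'):
        return ('empty',)
    if tag == 'chr':
        return ('eps',) if r[1] == a else ('empty',)
    if tag == 'alt':
        return ('alt', deriv(a, r[1]), deriv(a, r[2]))
    if tag == 'cat':
        left = ('cat', deriv(a, r[1]), r[2])
        if nullable(r[1]):       # a might be consumed by r2 instead
            return ('alt', left, deriv(a, r[2]))
        return left
    if tag == 'star':
        return ('cat', deriv(a, r[1]), r)

def matches(r, w):
    """w ∈ L(r) iff the derivative by w is nullable."""
    for c in w:
        r = deriv(c, r)
    return nullable(r)
```

Note there's no automaton in sight yet: the regex itself is the matcher state, rewritten one symbol at a time.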

Brzozowski derivatives directly give us a regex's DFA. The states are regexes and the transitions are derivatives: start with $r$ as the initial state, and for each state $s$ and symbol $a$, the transition on $a$ goes to $\partial_a s$ (after simplification). A state is accepting if it's nullable.

Brzozowski proved that any regex has finitely many distinct derivatives up to the algebraic equivalences of $+$ being associative, commutative, and idempotent, so this process always terminates.
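That construction fits in a short sketch. The "smart" constructors `alt` and `cat` below apply just enough simplification (identities for ∅ and ε, plus idempotence of $+$) to keep the derivative set small in practice; as before, the tuple encoding and helper names are my own illustrative choices.

```python
def nullable(r):
    tag = r[0]
    if tag in ('empty', 'chr'):
        return False
    if tag in ('eps', 'star'):
        return True
    return (nullable(r[1]) or nullable(r[2]) if tag == 'alt'
            else nullable(r[1]) and nullable(r[2]))   # 'cat'

def alt(r1, r2):
    """Simplifying sum: ∅ + r = r, r + r = r."""
    if r1 == ('empty',):
        return r2
    if r2 == ('empty',) or r1 == r2:
        return r1
    return ('alt', r1, r2)

def cat(r1, r2):
    """Simplifying concatenation: ∅r = r∅ = ∅, εr = rε = r."""
    if r1 == ('empty',) or r2 == ('empty',):
        return ('empty',)
    if r1 == ('eps',):
        return r2
    if r2 == ('eps',):
        return r1
    return ('cat', r1, r2)

def deriv(a, r):
    """∂_a r, simplifying as we go."""
    tag = r[0]
    if tag in ('empty', 'eps'):
        return ('empty',)
    if tag == 'chr':
        return ('eps',) if r[1] == a else ('empty',)
    if tag == 'alt':
        return alt(deriv(a, r[1]), deriv(a, r[2]))
    if tag == 'cat':
        d = cat(deriv(a, r[1]), r[2])
        return alt(d, deriv(a, r[2])) if nullable(r[1]) else d
    if tag == 'star':
        return cat(deriv(a, r[1]), r)

def build_dfa(r, alphabet):
    """States are (simplified) regexes; transitions are derivatives."""
    states, trans, todo = {r}, {}, [r]
    while todo:
        s = todo.pop()
        for a in alphabet:
            t = deriv(a, s)
            trans[(s, a)] = t
            if t not in states:
                states.add(t)
                todo.append(t)
    return states, trans, {s for s in states if nullable(s)}
```

Running `build_dfa` on the encoding of $(a+b)^\ast a$ produces exactly the two-state machine derived by hand in the example below.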

As an example, take $r = (a+b)^\ast a$ over $\Sigma = \lbrace a,b \rbrace$ which is the language of all strings that end in $a$.

We start with $r_0 = (a+b)^\ast a$ and compute its derivatives. For $\partial_a r_0$, we use the concatenation rule:

$$ \partial_a((a+b)^\ast a) = \partial_a((a+b)^\ast) \cdot a + \nu((a+b)^\ast) \cdot \partial_a(a) $$

Since $(a+b)^\ast$ is nullable, both branches survive:

$$ \partial_a((a+b)^\ast) = (\partial_a(a+b))(a+b)^\ast = (a+b)^\ast, \qquad \partial_a(a) = \varepsilon $$

$$ \partial_a r_0 = (a+b)^\ast a + \varepsilon $$

Call this $r_1$. For $\partial_b r_0$, the same concatenation rule gives us $\partial_b(a) = \emptyset$, so the second branch dies and we're left with $(a+b)^\ast a = r_0$. Reading $b$ loops back to the same state.

Now we compute the derivatives of $r_1 = (a+b)^\ast a + \varepsilon$:

$$ \partial_a r_1 = \partial_a((a+b)^\ast a) + \partial_a(\varepsilon) = (a+b)^\ast a + \varepsilon + \emptyset = r_1 $$

$$ \partial_b r_1 = \partial_b((a+b)^\ast a) + \partial_b(\varepsilon) = (a+b)^\ast a + \emptyset = r_0 $$

Since no new states appear, we're done. The resulting DFA has two states: $r_0 = (a+b)^\ast a$ (not nullable, so rejecting) and $r_1 = (a+b)^\ast a + \varepsilon$ (nullable, so accepting).

DFA for (a+b)*a

As it turns out, this is the minimal DFA for the regex $(a+b)^\ast a$. That isn't guaranteed in general: Brzozowski's result ensures only that the construction terminates with finitely many states, and the resulting DFA may still need a separate minimization pass. Here the derivatives happen to collapse directly to the minimal machine.


1

Brzozowski's original 1964 paper is short and worth reading: "Derivatives of regular expressions."

2

Brzozowski derivatives have seen renewed interest in recent years, particularly in functional programming, where the recursive structure maps cleanly onto pattern matching in languages like OCaml. For a more modern take, I'd recommend reading Owens, Reppy, and Turon, "Regular-expression derivatives re-examined" (2009).