Recall in our discussion of DFA equivalence, we were concerned with collapsing pairs of indistinguishable states. The canonical example of this were states of the following form:

Assuming that \(\Sigma = \{ 0, 1 \}\), we see that \(q_0\) and \(q_1\) go to the same state on every input If we arrive at either \(q_0\) or \(q_1\), our machine will accept and reject the same set of strings according to whatever transitions follow from \(q_2\) Therefore, it is safe to collapse these two states into one:

Let’s formalize this notion of indistinguishableness By doing this, we will arrive at the *Myhill-Nerode theorem*, a result that describes a fundamental property of regular languages We can use this result to justify our equivalence algorithm as well as show that a language is regular or non-regular

# Distinguishing Extensions and Equivalences

Consider the following DFA slightly modified from our original example.

Imagine two separate executions of this DFA To get to \(q_0\) in the first execution and \(q_1\) in the second execution, we would need to first read in some strings \(x \in \Sigma^*\) and \(y \in \Sigma^*\), respectively

Now, imagine the additional transitions that result from reading in an additional character in both of these executions If in both executions we read a \(0\), then we are in the situation where we converge on the same state Both executions will have the behavior at this point—rejection, since they are now both stuck in \(q_2\) In contrast, if we read a \(1\), then we are in the situation where the first execution (originally in \(q_0\)) will always accept by moving to \(q_3\) whereas the second execution (originally in \(q_1\)) will always reject by moving to \(q_4\)

Here, the additional character \(0\) or \(1\) that is read is called an *extension* of the strings \(x\) and \(y\) \(1\) is called a *distinguishing* extension to the strings \(x\) and \(y\) with respect to the DFA: by reading in a \(1\), the behavior of the executions starting in \(q_0\) and \(q_1\) differ In contrast, \(0\) is a *nondistinguishing* extension to \(x\) and \(y\) because in both executions, we would accept

We can generalize this notion of distinguishability to arbitrary strings \(z \in \Sigma^*\), rather than single characters as follows:

**Definition (Distinguishing Extension)**. *Consider a language \(L \subseteq \Sigma^*\) and strings \(x, y \in \Sigma^*\). A string \(z \in \Sigma^*\) is a *distinguishing extension* of the strings \(x\) and \(y\) if exactly one of \(xz\) and \(yz\) are in \(L\).*

**Definition (Indistinguishable)**. *Consider a language \(L \subseteq \Sigma^*\) and strings \(x, y \in \Sigma^*\). Two strings \(x\) and \(y\) are *indistinguishable*, written \(x \equiv_L y\) if there does not exist a \(z \in \Sigma^*\) that is a distinguishing extension of \(x\) and \(y\) with respect to \(L\).*

This relation, \(\equiv_L\), forms an *equivalence relation* between strings in \(\Sigma^*\). Recall that an equivalence is a particular relation satisfying three properties:

- Reflexivity (\(x \equiv_L x\)).
- Symmetry (\(x \equiv_L y \Rightarrow y \equiv_L x\)).
- Transitivity (\(x \equiv_L y \wedge y \equiv_L z \Rightarrow x \equiv_L z\)).

Clearly the first two properties hold—identical strings have the same behavior with respect to extension and the relation does not favor its left- or right-hand sides. To see that the third property holds, consider some extension \(w \in \Sigma^*\) to \(x\), \(y\), and \(z\). By our premises, we know that \(xw\) and \(yw\) have the same accept/reject behavior and \(yw\) and \(zw\) have the same accept/reject behavior. To conclude that \(xw\) and \(zw\) have the same behavior, we note that they must both have the same behavior as \(yw\).

# The Myhill-Nerode Theorem

With an equivalence relation, we can form *equivalence classes* between the members it relates. Recall that an equivalence class is a set of elements closed under the equivalence relation. That is, we establish that two elements are equivalent, \(x \equiv_L y\), and then use the properties of the equivalence relation to find additional equivalent elements until there are no more to add to the set.

The Myhill-Nerode theorem characterizes the regular languages in terms of the equivalence classes generated by \(\equiv_L\).

**Theorem (Myhill-Nerode)**. *A language \(L\) is regular if and only if \(\equiv_L\) has a finite number of equivalence classes. Furthermore, the number of equivalence classes corresponds to the number of states in the minimal DFA that recognizes \(L\).*

*Proof.* In the “if” direction, we have a regular language \(L\) and from it must deduce that \(\equiv_L\) has a finite number of equivalence classes. Consider a DFA \(D = (\Sigma, Q, \delta, q_0, F)\) that recognizes \(L\). Define the following family of sets of strings for each \(q_i \in Q\):

\[ W_i = \{ w \in \Sigma^* \;|\; q_i \in Q, \delta^*(q_0, w) = q_i \}. \]

That is \(W_i\) is the set of strings that cause \(D\) to transition from \(q_0\) to \(q_i\). Note that every string belongs to exactly one of these \(W_i\), *i.e.*, each \(w \in \Sigma^*\) sends \(D\) to a single \(q_i\) by the definition of the transition function \(\delta\).

Now, consider an arbitrary \(q_i \in Q\), its related \(W_i\), and any two \(x, y \in W_i\). Since \(\delta^*(x, q_0) = \delta^*(y, q_0) = q_i\) (\(x\) and \(y\) arrive at the same state \(q_i\) in \(D\)), then we know there is no distinguishing extension \(z \in \Sigma^*\) for the strings \(x\) and \(y\). Therefore, \(x \equiv_L y\) and we can therefore conclude that \(W_i\) is a subset of some equivalence class of \(\equiv_L\). From this and the fact that every possible string belongs to exactly one of these sets, we know that the size of the family of sets \(W_i\), *i.e.*, \(|Q|\), is an upper bound on the total number of equivalence classes of \(\equiv_L\). Because \(|Q|\) has finite size, we can finally conclude that the number of equivalence classes is finite. (Note that we are unable to conclude that each \(W_i\) is *equal* to some equivalence class of \(\equiv_L\), rather than a subset, because multiple \(W_i\) might map onto the same equivalence class.)

In the “only if” direction, we know that \(\equiv_L\) has a finite number of equivalence classes and from there must deduce that \(L\) is regular. To do this, we construct a DFA \(D = (\Sigma, Q, \delta, q_0, F)\) from \(\equiv_L\) where \(L(D) = L\). Let \(W_i\) denote the \(i\)th equivalence class of \(\equiv_L\).

- \(Q = \{ q_i \;|\; W_i \subseteq \Sigma^* \}\).
- \(\delta(q_i, a) = q_j\) where \(W_i\) corresponds to the equivalence class for the string \(x \in W_i\), and \(W_j\) corresponds to the equivalence class for the string \(xa\). Note that it doesn’t matter which \(x\) is chosen; by the definition of \(\equiv_L\), any extension of \(x, y \in W_i\) has the same behavior, so choosing either \(x\) and \(y\) results in \(W_j\)
- \(q_0 = q_i\) where \(\epsilon \in W_i\),
*i.e.*, we start in the equivalence class that contains the empty string. - \(F = \{ q_i \;|\; w \in W_i \wedge w \in L \}\).

Finally, note that in the “if” direction, we showed the sets of strings derived from a DFA by considering distinguishing extensions is a subset of an equivalence class in \(\equiv_L\). Furthermore, there may be more such sets than there are equivalence classes. The “only if” direction shows that we can directly derive a DFA from the equivalence classes of \(\equiv_L\). From these facts, we can conclude that the DFA derived from the equivalence classes of \(\equiv_L\) is the *minimal* DFA for the language \(L\).

# Applications

The Myhill-Nerode theorem establishes a direct correspondence between the equivalence classes of \(\equiv_L\) and the states of the (minimal) DFA that recognizes \(L\). As an immediate corollary, the table-filling algorithm for minimization that we discussed in the previous reading computes the equivalence classes of \(\equiv_L\) and therefore, is minimal.

In addition to this, we can use the Myhill-Nerode theorem to establish whether a language \(L\) is regular. To do this, we either construct the finite equivalence classes of \(\equiv_L\) (to show regularity) or show that the equivalence classes of \(\equiv_L\) is infinite (to show non-regularity).

**Claim 1**. Consider the language \(L = \{ w \;|\; \text{$w$ contains an even number of $0$s} \}\). \(L\) is regular.

*Proof.* Rather than building a DFA, we can show this directly using the Myhill-Nerode theorem. We enumerate possible strings, using distinguishing extensions to group them into equivalence classes:

The strings \(\epsilon, 1, 00, 010, \ldots\): these strings have the same acceptance/rejection behavior on all possible extensions,

*e.g.*, \(\epsilon\) (accept), \(0\) (reject), \(1\) (accept), …The strings \(0, 01, 10, 110, \ldots\): these strings have the same acceptance/rejection behavior on all possible extensions,

*e.g.*, \(\epsilon\) (reject), \(0\) (accept), \(1\) (reject), …

Formally, we define two sets of strings:

- \(W_1 = \{ w \mid \text{$w$ has an even number of $0$s} \}\).
- \(W_2 = \{ w \mid \text{$w$ has an odd number of $0$s} \}\).

These two sets are complete—they encompass all strings in \(\Sigma^*\). Furthermore, any \(w \in \Sigma^*\) is in exactly one of \(W_1\) and \(W_2\). Finally, each set is an equivalence class of \(\equiv_L\)—any extension of pairs of strings in one of the \(W_i\) result in identical acceptance/rejection behavior with respect to \(L\). Thus, we can conclude \(W_1\) and \(W_2\) are the equivalence classes of \(\equiv_L\) and thus \(L\) is regular.

Note that these two classes correspond to the following minimal DFA:

Which we recognize as the correct DFA for \(L\).

Rather than using the pumping lemma, we can also use the Myhill-Nerode theorem to show that \(L\) is regular. Suppose that \(L\) is regular, then the theorem says that \(\equiv_L\) has a finite number of equivalence classes. Therefore, we show that \(\equiv_L\) actually has an infinite number of equivalence classes, a *proof by contradiction*.

**Claim 2**. *Consider the language \(L = \{ 0^n1^n \;|\; n \geq 0 \}\). \(L\) is not regular.*

*Proof.* Consider the string \(0^n\) for \(n \geq 0\). For each such string, there is a single extension in which \(L\) contains the resulting string—\(1^n\)—and all other extensions result in strings not in \(L\). Therefore \(\equiv_L\) contains one equivalence class for each \(n \geq 0\) corresponding to the starting string \(0^n\). This is an infinite number of such classes. Therefore \(L\) must not be regular.