From aperiodic to star-free

Mikołaj Bojańczyk

zajęcia / courses

In this page, we show the implication from aperiodic monoids to star-free expressions in the Schützenberger theorem. The implication is stated in the following lemma.

Lemma 0. Let $h : A^* \to M$ be a morphism into a finite aperiodic monoid. Then for every $m \in M$ , the inverse image $h^{-1}(m)$ is definable by a star-free expression.

The rest of this page is devoted to proving the above lemma. Fix the morphism $h$ as in the lemma. Let us use the name type of $w$ for the image $h(w)$ . Stated in this terminology, we need to show that for every $m$ in the monoid, the set of words of type $m$ is definable by a star-free expression. The proof is by induction on the position of $m$ in the $\Jj$ -ordering, with simpler $\Jj$ -classes coming first in the induction. (Recall that simpler $\Jj$ -classes are those of infixes, in particular the simplest $\Jj$ -class is the one that contains the monoid identity.)

Induction base. The induction base is the case when $m$ belongs to the simplest $\Jj$ -class, namely the one which contains the identity. By Lemma 4 in the section on Green’s relations, the $\Jj$ -class of the identity is a group. Since aperiodic monoids cannot contain non-trivial groups, it follows that the $\Jj$ -class of the identity consists only of the identity. This means that the type of a word is the identity if and only if every letter used by that word has the identity type. In other words, the words whose type is the identity are defined by the star-free expression

$\neg (\neg \emptyset \cdot B \cdot \neg \emptyset)$

where $B$ is those letters in the alphabet that have non-identity type.

Induction step. Fix some $\Jj$ -class $J$ in the monoid, and assume that the inverse image $h^{-1}(m)$ is star-free for every $m$ which has a $\Jj$ -class strictly simpler than $J$ . We will now prove that the same holds for every $m \in J$ .

Recall that an $\Hh$ -class is an intersection of an $\Ll$ -class and an $\Rr$ -class.

Lemma 2. In an aperiodic monoid, every $\Hh$ -class has one element.

Proof. Stated differently, the lemma says that if $m,n$ are both $\Rr$ – and $\Ll$ -equivalent, then they are equal. Let $m,n$ be such elements, i.e. there exist some $x,y$ in the monoid such that

$mx = n \qquad yn =m$

Therefore $x^i m y^j$ is $m$ when $j=i$ and is $n$ when $j=i+1$ . If $i$ is large enough, then aperiodicity says that $y^i = y^{i+1}$ and therefore $m=n$ . $\Box$

Define $R_m \subseteq A^*$ to be the set of words which have a prefix whose type is $\Rr$ -equivalent to $m$ . Likewise define $L_m$ to be the set of words which have a suffix whose type is $\Ll$ -equivalent to $m$ . Clearly every word with type $m$ belongs to both $R_m$ and $L_m$ .

Lemma 3. For every $m\in J$ , the languages $R_m$ and $L_m$ are star-free.

Proof. We only do the proof for $R_m$ . For a word $w \in R_m$ , consider the shortest prefix that has the same $\Jj$ -class as $m$ . This prefix exists by membership in $R_m$ , and because the $\Rr$ -class of a monoid element is included in its $\Jj$ -class. The shortest prefix is nonempty, because $m$ is not $\Jj$ -equivalent to the identity, and therefore this prefix has to be of the form $ua$ with $u$ having a type that is strictly $\Jj$ -simpler than $m$ . From the Eggbox Lemma in the section on Green’s relations it follows that the type of $ua$ is $\Rr$ -equivalent to $m$ . From these observations, it follows that

$R_m = \bigcup_{n,a} h^{-1}(n) \cdot a \cdot \neg \emptyset$

where the union ranges over $n$ that are strictly $\Jj$ -simpler than $m$ and letters $a \in A$ such that $na$ is $\Rr$ -equivalent to $m$ . The above is definable by a star-free expression, since the induction assumption can be applied to yield star-free expressions for the languages $h^{-1}(n)$ . $\Box$

The following lemma completes the induction step in the proof of Lemma 0.

Lemma 4. For every $m \in J$ , the set of words with type $m$ is equal to

$L_m \cap R_m - \bigcup_{\substack{n \in M\\a \in A}} L_n \cdot a \cdot (\neg \emptyset)$

where the union ranges over $n,a$ such that $n$ is $\Jj$ -equivalent to $m$ but $na$ is not.

Proof. Let us begin with the left-to-right inclusion. If $w$ has type $m$ , then it clearly belongs to both $L_m$ and $R_m$ . We need to show that $w$ does not belong to the union in the expression from the statement of the lemma. In other words, we need to show that there is no infix of $w$ which is of the form $va$ such that the type of $v$ is $\Ll$ -equivalent to some $n$ such that $n$ is $\Jj$ -equivalent to $m$ but $na$ is not. If there was such an infix, then the type of $v$ could be decomposed as $xn$ for some $x$ in the monoid, and therefore $J$ would be strictly $\Jj$ -simpler than the type of $va$ , and therefore the type of $w$ would be in a different $\Jj$ -class than $m$ . This completes the proof of the left-to-right inclusion in the lemma.

Let us do the right-to-left inclusion. Suppose that $w$ belongs to the expression in the statement of the lemma. Membership of $w$ in $L_m$ and $R_m$ says that the type of $w$ is a prefix and suffix of $m$ . Therefore, by the Eggbox Lemma, if we prove that the type of $w$ of is $\Jj$ -equivalent to $m$ , we will know that the type of $w$ is in the same $\Hh$ -class as $m$ , and therefore it must be equal to $m$ by Lemma 2. So it remains to show that the type of $w$ is in $J$ . But if $m$ were strictly $\Jj$ -simpler than the type of $w$ , then the shortest prefix with type not $\Jj$ -simpler than $m$ would belong to $L_n a$ as in the statement of the lemma. $\Box$

zajęcia / courses

From aperiodic to star-free

Leave a Reply