Mikołaj Bojańczyk

Here we study weighted automata which are finite, in the sense that the input alphabet is finite and the state space is of finite dimension. We also assume that the field is the field of reals.

A finite automaton can be represented in a finite way, call this the matrix representation. The state space must be isomorphic to \field^n, since these are the vector spaces of finite dimension. Therefore, it suffices to store n. For each input letter a, the transition function is a linear function

    \[\delta_a : \field^n \to \field^n\]

which can be stored as a matrix. (Here we assume that the entries of the matrix can be represented, which means either that we restrict to rational numbers, or use some computation model that can deal directly with real numbers.)

 


 

Computing reachable states

Let us begin with a simple algorithm for finite weighted automata – we want to compute the linear combinations of reachable states. This is a simple saturation procedure. We begin with Q_0 \subseteq Q being the singleton of the initial state. Then, assuming that Q_i has already been defined, we define Q_{i+1} to be the vector space spanned by

    \[Q_i \cup \bigcup_{a \in \Sigma} \delta_a (Q_i)\]

This way we get a growing chain of linear subsets

    \[Q_1 \subseteq Q_2 \subseteq \cdots \subseteq Q.\]

Since the dimension cannot grow indefinitely, this sequence must stabilise at some point, and this point is the set of reachable states. Furthermore, if the original automaton is given by a matrix representation, then one can compute the sets Q_i.

 


 

 

Computing automaton equivalence

Here is a corollary of the reachability algorithm. Suppose we want to test if two automata define the same function. We can define the product automaton, with states being pairs of stats from the two automata, and the output function being defined as

    \[(q_1,q_2) \qquad \mapsto \qquad F_1(q_1) - F_2(q_2)\]

with F_1,F_2 being the output functions of the two original automata. In the product automaton we compute the linear combinations of reachable states. The automata were equivalent if and only if, when restricted to those states, the new output function is zero everywhere.

 


 

 

Computing the minimal automaton

Here we show that minimisation is effective, i.e. if one gets a finite weighted automaton on input, one can produce (even in polynomial time) the minimal automaton. Observe that if a function is recognised by a finite dimensional weighted automaton, then its syntactic automaton has finite dimension. This is because linear functions cannot increase dimension.

Consider a weighted automaton \Aa with a state space Q of finite dimension. For a number n \in \Nat, we define states q,p to be n-equivalent if

    \[qw = pw\]

holds for all input words of length at most n, where qw if the output of the automaton after reading word w assuming that the initial state was changed to q. This equivalence relation can be seen as a subset of

    \[E_n \subseteq Q \times Q.\]

. By linearity of the automaton, the subset is linear, i.e. it is closed under + and multiplying by scalars. Therefore we have a sequence of subsets

    \[Q \times Q \supseteq E_0 \supseteq E_1 \supseteq E_2 \supseteq \cdots\]

which are linear. Since each subset has a dimension, which is at most double the dimension of Q, the sequence above must stabilise at some equivalence relation, call it  E_*, which is the Myhill-Nerode equivalence relation. Furthermore, a representation of this E_* can be computed in time polynomial in the dimension of the original automaton. The remaining description is essentially book-keeping: we prove that there is a well-defined quotient automaton, and that a matrix representation of it can be computed based on a matrix representation of the original automaton.

Let [Q] be the set of equivalence classes of Q with respect to E_*, and let

    \[h : Q \to [Q]\]

be the function which maps a state to its equivalence class. Because E_* is an equivalence relation and a linear set, the set [Q] is a vector space, and h is a linear function. What is the dimension of [Q] and how do we represent it and the function h, assuming that Q = \field^n for some n? One solution is the following. Begin with B being some basis of Q, e.g. the n vectors which have 1 on a unique coordinate. Then, iterate the following: check if there is some b \in B which is equivalent under E_* to a linear combination of other elements of B. If there is no such b, then return B, otherwise remove one such b from B and continue the process. At the end we get a subset B \subseteq Q, such that every element of Q is equivalent under E_* to a linear combination of vectors from B. In particular, [Q] is isomorphic to \field^B. Out of this process we also get a matrix representation of the function h, seen as a linear function \field^n \to \field^B.

If we take two states that are equivalent under E_*, and apply to them a transition function \delta, then the results are also equivalent. This means that for every input letter a there exists a function [\delta]_a which makes the following diagram commute:

    \[\xymatrix{Q \ar[d]_h \ar[r]^{\delta_a} & Q \ar[d]^h \\ [Q] \ar[r]_{[\delta]_a} & [Q]}\]

The function [\delta]_a is also linear, because its graph, as a subset of [Q] \times [Q], is simply the image of the graph of \delta_a under h applied coordinatewise. In particular, we can compute a matrix representing each function [\delta]_a.  A similar argument proves that there is a linear function [F] which makes the following diagram commute

    \[\xymatrix{Q \ar[dr]^{F} \ar[d]_{h} \\ Q \ar[r]_{[F]} & \field}\]

and a matrix representing it can be computed.

This finishes the description of the algorithm for minimising finite weighted automata.

 

 

 

Leave a Reply

Your email address will not be published.