Two-way transducers

Mikołaj Bojańczyk

zajęcia / courses

Two-way transducers

A deterministic two-way transducer consists of:
• an input alphabet $\Sigma$ ;
• an output alphabet $\Gamma$ ;
• a set of states $Q$ with a distinguished initial state and final state;
• a transition function

$\delta : Q \times (\Sigma \cup \set{\vdash,\dashv}) \to Q \times \set{-1,0,1} \times \Gamma^*$

The semantics of the transducer are defined similarly to Turing machines, as follows. (Actually, the model is equivalent to a Turing machine where there is one read-only input tape and one write-once output tape.) If the input word is $a_1 \cdots a_n$ , the we embellish it with start and end markers as follows:

$\vdash a_1 \cdots a_n \dashv$

The head of the automaton is then placed over the start marker $\vdash$ with the initial state. (For two-way automata, the head is over a letter, as opposed to one-way automata, where the head is between letters.) At any given moment, the automaton applies its transition function to its current state and the symbol under the head, yielding a new state, a direction to move the head, and some output letters that to be appended to the output. The output letters are used in chronological order, i.e. those which are output at the beginning of the run are at the beginning of the output, regardless of the position of the head when executing the transition. The run of the automaton might fail, either by moving out of the word (i.e. doing a $-1$ move on the start marker $\vdash$ , or doing a $1$ move on the end marker $\dashv$ ), or by entering an infinite loop; such failing runs do not produce any output, and therefore the semantics of the automaton is a partial function from $\Sigma^*$ to $\Gamma^*$ .

The run of the automaton ends when the final state is reached. If the final state is never reached, or if the automaton moves out of the input tape (by crossing a marker $\vdash$ or $\dashv$ in the wrong direction), then the automaton produces no input, i.e. its value is undefined. Therefore, the semantics of the machine is a partial function.

Typical things that can be done using a two-way transducer are duplication or reversing the input.

Past oracle

A two-way automaton with a past oracle is defined the same way as above, with the following difference. There is an additional DFA, which is called the past oracle. If the states of the past oracle are $P$ , then the transition function of the two-way automaton which uses this past oracle is of the form

$\delta : Q \times P \times (\Sigma \cup \set{\vdash,\dashv}) \to Q \times \set{-1,0,1} \times \Gamma^*$

The following theorem shows that past oracles can be eliminated. Other features that can also be eliminated are nondeterminism and future oracles.

Theorem. For every two-way automaton with a past oracle, there is a two-way automaton (without any oracle), which defines the same partial function from words to words.

Proof. The idea for this proof comes from Hopcroft and Ullman. Suppose that we have a two-way automaton with a past oracle, such that the states of the two-way automaton are $Q$ and the states of the past oracle are $P$ . The general idea for the simulation is straightforward: the simulating automaton knows both the state $q \in Q$ of the two-way automaton and the state $p \in P$ of the oracle. The question is how to maintain this information.

The key insight is to consider the graph which describes the states of the past oracle and how they are updated by the transition function of the past oracle. This graph looks likes this:

The vertices of the graph are configurations of the past oracle, i.e. pairs (state of the past oracle, column between positions in the word), and the edges correspond to transitions of the automaton. We number the columns beginning with 1. Because the past oracle is a deterministic automaton, the graph is a forest.

Let us define $p_i$ to be the state of the past oracle when in the $i$ -th column, i.e. after reading the first $i-1$ letters of the input word. The two-way automaton which uses the past oracle uses the state $p_{i}$ to make its decision when its head is over the $i$ -the position. Suppose that the head of the two-way automaton is over some position $i$ in the input word, and the state $p_{i}$ of the oracle is known, as indicated by a red circle in the following picture:

We want to show that the state of the oracle can be maintained when doing one transition of the two-way automaton. If the transition of the two-way automaton does not move the head, there is no problem. If the transition of the two-way automaton moves the head to the right, there is also no problem, since the transition function of the past oracle can be simply applied.

The issue is when the two-way automaton wants to move the head to the left, and we need to compute the state $p_{i-1}$ . Here is the solution. In terms of the forest in the pictures above, the state $p_{i}$ corresponds to the configuration $(p_{i},i)$ in the forest. We are going to inspect the subtree of this configuration, which contains the initial configuration $(p_1,1)$ , and therefore goes all the way to the left side of the input.

We start by moving the head one step to the left, which identifies all possible candidates for the predecessor configurations. Here is the picture, with the candidates being coloured yellow:

If there is only one yellow configuration, i.e. only one candidate for the predecessor, then we are done. The more interesting case is when there is more than one yellow configuration. In this case, we keep moving to the left, and colour all descendants of the yellow configuration (and therefore of the red configuration as well) using a green colour.

Two things may happen. In one case, we might reach a moment when all green configurations are descendants of a unique yellow configuration, as in this picture:

In this case, the unique yellow configuration is the one that we want to compute. The question is how to return to this unique configuration? The solution is this: since we have stopped in the first column when the yellow configuration was uniquely identified, in the previous column there were at least two candidates for the yellow configurations. Therefore, if we remember the previous column, we can start moving to the right until the first time that the whole subtree reaches a single node, this way we will reach the red configuration again. But in our state we can keep the yellow configuration that was the actual predecessor.

The remaining case is when we reach the first column at the beginning of the input. Here we do the same trick to return to the red configuration, and we can keep in our state which branch of the subtree corresponds to the computation of the past oracle. $\Box$

zajęcia / courses

Two-way transducers

Two-way transducers

Leave a Reply