As a very simple example, let's define an instruction set architecture
with just one instruction, performed on values in a register file.
Each instruction has two source operands and a destination operand.
Thus, an opcode consists of three fields - `srca`, `srcb`
and `dst`. For simplicity, we'll make the operation addition.
Here's what the ISA model might look like:

scalarset REG undefined; typedef WORD array 0..31 of boolean; module main() { r : array REG of WORD; srca, srcb, dst : REG; opra, oprb, res : WORD; opra := r[srca]; oprb := r[srcb]; res := opra + oprb; next(r[dst]) := res; }

We've declared a type `REG` to represent a register index, a type
`WORD` to represent a data word (in this case a 32 bit
word). Notice that `REG` is a undefined scalarset. That is, we
don't say, for the moment, how many registers there are.

Notice, also, that we've given names to the operand values `opra`
and `oprb`, and to the operation result `res`. It wasn't
necessary to do this. That is, we could have written:

next(res[dst]) := r[srca] + r[srcb];

This would have been more concise. However, it's convenient to give the intermediate quantities names, since we will use these later in writing refinement relations. Now let's implement this abstract model with a simple 3 stage pipeline, where the first stage fetches the operands, the second stage does the addition, and the third stage stores the result into the register file. The implementation has a register bypass path that forwards the results directly from later stages of pipe to the operand fetch stage.

/* the implementation */ /* implementation register file */ ir : array REG of WORD; /* pipe registers */ stage1 : struct { valid : boolean; dst : REG; opra, oprb : WORD; } stage2 : struct{ valid : boolean; dst : REG; res : WORD; } /* read stage : fetch operands with bypass */ next(stage1.opra) := case{ stage1.valid & srca = stage1.dst : alu_output; stage2.valid & srca = stage2.dst : stage2.res; default : ir[srca]; }; next(stage1.oprb) := case{ stage1.valid & srcb = stage1.dst : alu_output; stage2.valid & srcb = stage2.dst : stage2.res; default : ir[srcb]; }; next(stage1.dst) := dst; init(stage1.valid) := 0; next(stage1.valid) := 1; /* alu stage: add operands */ alu_output : WORD; alu_output := stage1.opra + stage1.oprb; next(stage2.res) := alu_output; next(stage2.dst) := stage1.dst; init(stage2.valid) := 0; next(stage2.valid) := stage1.valid; /* writeback stage: store result in r */ if(stage2.valid) next(ir[stage2.dst]) := stage2.res;

Note that each stage has a `valid` bit, which says
whether there is an instruction in it. Initially, these
bits are zero.

Now, we would like to write two refinement maps - one which defines
the correct operand values in `stage1` and the other which
defines the correct result at the adder output. To do this, we
add some auxiliary state information to each stage that remembers
the correct operand and result values for the given stage, as
computed by the abstract model. Let's add the following component
to `stage1` :

stage1.aux : struct{ opra, oprb, res : WORD; }

Now, let's add some code to record the correct operand and result values for the first stage:

next(stage1.aux.opra) := opra; next(stage1.aux.oprb) := oprb; next(stage1.aux.res) := res;

That is, we simply record the abstract model's values for
`opra`, `oprb` and `res`. Note, this is why we
gave them explicit names in the abstract model. This is all
the auxiliary information we'll need to state our refinement
relations. However, for e deeper pipeline, we could just
pass the auxiliary information down the pipe along with
the instructions, as follows:

next(stage2.aux) := stage1.aux; ...

Now, we can state the two refinement maps in terms of the auxiliary state information. For the operands, we specify that, if stage 1 has a valid instruction, then its operands are equal to the correct operand values:

layer lemma1: { if(stage1.valid) stage1.opra := stage1.aux.opra; if(stage1.valid) stage1.oprb := stage1.aux.oprb; }

For the ALU results, we specify that, if stage1 has a valid instruction, then the ALU output is equal to the correct result value:

layer lemma2: if(stage1.valid) alu_output := stage1.aux.res;

We would like to show, of course, the correct operands imply correct
results, and conversely, correct results imply correct operands.
However, since we have an arbitrary number of registers to deal with,
we'll need to break `lemma1` into cases as a function of which register
is being read. The only problem we have in doing this is that we don't know
which registers were the source operands for the instruction in stage
one, because our implementation does not store this information. This
problem is easily solved, however, since we can store the information
in our auxiliary state. So let's add two components to the auxiliary
state:

next(stage1.aux.srca) := srca; next(stage1.aux.srcb) := srcb;

Of course, we have to remember to declare these components
in our auxiliary structure (their type is `REG`).
Now, we split the operand refinement maps into cases based
on which are the actual source registers of the instruction
in stage 1. For the `srca` operand, we have:

forall(i in REG) subcase lemma1[i] of stage1.opra//lemma1 for stage1.aux.srca = i;

Similarly, for `srcb`, we have:

forall(i in REG) subcase lemma1[i] of stage1.oprb//lemma1 for stage1.aux.srcb = i;

This way, we only have to consider one register at a time, so we can
reduce an arbitrary number of registers to just one, for each case.
Note, we don't need to do this for `lemma2`, the result refinement
maps, since it doesn't depend on the register file. It depends only
on the operands.

Now we're ready to prove the various cases of our lemmas. For `lemma1`,
we say:

forall(i in REG) using res//free, alu_output//lemma2 prove stage1//lemma1[i];That is, we assume that the ALU output is correct, and show that (future) operands we obtain are correct. Notice that there are several paths that an ALU result might take to get back to the operand registers in stage 1. It might follow the bypass path, or it might get stored in register

To prove the result lemma (`lemma2`), we assume that operands
entering the ALU are correct:

using opra//free, oprb//free, stage1//lemma1 prove alu_output//lemma2;Note, in this case, we don't care what the correct operands actually are - we only care that the abstract model and the implementation agree on them (

Now, run this example. You'll notice that there are 32 instances to prove for each of

stage1.opra[i]//lemma1[0] stage1.oprb[i]//lemma1[0] alu_output[i]//lemma2where

stage1.opra[0]//lemma1[0]and try to verify it. You should get a counterexample. In this counterexample, the initial value of

When there is a nondeterministic choice in an abstract model, we sometimes have to provide a ``witness function'' for this choice. That is, as a function of the implementation behavior, we plug in a suitable value in the abstract model. In this case, since the initial value in the specification is complete undefined, we are free to plug in any value we like. So let's write the following:

init(r) := ir;

That is, we just set the initial value of the abstract model register file to be the same as the initial value of the implementation register file. You might be wondering why we have to do this. That is, why can't SMV figure out what the correct initial value of the register file is. The answer is that it could, for any given property. However, it might use different initial values to prove different properties. As a result, even though we would have ``verified'' all the properties, there would be no single choice that makes all the properties true. Thus, for reasons of soundness, SMV requires you to fix the choice once and for all, and then verifies all the properties for the particular choice you make.

In any event, let's open
the new version, with the witness function,
and try again to verify `stage1.opra[0]//lemma1[0]`. You should find that
the property is true. Look in the Cone pane, and observe that it contains
only 11 boolean state variables. This is because we are considering only
registers `r[0]` and `ir[0]`, and only bit 0 of the data path.
We obtain only bit 0 of the data path since neither the abstract model
ALU nor the implementation ALU is in the cone. The former was eliminated
by freeing `res`, while the latter was eliminated by using `lemma2`
to drive the ALU output in the implementation.

Now select property `alu_output[0]//lemma2`. The cone is rather large
in this case (66 state variables) because bit 0 depends in this case on
all the other bits of the data path through the ALU. (This is because
bit 0 is the most significant bit,and depends on all the others through the
carry chain). However, notice the register files are not in the cone in this
case, because we have freed `opra` and `oprb`, and we have driven
the implementation operand registers using `lemma1`.

Go ahead and verify property `alu_output[0]//lemma2`. You should find that
it checks fairy quickly in spite of the large number of state variables.
This is because the ALU operation is addition, and SMV succeeds in finding
an ordering of the BDD variables that makes the addition function compact.
In fact, select `Prop|Verify All` to verify all the remaining properties.
On my machine, this takes a little under eight seconds.

On the other hand, if we had had a multiplier in the ALU the story would have been different. This is because there is no BDD variable ordering that makes this function compact. The verification of multipliers is beyond the scope of this tutorial. There is, however, a way of separating the problem of arithmetic verification from the processor verification problem. In this way, we can verify the processor design independent of the ALU function. Then we can plug in any ALU function we like.