Promela/SPIN Assignment (2016)

This year's assignment is different from the ones in the previous years. It aims to demonstrate that, despite its inherent limitations, formal verification can sometimes be useful in practical problems. To this end, we will focus on the so-called reliable broadcast problem encountered in distributed systems. Your task will be to model in Promela algorithms developed for the problem and to prove (or disprove) in SPIN certain properties of the algorithms. You will also be required to present your studies in a short report.

Should you have any questions or suggestions, please contact the author of the assignment, Konrad Iwanicki.


Table of contents

  1. Input
  2. Problem
  3. Detailed requirements
  4. FAQ

1. Input

To help you develop your solution (and actually increase the chance that you are able to verify anything), a package of template files has been prepared. It can be downloaded here. We will refer to the files throughout the text below. Attention: Some of the *-test.pml files have changed on March 24. Please, redownload them if you have the old ones.


2. Problem

Let us assume that we have NUM_PROCESSES processes, numbered from 0 to NUM_PROCESSES - 1. The processes share no memory and instead communicate via messages. One particular communication pattern that we consider is broadcasting: a process broadcasting a message wants to have that message delivered to all processes (including itself). However, each process may fail at an arbitrary moment of its execution. Upon failure, the process stops forever and other processes are not aware of this (if necessary, they may try to discover this by communicating with the process). The problem is what delivery guarantees for broadcasting in the presence process of failures we are able to provide and how to provide these guarantees. We will study two algorithms with different guarantees.

2.1. Warm-up: Mailboxes

For inter-process communication, we will utilize so-called mailboxes. A mailbox is a container for messages. Each process owns exactly one mailbox. Each mailbox is owned by exactly one process. To send a message to process B, process A drops the message to B's mailbox by invoking function mb_send(#B, msg), where #B is B's process/mailbox number and msg is a variable containing the message. The mb_send function must never block. To receive a message, process B picks one of the messages from its mailbox by using function mb_recv(msg) where msg is the variable to which the received message is to be written. If there is no message in B's mailbox, the mb_recv function must block; otherwise, it must not block (i.e., the corresponding Promela statement must be executable).

Your task is to model in Promela communication via mailboxes that satisfies all of the following properties and to verify in SPIN that these properties indeed hold:

More specifically, there are three files related to mailboxes:

To sum up, your have to write: (1) in file mailbox-impl.pml, an implementation of mailboxes satisfying the above properties; (2) in file mailbox-ltl.pml, LTL formulas formalizing these properties. The implementation of mailboxes should scale with the number of processes and be as efficient as possible, because it will be used by the broadcast algorithms. The formulas, in turn, should be verifiable in SPIN.

Considering the non-blocking behavior of mb_send, you have to guarantee this behavior only in the test program as well as in the programs for the broadcast algorithms. Mind, however, that in all these programs mb_send must not block even if one increases the value of the NUM_PROCESSES constant in some execution of a program.

Regarding LTL formulas, they have to formalize the above properties only in terms of the test program. In particular, they can be somewhat simplified considering the behavior of the program. Such simplifications must be explained in the report, though.

In any case, the implementation of mailboxes must satisfy the three above properties even for usage patterns different from the one in the test program. You may thus want to use other test programs to verify the properties. However, you need not include these programs in the solution you submit for grading.

When it comes to verification, you have to provide a script for each of the properties (more information below). You are also required to verify all the properties. If you have a state-explosion problem, you may want to narrow down the possible executions of the test program by allowing only one process to be a sender and/or one process to be a receiver. This is done by passing to the spin command either -DONE_SENDER=<proc_no> or -DONE_RECEIVER=<proc_no>, respectively.

2.2. Broadcast algorithms

We will consider two broadcast algorithms: basic broadcast (BB) and reliable broadcast (RB).

BB directly utilizes mailboxes and works as follows:

    bb_broadcast(msg):
      for i in (0..NUM_PROCESSES-1):
        mb_send(i, msg)

    bb_deliver(msg):
      mb_recv(msg)

In contrast, RB is built on top of BB, uses an additional delivery set (rb_dset), and operates as follows:

    rb_init():
      rb_dset = ∅

    rb_broadcast(msg):
      bb_broadcast(msg)
      
    rb_deliver(msg):
      do:
        bb_deliver(msg)
      while msg ∈ rb_dset
      rb_dset = rb_dset ∪ { msg }
      post rb_continuation(msg)
    
    rb_continuation(msg):
      bb_broadcast(msg)          

The rb_init function is invoked once upon a start of a process. In our template code, it is omitted as rb_dset is by default empty upon creation. The post operator, in turn, postpones the execution of the rb_continuation function until the present function, rb_deliver, returns. In effect, rb_continuation is invoked as soon as rb_deliver returns, that is, immediately after a message has been delivered to the application invoking rb_deliver.

Your first task related to the algorithms is to model them under process failures, given the following template files:

2.3. Properties to verify

Your second task is to check which of the properties below each of the two algorithms satisfies:

Again, the properties must be formulated in LTL and verified in SPIN only for the test program. In particular, the program makes a number of simplifying assumptions that minimize the state space. Note that it is also your task to clarify some aspects of the properties.


3. Detailed requirements

Your solution must be submitted as one .tgz file with its name being your login at students (e.g., ab123456.tgz). After unpacking the file, a single directory with its name equal to the login must be created (e.g., ab123456/).

3.1. Directory structure

The directory must have the following structure:

3.2. Report

The report should be in PDF. It should be brief.

For each property, it should contain an explanation of how the property is formalized in LTL, why in that particular way, and what simplifying assumptions the formalization makes.

For the broadcast algorithms, there should also be a table summarizing which properties hold in each of the algorithms: a column of such a table should correspond to an algorithm, a row to a property, and thus a cell should contain information on whether the corresponding property holds for the corresponding algorithm. If there were problems verifying some properties (e.g., a state explosion necessitated using the -DONE_BROADCASTER option), they should be described in footnotes of the table. Similarly, if some property holds for some options passed to pan but not for others, this behavior should be explained. (No trail is needed for such a property.) Finally, for each property that does NOT hold, there should be a description of a counter example contained in the corresponding trail file. (This aims to teach you how to interpret .trail files.)

3.3. Final remarks

Construct your LTL formulas in a structural way. This means that you can use #define to name certain (parametrized) predicates and use these predicates in formulas. You can also combine multiple predicates into larger predicates, and the like. In general, you should try to make the formulas as much self-explanatory as possible.

When verifying the properties, use NUM_PROCESSES selected appropriately. For example, NUM_PROCESSES = 1 is not appropriate to verify many of the properties.

Try to verify the properties without the options narrowing the search space. Only if this fails, use these options.

Make your solution consistent with the above requirements. Solutions not meeting these requirements may be refused to be graded!

If you discover that you would like to study distributed systems in more depth, consider registering to the Distributed Systems course next year.


FAQ

All questions should be directed to Konrad Iwanicki.

I really need to modify a *-test.pml file. Am I allowed to?
No. If you really feel that you should modify the file, make a copy with a different name and explain in the report why you created the file and which properties you verify on the modified file.
The mb_at_most_once_delivery property and the set operations in RB require an equality test on messages. When are two messages considered equal?
In both test programs, what we use is the one-byte message contents. For mailboxes, we assume that the mb_send function adds to each such contents a globally unique number. This implies that each invocation of mb_send conceptually generates a unique message. Put differently, even if the same byte of contents is used in two invocations of mb_send, two different messages are sent. By symmetry, the mb_recv operation conceptually stripes this unique number off. This means that when comparing the messages in RB, we actually compare only their contents. This is perfectly in line with our goal for broadcasting: we want the contents to be delivered to all the processes, not a particular message with these contents.
I am not allowed to modify *-test.pml files but want to test the properties with different values of NUM_PROCESSES. How to do this?
Pass option -DNUM_PROCESSES=<N> to spin.
What can I assume about the “different access patterns” for mailboxes under which their operation must remain correct?
You may assume that nobody modifies your internal structures used to represent mailboxes. In other words, the only way to access mailboxes is the two functions. For the three properties, you should not assume anything about the invocation patterns of these functions, though.
When is a process considered correct and when faulty?
A process is considered faulty if it ever enters an appropriate failure state; otherwise, it is considered correct.

Copyright © 2016, Konrad Iwanicki Last modified: 25/03/2016