IEEE

©2001 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

Return to Table of Contents


Emphasizing Formal Analysis In a Software Engineering Curriculum

Ann E. Kelley Sobel


Abstract - The integration of formal method application throughout a six course software engineering curriculum is outlined. Formal analysis skills were included in order to increase the complex program solving skills of the student. The five instruction-oriented courses presented highlight how formal analysis was introduced in and applied to the corresponding subject material. The materials presented, along with an accounting of the experiment, provide a basis for other academicians to teach formal analysis at their own institutions.


 

1. Introduction

     

            A project that integrated formal methods into a software engineering curriculum[1] at Miami University’s Computer Science and Systems Analysis Department is outlined.  This project endeavored to address the numerous articles in the current literature expressing the belief that the current educational program offered for a computer science degree does not stress the fundamental mathematical and engineering principles that should form the basis of software engineering [5,6,8,10,11]. A survey of computer science faculty attitudes toward formal methods found that faculty believe that formal methods are not only important, due to their disciplined approach to computer science, but are feasible to teach [9].  Among formal methods advocates, a common belief that the use of formal analysis during software development produces programs that have fewer errors is held.  Despite this evidence, few undergraduate courses exist that teach the formal analysis of software and virtually no programs exist that attempt to base several standard core curriculum courses on the understanding and application of formal analysis during the construction of software.

 It is for these reasons that an undergraduate software engineering curriculum would benefit from the integration of formal analysis.  The experimental curriculum included the previously existing core curriculum courses of data structures, object-oriented software design, software engineering, and a senior capstone course that were modified to include formal analysis application.  This curriculum also included two new courses designed to teach the skills necessary to apply formal methods; one emphasizing program derivation and the other emphasizing the analysis of concurrent programs. Students were required to have taken a discrete mathematics course at the end of their freshman year before they entered the experimental curriculum. 

Two different formal methods were used throughout the experimental curriculum.  The formal analysis of strictly sequential code followed Dijkstra’s weakest precondition semantics [3], while the analysis of concurrent programs used a trace-based operational formal method [12].  The latter formal method was chosen since it is relatively easy to learn, due to its operational nature and reliance on only prior knowledge of first-order logic.  These characteristics were particularly important given the student’s lack of software development experience and their limited exposure to only first-order logic.

This paper provides only a sampling of examples, examination questions, and programming assignments. The full accounting of all materials can be found at http://www.eas.muohio.edu/san/formal.  The evaluation of the experimental curriculum provides evidence of the benefits of undergraduate students learning to use formal analysis during software construction.  The materials presented along with an accounting of the experiment provide a basis for other academicians to teach formal analysis at their own institutions.

Evidence was gathered throughout the three-year period in order to establish the goals of the experiment; namely, undergraduate students are able to learn formal analysis techniques, the feasibility of teaching formal analysis, and the use of formal analysis increases the student’s complex problem solving skills.  In order to control as many variables as possible, the experimental and standard core courses were taught by the same instructor using a common text.  Mentors were given strict instruction to aid students only in the understanding of formal methods and its application.  Students in the standard and experimental versions of a particular course experienced the same examinations and programming assignments.  The analysis of the evidence establishes that all three goals of the experiment were met. 

An overview of the experimental curriculum is presented in section 2 and expanded in sections 3,4,5, and 7.  The basics of formal specification and program derivation are outlined in section 3.  The use of coupling invariants to establish a valid implementation of an ADT is presented in section 4.  Section 5 highlights the application of formal analysis during object oriented development.  The introduction of the trace-based specification model used in the subsequent courses appears in section 6.  The application of this specification model in the context of concurrent software systems is presented in section 7.  Section 8 discusses the senior year software development project.  The qualitative and quantitative results that assess the success of meeting the goals of the experimental curriculum are listed in section 9.   Lastly, a summary is presented in section 10. 

 


2. Overview of the Curriculum

 

The established curriculum for the department that formed the basis from which the experimental curriculum was taught is a blend of computer science, software engineering, and operations research courses.  The experimental course track included inserting formal analysis into four of the existing software engineering courses and two new courses that dealt solely with the teaching of formal methods.  The sequence of the courses and their status was as follows.

 

                                                Course Sequence and Status

                   Course Title                                                      New                 Existing

                        Logical Foundations of Programs                         *

                        Data Structures                                                                              *

                        Object-Oriented Software Design (OOD)                                      *

                        Formal Analysis of Concurrent Programs             *

                        Software Engineering                                                                     *

                        Software Engineering Senior Project                                              *    

 

The first course offered in the experimental curriculum emphasized program derivation.  Students were taught to specify programs using first-order logic and Cohen’s specification notation [2], and to represent programs using Dijkstra’s guarded command code [3].  Proofs were created manually and used both Hoare triples and the wp predicate transformer [4].  The programs examined in this class were typically quite small, usually fewer than fifteen lines of code.

The Data Structures course taught typical abstract data types (ADTs) which included lists, stack, queues, and trees.  Programming assignments were of relatively short length and emphasized the use of a particular ADT as part of the application.

The OOD course taught a variety of design methodologies (e.g., UML) and gave group projects that focused on their use. The projects are traditionally Windows-based and the size of the graphical interface code (written using Microsoft Foundation Classes) clearly dominates the total size of the application code.

The Formal Analysis of Concurrent Programs course centered on the different modes of interaction between concurrent tasks/threads/processes.  For the first time, students were faced with implementation code that produced multiple execution sequences coupled with the enforcement of synchronization.  Students applied the trace-based operational formal method to specify and verify general problems that relied on multiple processes communicating by means of remote procedure calls, shared variables, and message passing.

As part of the department’s core requirements, a student experiences two courses that cover the entire software engineering life cycle during their senior year.  The first course, Software Engineering, addresses feasibility study, prototyping, and design of a software system.  The second senior course, Software Engineering Senior Project, assigns group projects with the expectation of a demonstration of a completed prototype of a software application at the end of the course.

Any student who participated in one of the existing experimental curriculum courses had to learn both the material that was commonly taught as well as how to apply formal analysis to any software application in that course.

 


3. Logical Foundations of Programs

 

The basic skills needed to apply formal analysis were taught in a new course, Logical Foundations of Programs. This course covers mathematical induction, well-foundedness, specification, derivation, and the theory of the correctness of programs. The text of Cohen, Programming in the 1990s, was chosen because the presentation of this material was not coupled with an implemented programming language; the use of the guarded command language [3] gives the student the freedom to translate this language into a variety of contemporary programming language constructs. The students were motivated to gain experience in performing logical deduction by the assignment of Smullyan logic puzzles.

A simple example highlighting the derivation process follows. It originally appeared in an out of class quiz.  Derive the guarded command code from the following specification without using the absolute value function.

                                             var x,z: int

                                             ;  z : z = |x|

 

We must first note that the precondition for this specification is true and the postcondition is stated in the last line.  This postcondition notes that only the variable z can be modified and the program terminates with z = |x|.  Using the weakest precondition semantics (wp) [4], a statement, S, must be found that satisfies the following assertion.

true wp.S.z = |x|

 

When deriving the statement S, comments that state the justification for each line are connoted using the notation, *  < comment >.  Continuing with the derivation of S, if we choose S to be z := x, we get

                                                wp.z := x.z = |x|

                                          *  < definition of := >

                                          x = |x|

                                          *  < definition of absolute value >

                                          x 0

 

as the precondition which is not equivalent to true.  If we choose S to be z := -x, we get

                                                wp. z := -x. z = |x|

                                          *  < definition of := >

                                          -x = |x|

                                          *  < definition of absolute value >

                                          x < 0

as the precondition which is not equivalent to true.  However, we note that by combining these two assertions, x 0 x < 0, we get the equivalent of true.  This observation, and the fact that we started with identical postconditions in the two statement analysis above, permits the combination of the two statements into a single if statement:

if  x 0  z := x  []  x < 0  z := -x  fi

 

For a more complex example of program derivation, the following specification requires the derivation of a program that contains a loop.

                              var f: array of int {F = #f}

                              ; var s: int

                              ; s: s = (i. 0  i < F. f[i] )

 

Recognizing that this summation will probably require a loop ({P} do BS od {R}), we must determine a loop invariant (P) and a guard (B).  Using the technique of replacing a constant by a fresh variable, the postcondition can be rewritten as follows.

s = (i. 0  i < n. f[i] )    0  n F     n = F

 

The first two clauses will be used for the loop invariant and the modified last clause, n  F, will represent the guard.  The postcondition is partitioned in this manner so that P  B   R, which guarantees that the postcondition (R) holds upon termination.  Next, the loop invariant must be shown to hold before execution of the loop begins. This requires finding a Q so that

                                                Q   wp. So . P

 

Clearly, So must contain some assignment of s and n.  The weakest precondition semantics can be used to calculate the necessary values.

                              wp. s,n := E,G. s = (i. 0  i < n. f[i] )    0  n F  

                               < definition of := >

                              E = (i. 0  i < G. f[i] )    0  G F 

                         < choose G=0 to establish an empty range >

                        E = (i. 0  i < 0. f[i] )    0  0 F 

                   * < empty range, F = #f from Q, therefore F  0 >

                        E = 0

 

                        So :  s,n := 0,0

 

The final step requires the calculation of S.  Again, S must contain some assignment of s and n.

                        wp. s,n := H,I . s = (i. 0  i < n. f[i] )    0  n F

                        *  < definition of := >

                              H =  (i. 0 i < I. f[i] )    0  I F

                              * < choose I = n+1 >

                              H =  (i. 0 i < n+1. f[i] )    0  n+1 F

                              * < algebra, range split >

                              H =  (i. 0 i < n. f[i] ) +  f[n]    0  n+1 F

                              * < s = (i. 0  i < n. f[i] ) >

                              H = s + f[n]   0  n+1 F

                              * < n  F   0  n F (given in precondition), algebra >

                              H = s + f[n]

Therefore,

                              S :  s,n := s + f[n], n+1

 

Putting all the pieces together, the derived program follows.

                              { F = #f }

                              s,n := 0,0;

                              do  n  F   s,n := s + f[n], n+1  od

            { s = (i. 0  i < F. f[i] ) }

 

Classroom instruction was supplemented with several in class problem solving sessions that were supported by three undergraduate mentors. Students were evaluated by their performance on both in and out of class quizzes. In class quizzes predominantly tested their ability to perform textual substitution-based proofs while out of class quizzes extended their experience performing program derivation.  A major project in the course centered on the derivation of a guarded command program from a first-order logic specification. Students had to verify that their program did indeed satisfy the given specification. Their programs were then translated into the C++ language. One student believed he was such a strong C++ programmer that he could generate a “better”' C++ program without using the guarded command program. Ironically, his C++ solution had several errors whereas his guarded command program had none.

 


4. Data Structures

 

            The Data Structures course applied the formal analysis skills gained in the prior program derivation course to the development of abstract data types (ADT).  ADT specifications were written using sequences of operations performed and the corresponding effect of performing those operations on an instance of the ADT.  Correspondence between ADT operations and a sequence of implementation type operations were maintained using coupling invariants.  Many implementations commonly used by software engineers, e.g., a list implemented using an array, were proven to correspond to the ADT.  Both the instructor and the students were surprised to learn that the average length of proof for the correspondence of just one ADT operation was one and one half pages of hand written text.  However, after completing such an exercise, there was no question as to the correctness of the implementation of a particular ADT.

Before introducing an example of establishing that a particular implementation satisfies the semantics of an ADT, the necessary sequence operations are given.  For these definitions, s is a given sequence of type T (s  T*).

 

Operation

Definition

#s

Length of s

s ^ x

x is concatenated to s where xT

s[k]

kth element of s (zero-based)

s/<i,j>

restriction of the elements of s to the ith through jth element

s-

#s = 1 

#s > 1  s/<0,...,s[#s-2]>

 

 

            The following example illustrates the steps of establishing the correspondence between the operations for a stack and a list (sequence) based implementation of a stack.  The axioms defining the semantics of a stack follow where T is any type.

 

                                                   initialize  empty                                           (A1)

                              x  T.  push[x]     empty    top = x                        (A2)

                                             *empty  pop                                           (A3)

                           x  T.  push[x]; pop    skip      // no op                       (A4)

 

The sequence based definitions for the referenced stack operations in the group of axioms are given as follows.

 

                           st    T* :=  < >   // st is a sequence of T values that is initialized to < >

                           push[x]     x  T    st  =  st  ^  x

                           empty       # st  =  0

                           top            empty    st[#st-1]

                           pop            empty    st  =  st-

 

Some of these definitions use of the “tick” notation, e.g., st’, to refer to the value of the sequence st after a particular sequence operation has been applied.

To demonstrate that the list-based implementation maintains the semantics of the push axiom given above (A2), one must prove:

                           st =  st  ^  x       #st 0    st’[#st’–1] = x  where x  T

 

which is the original push axiom with each stack operation replaced by its list implementation. Suppressing the details of the proof, one must establish the following first and second facts, (1) and (2) respectively, in order to verify the first and second clauses found on the right hand side of the implication.

(1)  #st  =  #st  +  1      #st   N           #st 0

(2)  st’ = st ^ x   st’/<0,…,#st’-2> = st     st’[#st’-1] = x    where x  T

 

Continuing with the pop axiom (A3) and substituting each stack operation with its corresponding list implementation, we can demonstrate that the list based implementation maintains the semantics of the pop axiom by proving:

#st  0     st-

  

which is true since st- is defined for #st0.

Lastly, to demonstrate that the list based implementation maintains the semantics of A4, one must prove:

( st’ = st ^ x;  st = st- )    st

 

Establishing the following two facts allow the conclusion that the left hand side of the equivalence is indeed st.

(1)    #st 0   st’/<0,…,#st’-2> = st      st’[#st’-1] = x 

(2)    #st 0      st-  =  st’/<0,…,#st’-2>

 

All material covering the axiomatic semantics of ADTs and the use of coupling invariants had to be generated by the lecturer. Unfortunately, the formal methods students experienced less lecture time on the standard data abstraction material so that time could be spent on writing specifications. At times, we were forced to meet outside of regularly scheduled class time.  Some students found this pace grueling and made their frustrations known. Evaluations consisted of take home quizzes that centered on coupling invariants, in class programming problems, and several out of class programming assignments that were based on standard applications of ADTs.

 


5.  Object-Oriented Software Design

 

For the first time, students were challenged to write the formal specification of a medium-sized programming problem before writing any line of implementation code. Certainly, they already had a full year of experience doing such a task but never on a sizable problem. Students found it difficult to logically analyze nontrivial problems before writing a single line of code.   After gaining more experience, they discovered that if one thoroughly analyzed the problem to a point that they could write the specification of it, the problem was indeed well understood by the developer.

A major project that was assigned during the course was the development of an elevator controller.  This assignment was chosen mainly for its familiarity to most people, but it was also selected for comparison to several existing specifications of the same problem. It is included here to demonstrate the size and complexity of the programming problems that the students could formally analyze.  The informal requirements consisted of a list of expected behaviors of the elevator. 

 

1.      Maintain a set of buttons that have been pushed inside the elevator (requested floor number)

2.      Maintain a set of buttons that have been pushed outside the elevator (floor number at which passenger is waiting and requested direction)

3.   Halt when elevator has no more requests to service

4.   Service all outside requests eventually with all floors given equal priority

5.      Service all inside requests eventually with all floors being serviced sequentially in the direction of travel.

 

The specification of the elevator scheduler follows.  In words, the elevator’s new direction of movement is based on whether there exists a reason to continue in its current direction.  The data structures used to maintain the current state of the elevator are the current floor (CurFloor), current direction of travel (CurDir), the set of requests internal to the elevator (InsideReqSet), and the set of requests external to the elevator (OutsideReqSet). The two predicates, ReasonToGo and PickupDropoff, evaluate whether such a reason exists.   ReasonToGo determines whether it is necessary for the elevator to continue its movement in the current direction and PickupDropoff determines the necessary changes to the inside and outside request sets as passengers are picked up and dropped off at different floors throughout the building. 

 

var CurFloor : int

;var CurDir : direction {CD = CurDir}

;var InsideReqSet : set of int {I = InsideReqSet}

;var OutsideReqSet : set of OutsideReq{O = OutsideReqSet}

;CurDir, InsideReqSet, OutsideReqSet:  ( CD = HALT

 (ReasonToGo(UP)  CurDir=UP  PickupDropoff(UP))

      *  (ReasonToGo(DOWN)  CurDir=DOWN  PickupDropoff(DOWN))

      * ((ReasonToGo(UP) *ReasonToGo(DOWN)) CurDir=HALT) )

*( CDHALT

*  (ReasonToGo(CD)CurDir=CDPickupDropoff(CD))

* (            ReasonToGo(CD) ReasonToGo(Rev(CD)) CurDir=Rev(CD)

            * PickupDropoff(Rev(CD)))

      *((ReasonToGo(CD) *ReasonToGo(Rev(CD))) *CurDir=HALT

            * PickupDropoff(HALT)) )

where

            Direction                                                                           {UP,DOWN,HALT}

 

Rev(Dir)                                                                            UP             if  Dir=DOWN

                                                                                          DOWN                   if  Dir=UP

                                                                                          HALT                    if  Dir=HALT

 

OutsideReq                  (floor:int, dir:Direction)

 

ReasonToGo(Dir)        (i.0I<#I. (I.i > CurFloor  Dir=UP)

                                          *        (I.i < CurFloor   Dir=DOWN))

                                    *(o.0o<#O. (O.o.floor > CurFloor Dir=UP)

*(O.o.floor < CurFloor Dir=DOWN)

*(O.o.floor = CurFloor  Dir=O.o.Dir))

 

            PickupDropoff(Dir)    ((CurFloor,Dir)O  CurFloorI)

                                                *(CurFloor I  InsideReqSet=I-CurFloor)

                                                *((CurFloor,Dir)O

                                                      *(CurFloorI  InsideReqSet=I+GetDests()

                                                                  *OutsideReqSet=O-(CurFloor,Dir))

                                                      *(CurFloor I

                                                            *InsideReqSet=I+GetDests()-CurFloor

                                                            *OutsideReqSet=O-(CurFloor,Dir)))

 

GetDests()                   A user interface function. This function returns the set of buttons just pushed inside the elevator before the elevator door closes.       

     

The variables used in this specification are noted using the keyword, var.  The braced predicates following their declarations represent the precondition that the variables’ initial values must satisfy.  In the line after the variable declarations, those variables that are permitted to change are listed before the colon which is the symbol denoting the start of the specification.  The details of this specification, as well as the full verification of the guarded command code, can be found in [1].  

The original draft of the specification required approximately five hours to write.  Revisions of the specification were required as errors were discovered through a process of test case analysis, as well as during the proof of the guarded command code.  The errors were generally due to the specification being incomplete.  For example, one error resulted from not specifying how the elevator should resume motion after halting.  This error was discovered when considering a test case.  Another error was discovered when the proof led to the generation of an abort statement.  Revising the specification to account for errors required approximately ten additional hours, some of which was used for generating key portions of the proof.  All errors in the specification were discovered before implementing code - that is, the specification never required revision due to errors discovered during actual code testing.  The guarded command code was translated into C++ and no errors were found in the implementation to date.

 


6.  Trace-Based Operational Formal Method

           

The trace-based operational formal method was first introduced in the OOD course and extensively used in the concurrent programs course so it is introduced here.  Only those components of the specification model that are necessary to understand the specification tables that are presented through the remainder of this work are provided in this section.  The detailed components of both the specification and verification model can be found in [12].

The specification of a process/thread/task consists of a collection of the externally visible “action” specifications of the process.  These externally visible actions constitute the process’ externally visible behavior.  An action represents an interaction between this process and the other processes of the concurrent program.  The occurrence of each action is recorded on a trace that is associated with the initiating process.  The specification of an action describes when that action may occur in terms of the current value of the process trace and the effect of the occurrence of this action on the process trace by the addition of an element(s) to the trace sequence

 

6.1 Traces Operations and Functions

 

The variable h is used to represent the process trace which is a zero-based sequence.  This sequence is represented as a list of elements delineated by angle brackets.  A subscript i, e.g., hi, is used to identify the process with which this trace sequence is associated.  Generally, process trace sequences are initialized to the empty sequence .

There are a number of trace functions that simplify the writing of predicates and use descriptive names to identify particular sequence element components instead of the unintuitive sequence component selection syntax, h[i][1].   The trace operations and convenient trace functions are defined below.

 

Notation

Definition

h[k]

kth element of h (zero-based)

h’ ^= h

h’ = h’  ^  h

h’  h

h’ is an initial subsequence of h

hr

reverse of h

H/i

restriction of the elements of h to those elements whose process id component (defined below) is I

Type(h[i])

h[i][0]{ !, ? }      // send or receive

Action(h[i])

h[i][1]    // action component of the ith  element

TAction(h[i])

h[i][0] ^ h[i][1]    // type and action component

PID(h[i])

h[i][2]     // process id component of the ith element

CallR(h[i])

h[i][3]     // process id of the receiver of the ith element

Val(h[i])

h[i][4]     // data component

VSeq(h)

*                                   if  h=< >

VSeq(h-)^Val(hr[0])      otherwise

SVSeq(h,a,b,c)

                                   if  h=< >

SVSeq(h-,a,b,c) ^ hr[0]   if  Type(hr[i])=a   Action(hr[i])=b

            Val(hr[i])=c

SVSeq(h-,a,b,c)              otherwise

 

 6.2  Process Specifications

 

Informally, an individual process execution is represented by a sequence of actions that are recorded on that process’ trace sequence.  Typically, actions are some form of communication: synchronous message passing or shared variable communication.  An action is defined in terms of a change of value of the corresponding process trace.  A specification of an action, which includes the current and extended process trace, will be written in two parts: an enabling and an effect.  The effect specifies the change in value of the process trace by concatenating element(s) to the trace.  The enabling part specifies when this action may occur as a guard to the trace update.

To write the specification of a process, the actions that comprise the behavior of the process are determined.  For each action, all possible enabling conditions and their effect on the process trace are listed.  Tabular notation is used to represent the process specification.   Each table is a visual representation of the potential changes of a process trace.  Specification tables have the following form.

 

Pi

Action

enable

Effect

 

The enable condition is typically written in terms of the last element of the process trace sequence at any point in time, namely hri[0].  Therefore, an individual element listed as the enable condition means that the last element of the current sequence is asserted to be of this form hri[0]=enablej.  Similarly, the effect lists the element(s) added to the process trace sequence for a particular action, namely, hi ^= effectj.  Each process participates in an initial action that initializes a process’s trace sequence.  This action will be omitted from the table when the process trace sequence is initialized to . 

A sequence element either has the form, <!, A, j, k, X> or <?, A, j, X>, where the first component represents the type of communication (either receive (?) or send (!)), A represents the externally visible action, and j names the process with which this element is associated.  The k component names the process that either receives (?) or sends (!) this message, if known.  If the receiver/sender is unknown or at a single, universally known location (e.g. a shared variable), then this field is listed as .  Lastly, X represents the sequence of data sent or received.  If X is a data sequence of length one, then the enclosing angle brackets are omitted.  If X is listed as , then it is the empty data sequence.  To conserve table space, the j component is omitted from the elements when its value is readily apparent.  For example, using the previous specification table, the j component is omitted because it is known to have the value i which is given in the name of the table.

The notation, Pi sat Ti, indicates that Pi’s process trace sequence values satisfy the predicate ti composed of Pi’s specification table entries (Ti) at any point during the execution of Pi.  The predicate ti is generally constructed as follows: j  hri[0]=enablej  hi ^= effectj  where j ranges over the rows of process Pi’s specification table, Ti.

Given the following specification table for a process Pi which receives a value from another process in the program and forwards that value to the process Pk :

 

Pi

Receive

Send

hi =   hri[0]=<!,Send, k, X>

<?, Receive,, X>

 

<?, Receive, , X>

 

<!, Send, k, , X>

 

The corresponding predicate of the specification table is as follows.

    (hi = hri[0]=<!, Send, k, X>)   hi  ^=  <?, Receive, ,X>   

 hri[0] = <?, Receive, ,X>  hi ^= <!, Send, k, X>

 


7.  Formal Analysis of Concurrent Programs

 

Throughout the course, students used the specification model defined in the trace-based operational formal method to capture the formal specification of standard programming problems.  Students wrestled with the complexities of the order of execution of parallel programs and determining the necessary conditions to enforce synchronization without causing deadlock.  Students already had experience with the concepts of nondeterminancy and fairness when studying sequential code, but these concepts were considerably more complex for concurrent programs.  It became blatantly clear that the complexity of specification and especially of verification had substantially increased.  An entire class period could be spent verifying a small problem, however, these sessions were still extremely beneficial in that students could actively participate in the verification process.

Students were evaluated using multiple out of class quizzes; several of which were used to build towards a full problem solution. Due to the complexity of the material, the final examination was also given as a take home assignment. Unfortunately, students were unable to execute many of the problems.  However, we were able to execute any remote procedure call assignment using a recently obtained Ada compiler.  

The following three subsections outline the application of the specification model to the various forms of synchronization found in concurrent systems: remote procedure call, shared variables, and synchronous message passing.

 

7.1  Remote Procedure Call

 

The semantics of a “rendezvous” center on the suspension of the calling process until the call is both accepted and completed.   Afterwards, both processes resume their executions.  In this situation, each action causes the addition of two elements to the process sequence, representing both the acceptance and the completion of the call.

The example chosen for the introduction of this type of synchronization between processes was a bounded queue.  The concurrent program consists of three processes, one that sends data values to the queue (Ps), one that represents the queue data structure (Pq), and one that receives the data values from the queue (Pr).  The maximum size of the queue is max.  The specification of the program must ensure that VSeq(SSeq(hr,!,Get))  VSeq(SSeq(hq,?,Put))  VSeq(SSeq(hs,?,Put)); the items placed in the queue are removed in the same order from the queue.  The specification of the queue process follows.

 

  Pq

Put

Get

R

<?,Put,,x> ^ <!,Put,,>

 

T

 

<?,Get,,> ^ <!,Get,, z >

   where

                        x is any datum 

                        z  =  Val( SSeq(hq,?,Put)[#SSeq(hq,!,Get)-1] )

                        R   #SSeq(hq,!,Put)  -  #SSeq(hq,!,Get)  <  max

                        T   #SSeq(hq,!,Put)  -  #SSeq(hq,!,Get)  >  0

 

The functions defining the value for z determine that the appropriate data item inserted (as defined in the program specification) is returned.  In contrast to the classroom data base example, R ensures that room exists before the action Put is accepted and T ensures that there exists at least one element in the queue before the action Get is accepted.

 

A possible Ada implementation for the queue follows.

            task body queue is

            q: array(0..max-1) of ELEM;

            cnt,inq,outq:INTEGER := 0;

            begin

                        loop

                                    select

                                                when cnt < max =>

                                                            accept Put(c:in ELEM) do

                                                                        q(inq mod max) := c;

                                                            end Put;

                                                            inq := inq+1;

cnt := cnt+1;

                                                or when cnt > 0 =>

                                                            accept Get(c:out ELEM) do

                                                                        c := q(outq mod max);

                                                            end Get;

                                                            outq := outq+1;

                                                            cnt := cnt+1;

                                                or

                                                            terminate;

            end select;

                        end loop;

            end queue;

 

7.2  Shared Variable Communication

           

The semantics of shared variable communication relies on process synchronization by means of the values of shared variables.  The shared variables are used to record information that all processes need to assess the progress made towards solving a particular problem.  Any expression can contain a reference to a shared variable.

This first example outlines the standard readers/writers problem.  The informal specification of the problem states that any number of readers may read at one time when there is no writer but only one writer may write when there are no readers.  The specification includes the use of two shared variables, <r,w>, which represent the number of readers and writers, respectively, and their access is maintained as critical regions by constructs in the implementation language.  In this solution, each process wishing to access the shared resource must first determine how the resource is currently accessed by examining the current values of <r,w> and adjusting those values accordingly if access is allowed.  In these trace elements, the symbol  is used to represent that no process is associated with the shared variables <r,w>.

 

Pi

RWRi

ReadI

Writei

TAction(hri[0])={<?,RWRi>,

<!,Readi>, <!,Writei>}

  hri[0=0

<?,RWRi,,<r,w>>

 

 

hri[0]=<?, RWRi,,<r,w>>

 Val(hri[0])[1] = 0

 

<!, Readi,,<r+1,>> ^

  <?,Readi,,<r,w>> ^

    <!, Readi,,<r-1,>>

 

hri[0]=<?, RWRi,,<r,w>>

 Val(hri[0])[1] 0

 

 

hri[0]=<?,RWRi,,<r,w>>

   Val(hri[0])[0] = 0

*   Val(hri[0])[1] = 0

 

 

<!, Writei,,<,1>> ^

  <?, Writei,,<r,w>> ^

    <!,Writei,,<,0>>

hri[0]=<?,RWRi,,<r,w>>

   Val(hri[0])[0]   0

*  Val(hri[0])[1]  0

 

 

 

 

As long as no writers exist, this process may read the shared resource (first row). When it is finished reading, then the corresponding update is made to the value of r.  If a writer exists, then the process is unable to enter its critical section.  The third row states that when no readers or writers currently exist, this process may write to the shared resource. Once completed, the value of w is updated.  Lastly, if there exists other readers or writers, then the process must wait.

Another interesting shared variable problem is determining the smallest index of a positive value in an array, c[ ].  In this program, one process will check the values of the even indices and the other checks the odd indices; both terminating once the location of the first positive value is found.  Let n = size(c[ ]).  The two shared variables will be paired as <et,ot> in the data component of the element; representing the even and odd indices, respectively.  The specification for the program is as follows.

 et, ot.  0  min(et,ot) < n.

(j. 0  j < min(et,ot). ( c[j] < 0 ) min(et,ot) < n  ==> c[min(et,ot)] > 0)

 

Pet

Initet

Contet

Poset

<!,Initet,,<n,>>

 

 

R

 

<?,Contet,,<et,ot>>

 

T

 

 

<!,Poset,,<i,>>

where

                        i  =  2 * ( #SSeq(het, ?, Contet) - 1 )

                        R    #het > 1    Type( hret[0] ) = ?   i < min( Val( hret[0] ) )

                        T    Type( hret[0] ) = ?   c[i] > 0

 

 

Pot

Initot

Contot

Posot

<!,Initot, ,<, n>>

 

 

R

 

<?,Contot,,<et,ot>>

 

T

 

 

<!,Posot,,<, j>>

where

                        j  =  2 *  #SSeq(hot, ?, Contot) - 1

                        R    #hot > 1    Type( hrot[0] ) = ?   j < min( Val( hrot[0] ) )

                        T    Type( hrot[0] ) = ?   c[j] > 0

 

In these specifications, i and j represent the first occurrence of a positive array value from examination of the even and odd indices, respectively.  Each process continues searching until either a positive value is found or the end of the array is reached (R).  If a positive value is found, the index of the positive value is written and the search is terminated (T).  The process specifications can be refined into the following guarded command code.

 

Pet ::  et := n;                                                     Pot ::  ot := n;

           i := 0;                                                                 j := 1;

           do  i  <  min(et,ot)                                        do  j  <  min(et,ot)

                        if  c[i] > 0    et := i;                                      if  c[j] > 0    ot := j;

                        []   c[i] <= 0    i := i + 2;                              ||   c[j] <= 0    j := j + 2;

                        fi                                                                      fi

           od                                                                      od

 

7.3  Synchronous Message Passing

 

The presentation of the semantics of synchronous message passing centers on the language communicating sequential processes (CSP) [7].  In CSP, processes that exchange messages name either the sending or receiving processes in their input/output statements.  Processes can suspend execution while waiting for their communication partner to participate in the exchange.  The semantics of this language is further complicated by the notion of distributed termination; processes can terminate if they are suspended at an I/O guard waiting for a message exchange from a process or processes that has/have already terminated.  Therefore, all CSP process specifications will include the action of termination, Term, which is represented in the data component of an element by .  Since the action of termination is communicated to all processes within the program, the component representing the identity of the receiver is .

The first example was chosen to carefully avoid the inclusion of distributed termination.  Students were asked to write the specification of a process that performs the summation of natural numbers. The sum is reported back to the requesting process upon receipt of a negative number.

 

Psum

Receive

Report

Term

hrsum[0]=<?,Receive,Pj,y>y0

 hrsum=0

<?,Receive,Pj,x>

 

 

hrsum[0]=<?,Receive,Pj,y> y<0

 hrsum=0

 

<!,Report, Pj,z>

 

<!, Report, Pj, z>

 

 

<!,Term,,>

where

                        x  and y            any data values

                        z                       i .  0  i  < #hsum .  Type( hsum[i] )=?. Val( hsum[i] )

 

 

The specification was then used to derive the following CSP code.

                                    Psum  ::  sum := 0;

                                                   P ? x;

                                                   do  x > 0  sum := sum + x;

                                                                        P ? x;

                                                   od

                                                   P ! sum 

 

Another interesting problem centers on the rearrangement of the contents of two sets so that one of the resulting two sets contains the smaller elements and the other contains the larger elements.  More formally, the specification of the set partitioning problem is as follows:  Given two disjoint sets, S and T,

ST =    S’T’ =     ( ST  =  S’ T’)     max(S’) < min(T’)

 

Where S’ and T’ represent the two sets after the execution of the set partitioning program terminates.  The algorithm consists of two processes exchanging the largest value from S with the smallest value from T until the same value is exchanged in both directions.  The specifications of the two processes, Ps and Pt, follow.

 

 

Ps

Sends

Receives

Terms

hhhrs[0]

{<?,Receives,Pt,x>}

<!,Sends,Pt,mx>

 

 

<!,Sends,Pt,mx>

 

<?,Receives,Pt,x>

 

R

 

 

<!,Terms,,>

where

                        x                      any datum

                        mx                     max(g(S,hs))

                        R                       #hs > 1    Val( hrs[0] )  mx

                        g(Y,h)              g(Y,h-)             if #h>0  Val( hrs[0] )=

                                                g(Y - {Val(hr[1])} + {Val(hr[0])}, h/<0,…,#h-3>)

                                                            if  #h2  Val(hr[0])

                                                Y                     otherwise

 

 

Pt

Receivet

Sendt

Termt

hhhrt[0]

{<!,Sendt,Ps,mn>}

<?,Receivet,Ps,y>

 

 

<?,Receivet,Ps,y>

 

<!,Sendt,Ps,mn>

 

<?,Termt,Ps,>

 

 

<!,Termt,,>

where

                        y                      any datum

                        mn                    min(w(T,ht))

                        w(X,h)             w(X,h-)            if #h>0  Val( hrt[0] )=

                                                w(X + {Val(hr[1])} – {Val(hr[0])}, h/<0,…,#h-3>)

                                                X                     otherwise

 

The two functions, g(S,hS) and w(T,ht), are used to determine the current values of the two sets S and T.  The process Ps continues to send the maximum set value as long as the last value received is smaller than the current maximum set value (R).  One should note that the process Pt only terminates when it receives the notification that Ps has terminated.  The refinement of the two specifications results in the following CSP code.

 

Ps  ::  mx := max(S);  Pt ! mx;                            Pt  ::  do  Ps ? y     T := T  {y};

           S := S - {mx};  Pt ? x;                                                              mn := min(T);

           S := S  {x}; mx := max(S);                                                   Ps ! mn;

           do  mx > x    Pt ! mx;                                                           T := T - {mn};

                                      S := S - {mx}; Pt ? x;            od

                                      S := S  {x};

                                      mx := max(S)

           od

 


8. Software Engineering Principles and Their Application

 

The formal methods students were challenged to apply their formal analysis skills to a safety-critical industrial application in their senior year. Their first senior course addressed the application of formal analysis during the requirements analysis and requirements definition phase of the software engineering life cycle and their second senior course extensively tested their ability to not only write formal specifications but to verify the resulting design code (written in guarded command language) of an industrial product.

It was important to choose a final senior project that possessed the requirement of being “fail safe” since it is this category of software that necessitates the type of analysis that a formal method provides. The project chosen was a software simulation of the Return To Launch Site (RTLS) scenario of the Command & Control software subsystem of the Space Shuttle for one of NASA’s Independent Verification & Validation centers.  The formal methods students wrote the specifications of the subtasks of this system and derived/verified the resulting guarded command code. The end result of this project consisted of a document that includes over 800 lines of guarded command code, 1200 lines of specification, and 4000 lines of verification proofs.  The formal methods students were able to identify numerous instances of inconsistency and incompleteness within the NASA document.   

The students began this project at the end of the first senior course by analyzing the requirements specification document for the RTLS.  The majority of these specifications were written in English and relied on line indentation to signify groupings of tasks.  The first stage of the project centered on writing first-order logic specifications of the main tasks of the RTLS.  It immediately became obvious that the specifications would be exceptionally lengthy and difficult to write. But despite their length, the specifications were still not sufficiently strong to use the weakest precondition semantics to derive the code.  The majority of these specifications were of the following form:

if  x  > 2, set  y  equal  to  1

 

which would most naturally have as a specification, “ x > 2  y = 1” which is not sufficiently strong to derive the appropriate guarded command code.  This is true because the code, y := 1, is a valid S using wp and this specification as the postcondition; however, the assignment should only be made if x is indeed greater than 2.  To correct this problem, one must write the specification to include additional clauses that capture the state when variables do not change as well as when they do.  The above specification must be written as:

( x > 2   y = 1 )     ( (x >2)   y = Y ),  where  y = Y is the precondition

 

For this reason, the decision was made to abandon program derivation, write the guarded command code independently of the formal specifications and then use verification proofs to show correctness.

Once the guarded command code was generated, the students began to prove this large program by proving statements individually, and then proving increasingly larger compositions of statements, continuing this process until the entire program has been proved.  Students began this approach using the assumption that modularity is desirable: a particular fragment of code might exist in several places and its proof could simply be inserted each time it is needed.  The problem with this approach is that it ignores postconditions.  It is not enough to prove that program S satisfies its own postcondition when S is a component of a larger program.  S must actually satisfy a larger postcondition which includes clauses requiring that prior proofs may not be invalidated.  Therefore, the students found they could reduce the number of proofs by starting with the program’s postcondition when proving the individual program statements.

 


9. Empirical Results of the Educational Experiment

 

            The group of students that completed the experimental curriculum consisted of eight seniors from an original group of fifteen.  The original group volunteered from a pool of sixty-five freshmen. Twenty-five percent of the completing students were women, which is a higher percentage than for our majors. In the first two years, four students reluctantly abandoned the experimental curriculum in order to co-op or continue their studies abroad which prevented the student from taking all six of the required, consecutive courses. Two students found the time commitment of learning the additional material too great.  Only one student left the program due to a dislike of the material.

These students possessed no more training or education than our general undergraduate major student, but they did volunteer to strengthen their complex problem solving skills.  A comparison of the group’s average ACT score with the remaining major’s average ACT score showed that the two populations were roughly equivalent (difference of only 1%).  It was particularly important that these two groups be equivalent in ability since the experimental group could not be randomly chosen from a large population. 

The formal methods students performed an average of sixteen percent higher on examinations in the Data Structures course, an average of eight percent higher in the OOD course, and an average of four percent higher in the first of the two senior courses.  In all cases, the two groups performed the same on their final examinations. 

 A detailed, statistical comparison was made between a programming assignment given to both the formal methods and our general students during their OOD class and it appears in  [1].  It is, however, worth repeating that all of the formal methods student’s projects passed all test cases and only five of the eleven (45.5%) non-formal method student’s projects passed all test cases.   

Clinical interviews attempted to capture a student’s problem solving process by recording the verbalization of a student’s attempt at solving a posed problem.  In clinical interviews, most students in both groups were able to give solutions to known problems. However, the general students tended to give solutions in terms of code statements and used general terms incorrectly. Approximately one quarter of the general students were able to solve unknown problems whereas almost all the formal method students were able to give general, conceptual solutions to the same set of unknown problems.

A general analytic exam was given to all senior majors in order to compare the two group’s analytic skills when solving computer science problems.  The questions were randomly selected from sample test questions in the analytic portion of the GRE examination. The test was administered anonymously on a single day in the fourth year classes.  Participating students were cautioned to take the exam only once and to not discuss its contents.  The formal methods students answered thirty-six percent more questions correctly than the general senior major.

 


10. Summary

 

An overview of the materials taught in an experimental curriculum that emphasized the application of formal analysis was presented.  The goals of the experiment were met; namely, it is possible to teach formal methods to undergraduates and the skill of formal analysis benefits the student by increasing their complex problem solving ability.  The approach, materials, and experiences are presented here so that other academicians are given a basis for generating their own courses emphasizing formal analysis. 

As the reader may have already concluded, this self-selected group came to this experiment of their own free will, which suggests they had some interest in first-order logic.  Even these students who were inherently motivated had to maintain high levels of dedication to master this material.  It became clear after the Data Structures course that the students needed additional motivation from the instructor to see why this path was chosen to learn formal analysis skills.

The instructor must face at least two challenges when teaching this type of material.  The first centers on a student’s eagerness to apply his knowledge to “real-world” problems. It is difficult to maintain motivation when students are working on specifying their nth relatively small, independent function.  Their eagerness led to the selection of the Space Shuttle problem, which gave students a virtually infinite amount of motivation (as evidenced by the creation of a roughly five hundred page document containing thousands of lines of analysis). 

The other challenge to the instructor is common to the teaching of any new subject; few, if any, textbooks exist. Instructors must generate substantial portions of the material presented.  This activity not only requires a large time commitment, but leads to an untried delivery of the material. It is important to periodically remind students that exploring new and relatively unpublished subject matter bears the necessary burden of unpolished lectures.

The following is a quotation taken from surveys given to the formal methods students several months after graduation. It appears here as motivation and encouragement for other academicians to embrace the challenge of enriching their student’s educational experience.

Facts and figures may never be of use in real life and forgotten quickly, but the ability to analyze difficult problems will be applicable throughout life and always be remembered.”

 


11. References

 

[1]  Clarkson, M.R. and A.E.K. Sobel. “Formal Methods Application:  An Empirical Tale of Software Development”, conditionally accepted to IEEE Trans. On Software Engineering, 2000.

 

[2]  Cohen, E., Programming in the 1990s:  An Introduction to the Calculation of Programs.  Springer-Verlag, 1990.

 

[3]   Dijkstra, E.W., A Discipline of Programming, Prentice Hall, 1976.

 

[4]  Dijkstra, E.W. and C.S. Skholten, Predicate Calculus and Program Semantics, Springer-Verlag, 1990.

 

[5]  Garlan, D., “Making Formal Methods Education Effective for Professional Software Engineers”, Information and Software Technology, Elsevier Science, Vol. 37, No. 5-6, May 1995, pp. 261-268.

 

[6] Gries, D., “The Need for Education In Useful Formal Logic”, in “An Invitation to Formal Methods”, Computer, Vol. 29, No. 4, April 1996, pp. 29-30.

 

[7] Hoare, C.A.R., “Communicating Sequential Processes”, CACM, Vol.  21, No. 8, 1978.

 

[8] Lutz, M., “Formal Methods and the Engineering Paradigm”, Proceedings of the Conference on Software Engineering Education”, Springer-Verlag, October 1992, pp. 121-130.

 

[9] Palmer, T.V. and J.C. Pleasant, "Attitudes Toward the Teaching of Formal Methods of Software Development in the Undergraduate Computer Science Curriculum:  A Survey", SIGCSE Bulletin , Vol. 27, No. 3, September 1995, pp. 53-59.

 

 [10] Parnas, D.L., “'Formal Methods' Technology Transfer Will Fail”, Journal of Systems and Software, Vol. 40, No. 3, March 1998, pp. 195-198.

 

 [11] Saiedian, H., “Towards More Formalism in Software Engineering Education”, SIGCSE Bulletin, Vol. 25, No. 1, March 1993, pp. 193-197.

 

 [12] Sobel, A.E.K., “The Model and Methodology of a Traced-Based Operational Formal Method”, submitted to IEEE Trans. On Software Engineering, 1999.

 



[1] CDA-952227 from the National Science Foundation’s CISE Educational Innovation Program


Return to Table of Contents