A Software Verification Project
I just finished teaching Utah's Software Verification course, and I've been thinking how to improve the final project to be more straightforward and informative. This draft is something I might assign in the future; it is intended to be completed over multiple weeks, with regular check-ins to make sure students are on track.
Overview
This assignment involves writing a verifier for a simple programming language. The verifier converts code to verification conditions, uses Z3 to test those verification conditions, and then outputs either a counterexample if necessary. The ultimate goal is to verify Quicksort, so the verification condition uses the theories of booleans, equality, integers, and arrays (including the array property fragment). The class covers solvers for all of those theories, but this assignment uses Z3 as a black box.
The language is a subset of Python. This has two benefits. First,
programs can always be run with the normal Python interpreter, so you
don't try to verify programs that are wrong. Second, Python's ast
module provides a complete Python parser; experience shows that
parsing is a major struggle for students.
In particular, our Verified Python will allow the subset of Python where:
- Each module starts with
from typing import *
followed byfrom verified import *
. - Modules contain a sequence of function definitions and then a "main
block", which is the usual
__name__
conditional. - The main block must begin with
import sys
and thenvalues = list(map(int, sys.argv[1:]))
, with no other uses ofsys
. This takes command line arguments, but solely in the form of a list of integers. - Function definitions annotate types for every variable, those types
being
int
,float
, andList[int]
. No other types are allowed. - Function bodies and the main block contain a sequence of statements. Functions do not have return values.
- Only the first line of a function may be a call to
PRE
. Only the last line may be a call toPOST
. Only first line of a while loop may be a call toINV
. - Other allowed statements are assignments, while loops, if statements, and function calls.
- Function calls can only use positional arguments, and never use
*args
. - Built-in functions are those defined in the
typing
andverified
module;len
; andprint
. Though the main block callslist
,map
, andint
, plus slicing, those cannot be used elsewhere in the program. - Boolean operators are and, or, and not, and arithmetic comparisons
- Arithmetic operators are addition, subtraction, negation, and multiplication
- Array operators are subscripting (by a single integer: no slicing), element assignment, and the
len
function - Boolean constants and numeric constants are supported
- Lambda expressions are allowed, but only inside
FORALL
calls. - No expressions besides the above, variables, and literals are allowed.
- Conditions in
while
loops are further restricted, in that they may only be an arithmetic comparison on constants and variables.
The verified
module contains:
from typing import * def PRE(cond: bool): pass def POST(cond: bool): pass def INV(cond: bool): pass def DEC(val: int): pass def FORALL(fn: Callable[[int], bool]): return True def IMP(conds: List[bool]): a = conds[:-1] b = conds[-1] return not all(a) or b
Since PRE
, POST
, and INV
, and FORALL
do nothing, it is safe to run a
Verified Python program in ordinary Python. However, these functions
are identified by Verified Python and used to generate verification
conditions.
Tasks
Here's a step-by-step list of tasks that builds up Verified Python.
Start verify.py
. The first command-line argument to verify.py
should
name a file. Open that file, use the ast
module to parse it, and print
an error naming the file if parsing fails.
After parsing, check that it strictly follows the restrictions laid
out above. If it does not, print an error. Use the lineno
and
col_offset
fields on AST nodes to indicate the problem.
Check function bodies to make sure that none assign to PRE
, POST
, or
INV
, or to List
(used for annotating argument types). That would lead
to weird and tricky code. Also check that the functions aren't called
that. Do the same for map
, list
, and int
, which are used in the main block.
Check to ensure that all variable used are defined earlier in that same function. We will have no global variables. Check that all function calls are to functions defined in the file (or are built-ins). Also check that no variable (either assigned or used) or function names begin with an underscore. We'll use them for temporary variables.
Check to make sure PRE
, POST
, and INV
are never called except as a
statement. Check that IMP
and FORALL
are never called except inside
PRE
, POST
, or INV
.
After all checks, use compile
to compile the AST and run it. Delete
the first command-line argument, so that the code, when run, accesses
the second and onward command line arguments, like it would be if run
directly by Python
. You can test later steps by running files under
both Verified Python and normal Python.
Modify expressions to eliminate nested subexpressions. Assignments,
function calls, and conditionals contain expressions (while
loops do
as well, but those expressions do not allow nested subexpressions).
For each of those, extract subexpressions into a new variable, named
beginning with an underscores. Skip calls to PRE
, POST
, and INV
. The
behavior of programs should not change.
Implement a type checker. For assignments, it should infer the type of the variable based on the assigned value. For function calls, it should check that arguments are valid against the function's annotation.
Temporarily add a check that programs do not use conditionals, loops,
function calls (other than PRE
, POST
, and print
) or array operations
(subscripts, assignments, len
, or literals). Those features are harder
and we'll get to them later.
Write a function to convert propositions, like the contents of PRE
and
POST
-conditions, to Z3 propositions.
Generate verification conditions for function bodies, which at this
point. Write a function that takes a function definition and returns a
one-element list of Z3 propositions, with that one element
correponding to P → WP(Q)
. For each function, generate its
verification conditions, negate each one, and pass it to Z3. If Z3
returns sat
, print an error. Treat print
calls like pass
statements.
In case of failure, extract Z3's model and print each variable's value, skipping temporary variables. Also identify the function that failed.
Verify a function that computes one more than its input.
Generate verification conditions for conditionals.
Write an absolute-value function and verify that it never returns negative values.
Allow loops and INV
calls again. Generate verification conditions for
loops. Generally speaking, a function with a single loop should have
three verification conditions: one for "entry", which continues
propagating backwards through the function; and then "step" and "exit"
verification conditions, which do not propagate backwards, because
they should be universally quantified over all possible variable
states.
When verifying a loop returns sat
, identify which of its three
verification conditions failed in the error message.
Write a counter, which starts at 0 and increments until it hits an
argument n
, and verify that it returns n
.
Add support for measures with DEC
. The measure expression should be an
integer that starts, stays, and ends non-negative. Each iteration of
the loop should decrease it.
Allow function calls again. Generate verification conditions for
function calls using those functions' pre- and post-conditions.
Continue to treat calls to PRE
, POST
, INV
, DEC
, and print
specially.
For recursive calls, check that the first argument to the function is an integer and that it decreases in recursive calls.
Check that no functions are mutually recursive. That is, build the call graph, and check that it has no loops other than self-loops.
Add support for arrays and the len
function, specifically the array
created by in the main block. Arrays are tricky because they
correspond to two Z3 variables: one for the array and one for its
length. Z3 arrays are infinite-length and do not have lengths.
Add support for array subscripting. Add verification conditions to check that subscripts are in bounds.
Add support for modifying array elements. Treat the assignment a[i] =
v
as a = store(a, i, v)
.
Check that arrays are are never assigned to variables, so that there is only one reference to each array. Check that no array is passed twice to a function.
Modify the verification conditions for functions to quantify take into account that any arrays passed to the function may have been modified. To do so, universally quantify over those arrays in the exit half of the verification condition.
Write and verify a linear search function.
Write and verify quicksort.