Pavel Panchekha

By

Share under CC-BY-SA.

A Software Verification Project

I just finished teaching Utah's Software Verification course, and I've been thinking how to improve the final project to be more straightforward and informative. This draft is something I might assign in the future; it is intended to be completed over multiple weeks, with regular check-ins to make sure students are on track.

Overview

This assignment involves writing a verifier for a simple programming language. The verifier converts code to verification conditions, uses Z3 to test those verification conditions, and then outputs either a counterexample if necessary. The ultimate goal is to verify Quicksort, so the verification condition uses the theories of booleans, equality, integers, and arrays (including the array property fragment). The class covers solvers for all of those theories, but this assignment uses Z3 as a black box.

The language is a subset of Python. This has two benefits. First, programs can always be run with the normal Python interpreter, so you don't try to verify programs that are wrong. Second, Python's ast module provides a complete Python parser; experience shows that parsing is a major struggle for students.

In particular, our Verified Python will allow the subset of Python where:

  • Each module starts with from verified import *. These define some constants so that Verified Python programs are runnable as normal Python.
  • Modules contain a sequence of function definitions and then a "main block", which is the usual __name__ conditional.
  • The main block is exactly three lines, verbatim. The first line is import sys. The second line is values = list(map(int, sys.argv[1:])). The third line is print(main(values)). In other words, a Verified Python program takes a list of integers on the command line, runs a main function, and prints the results. This is pretty limited but enough to test, say, sorting algorithms. This main block is the only use of slicing, sys, attributes, map, and so on. Since this block is always the same, it doesn't need to be type-checked.
  • Function definitions annotate types for every variable, those types being int, bool, and list[int]. No other types are allowed anywhere in Verified Python, except of course the main block.
  • The user may define any number of functions, but they definitely need to define one function named main with signature list[int] -> list[int].
  • Function bodies contain a sequence of statements. Only the last statement may be a return, and that statement can't be inside a condiitonal or a loop or anything like that.
  • The first line of a function may be a call to PRE. The last line may be a call to POST. Only first line of a while loop may be a call to INV. Only the last line of a while loop may be a call to DEC. If PRE, POST, or INV are omitted, they are implicitly True. If DEC is omitted it is implicitly 0.
  • Other allowed statements are assignments, while loops, if statements, and function calls.
  • Function calls can only use positional arguments, and never use *args.
  • Built-in functions are those defined in verified module plus len. Calls to print are allowed as statements, but not as expressions. Though the main block uses other operations those operations are not legal anywhere else in the program.
  • Boolean operators are and, or, and not, and arithmetic comparisons
  • Arithmetic operators are addition, subtraction, negation, and multiplication
  • Array operators are subscripting (by a single integer: no slicing), element assignment, and the len function
  • Boolean constants and numeric constants are supported
  • Inside PRE, POST, and INV, two more constructs are allowed: <= comparison on booleans (it means implication) and the FORALL(lambda x: ...) construct, which quantifies over integers. Additionally, inside POST the RETURN constant is allowed, which represents the return value of the function.
  • Conditions in while loops are further restricted, in that they may only be a single arithmetic comparison on constants and variables.
  • No expressions besides the above, variables, and literals are allowed.

The verified module contains:

from typing import *

def PRE(cond: bool): pass
def POST(cond: bool): pass
def INV(cond: bool): pass
def DEC(val: int): pass
RETURN = None

def FORALL(fn: Callable[[int], bool]):
    return True

Since PRE, POST, INV, DEC, and FORALL do nothing, it is safe to run a Verified Python program in ordinary Python. However, these functions are identified by Verified Python and used to generate verification conditions.

Here's an example function in Verified Function; it computes the square of a number in a very inefficient way:

from typing import *
from verified import *

def square(x : int) -> int:
    PRE(x >= 0)
    s = 0
    i = 0
    while i < x:
        INV(x >= 0 and 0 <= i and i <= x and s == i * x)
        j = 0
        while j < x:
            INV(x >= 0 and 0 <= j and j <= x and s == i * x + j)
            s = s + 1
            j = j + 1
            DEC(x - j)
        i = i + 1
        DEC(x - i)
    return s
    POST(RETURN == x*x)

def abs(x : int) -> int:
    y = x
    if x <= 0:
        y = -x
    return y
    POST((RETURN == x or RETURN == -x) and RETURN >= 0)

def main(values : list[int]) -> list[int]:
    PRE(len(x) >= 1)
    x = values[0]
    xx = square(abs(x))
    return xx
    POST(RETURN == values[0] * values[0])

if __name__ == "__main__":
    import sys
    values = list(map(int, sys.argv[1:]))
    print(main(values))

Tasks

Here's a step-by-step list of tasks that builds up Verified Python.

Start verify.py. The first command-line argument to verify.py should name a file. Open that file, use the ast module to parse it, and print an error naming the file if parsing fails.

After parsing, check that it strictly follows the syntactic restrictions laid out above. If it does not, print an error. Use the lineno and col_offset fields on AST nodes to indicate the problem.

Check function bodies to make sure that none assign to PRE, POST, or INV, or to list or int or bool (used for argument types), or to len. That would lead to weird and tricky code. Also check that no user functions are called any of those names, or map, list, and int, which are used in the main block. Also check that there is a function named main.

Check to ensure that all variables are defined before they are used, in the same function. We will have no global variables; every variable will have function scope. Check that all function calls are to functions defined in the file (or are built-ins). Also check that no variable (either assigned or used) or function names begin with an underscore. We'll use underscore variables for temporaries. Test that the variable bound by a FORALL does not shadow any existing variables. Test that the POST condition doesn't refer to any local variables, only to function arguments and RETURN.

Check to make sure PRE, POST, INV, DEC, and print are never called except as a statement. Check that FORALL is never called, and RETURN is never used, except inside PRE, POST, or INV.

After all checks, use compile to compile the AST and run it. Delete the first command-line argument, so that the code, when run, accesses the second and onward command line arguments, like it would be if run directly by Python. You can test later steps by running files under both Verified Python and normal Python.

Modify expressions to eliminate nested subexpressions. Assignments, function calls, and conditionals contain expressions that need to be flattened. Pay attention to correctly flattening and and or to preserve their short-circuiting behavior. while conditions do not need to be flattened. To flatten an expression, extract subexpressions into a new variable, named beginning with an underscores. Skip calls to PRE, POST, INV, and DEC. The behavior of programs should not change.

Implement a type checker. For assignments, it should infer the type of the variable based on the assigned value. For function calls, it should check that arguments are valid against the function's annotation. Do not type check the main block. Allow print to be called with any type (it is for debugging).

Write a function to convert propositions, like the contents of PRE and POST-conditions and INV-ariants, to Z3 propositions. Be careful with FORALL.

Generate verification conditions for function bodies containing assignments. Write a function that takes a function definition and returns a one-element list of Z3 propositions, with that one element correponding to P → WP(Q).

For each function, generate its verification conditions, negate each one, and pass it to Z3. If Z3 returns sat, extract Z3's model and print each variable's value, skipping temporary variables. Also identify the function that failed.

Verify an "add1" function that computes one more than its input.

Add support for print statements. They should not do anything to verification conditions.

Generate verification conditions for conditionals.

Verify the absolute-value function.

Generate verification conditions for loops. Generally speaking, a function with a single loop should have three verification conditions: one for "entry", one for the "step" case, and one for "exit". The "step" and "exit" conditions need to be universally quantified over the state. When verifying a loop returns sat, identify which of its three verification conditions failed in the error message. Also print the line number of the loop.

Write a counter, which starts at 0 and increments until it hits an argument n, and verify that it returns n.

Test the squaring function. Nested loops are tricky.

Add support for measures with DEC. The measure expression should be an integer that starts, stays, and ends non-negative. Each iteration of the loop should decrease it. Add a termination verification condition to each loop.

Allow function calls. Generate verification conditions for function calls using those functions' pre- and post-conditions. Continue to treat calls to PRE, POST, INV, DEC, and print specially.

For recursive calls, check that the first argument to the function is an integer and that it decreases in recursive calls.

Check that no functions are mutually recursive. That is, build the call graph, and check that the only loops are direct recursion.

Add support for arrays and the len function, specifically the array created by in the main block. Arrays are tricky because they correspond to two Z3 variables: one for the array and one for its length. Z3 arrays are infinite-length and do not have lengths. The main function may have a PRE-condition specifying some kind of constraint on the command-line input; that's valid but won't be enforced by Verified Python.

Add support for array subscripting. Add verification conditions to check that subscripts are in bounds.

Add support for modifying array elements. Treat the assignment a[i] = v as a = store(a, i, v).

Check that arrays are are never assigned to variables, so that there is only one reference to each array. Check that no array is passed twice to a function. There are no nested arrays.

Modify the verification conditions for functions to take into account that any arrays passed to the function may have been modified. To do so, universally quantify over those arrays in the exit half of the verification condition.

Write and verify a linear search function.

Write and verify quicksort.