# Sub-specifications

My work on formally specifying web page layouts often has people
asking how web page layouts—which after all are esthetic as much as
they are functional—could possibly follow a specification in pure
logic. I understand the confusion: we too often think of a program as
having an *end-to-end* specification, and that the point of verification
is to allow us to treat the program as a black box, secure in the
understanding that it follows the spec. That is an unusual special
case. Most programs have not one specification, but many
specifications on different levels.

## Implementing Fibonacci

Consider a simple example: computing the Fibonacci numbers. The specification for this function is pretty simple:

Inductive Fib : nat -> nat -> Prop := | fib0 : Fib 0 1 | fib1 : Fib 1 1 | fib_plus : Fib n x -> Fib (S n) y -> Fib (S (S n)) (x + y)

Then you go ahead and implement some function `fib`

and prove that it
works by proving `Fib n (fib n)`

.^{1} [^{1} Maybe you also go and prove some
facts about the spec, like making sure that `Fib`

is a functional
relation.] This works,^{2} [^{2} The induction principle might be a little
tricky.] and you can now ignore `fib`

and reason directly about the `Fib`

specification.

But consider that there are lots of ways to implement the Fibonacci function: plain recursion, memoization, an array, a two-variable loop, the matrix method, the implicit matrix method, or a number field method. The specification I proved above tells us that we correctly implemented the Fibonacci function, but it doesn't tell us that we correctly implemented, say, the implicit matrix method.

What can be confusing here is that the implicit matrix method
satisfies the same end-to-end specification as even the naive
Fibonacci function.^{3} [^{3} It has different asymptotics, but this isn't
enough to distinguish, say, the implicit matrix from the number field
methods.] Instead, it has a different specification at a different
layer: an implementation of implicit matrix Fibonacci has a data
structure for representing implicit matrices (the representation can
be verified), for adding and multiplying them (the operations can be
verified), for implementing fast exponentiation (perhaps verified
against the specification for multiplication), and so on. And of
course you would also need to prove that the matrices chosen correctly
implement the Fibonacci function.

These intermediate specifications are probably not accessible to callers of the Fibonacci function, but they are accessible to different parts of the Fibonacci implementation. In fact this implicit matrix Fibonacci function has a rich internal structure, with a lower level implementing implicit matrices and a higher level implementing Fibonacci, and the lower level ought to be opaque to the higher level.

## Designer choices

Given this richly-structured set of specifications, verification can
offer many tools even when it cannot do anything for the highest-level
goals like attractiveness: lower-level specifications can often be
formalized even the high-level ones can't. A good example of this is
the Neutrons project at UW,^{4} [^{4} Done by my very talented colleagues
Stuart Pernsteiner, now at Galois, and Calvin Loncaric, now at Oracle,
with a lot of the UW PLSE faculty advising them.] the goal of which
was to formally verify the software that control's UW Medicine's
neutron therapy machine. No one knows enough about the world to prove
that the machine cures cancer,^{5} [^{5} Let alone “often enough”.] but it
is definitely possible to prove that the beam turns on and off when
it's supposed to, and that's definitely part of the plan the neutron
therapy machine's designers developed for how the machine ought to
work. Neutrons combines multiple formally-verified static analysis
tools with a very domain-generic Alloy framework for combining these
tools together. Here Alloy plays the role of allowing less-formal
reasoning where each sub-specification an be proven not only using
different tools but using different models of the world and different
assumptions.

In web page layouts, the same pattern occurs. The overall goal of the page involves things like attractiveness, usability, certain functionality, and accessibility. This is the high-level, end-to-end specification—unfortunately, it is, as far as we know now, unformalizable. To meet that specification, the designer makes certain decisions, like the choice to have a two-column layout with toolbars and a footer. These decisions aren't yet a program: they are a new, more refined specification. To implement them, there are yet further lower-level specifications: the footer has links on the left in three columns and the address and contact info on the right; there are three toolbars, corresponding to "store-wide", "department-wide", and "item-specific" actions; and so on. These levels go lower and lower, until you have specifications like "enticing color" and a concrete implementation like a particular shade of blue from a color palette. Again, even when the highest-level decisions are difficult to review or understand, the lower-level ones can be verified, and mistakes often caught.

## Sub-specifications in synthesis

A useful perspective is to think of writing a program (whether
manually or using program synthesis) as the task of developing finer
and finer sub-specifications, until the specifications get as granular
as "increment `x`

". This perspective explains how even large programs
can be synthesized by humans, even though there are a near-infinite
number of even 100-line programs, let alone the million-line
monstrosities frequently written by teams: instead of writing a
million lines of code at once, people write a dozen lines of
specification at a time, thousands of times.

This idea is being pursued more and more in automated program synthesis as well. N. Polikarpova and I. Sergey's recent SuSLik paper is a good example: to synthesize mutating, heap-manipulating programs, SuSLik starts with a high-level specification for the behavior of the program, and then transforms the specification using a variety of rules. Some of those rules add code to the program; others don't. N. Polikarpova's earlier Synquid project did something similar, except instead of Separation Logic for specifying heap-manipulating imperative code, it used Refinement Types for specifying functional programs.

## Synthesis as planning

I've been reading classic AI papers that separate planning into “plans as execution” and “plans as communication”. When you think of plans as execution, you think about plans as programs, and using a plan as automatic. Then a plan has to have a complete model of the outside world in it, in order to respond to the complexity of the outside world, and a plan is usually very large to account for the unpredictability of the world. When you think instead of plans as communication, you stop believing that executing a plan must be purely mechanical. Since the executor is allowed to improvise and use domain-specific knowledge, the plan itself can be simpler and shorter, in an asymptotic sense.

As I read these papers, more and more I am thinking about these
synthesis tools not as ways of synthesizing programs, but as ways of
synthesizing *useful sub-specifications*. These sub-specifications are
not executable—they might require further synthesis—but they
communicate important information about how to accomplish a task and
how to organize information. I try to imagine: what if the synthesis
task was interrupted half-way through. What information could it "hand
off" to a human, so as to pass along as much useful information as
possible.

## Useful communication

Now, what *useful* means there is tricky. It is always possible to
refine a specification into “do the thing, then do nothing”. In
purely-enumerative synthesis approaches the only measure of usefulness
might be implicit, with useless specifications just failing to
eventually produce a program. But in a more directed tool like SuSLik,
there are a variety of ways to tell that a specification is more
useful, like if it operates on sub-heaps, or less useful, like if the
input heap provably does not the information necessary to produce the
output heap. It might be possible to have SuSLik return, even when it
fails to synthesize a program, a set of sub-specifications that it
believes are possible to satisfy, for a human to fill in.^{6} [^{6} Note
that this is the opposite of *sketching*, where the human decomposes the
high-level specification into low-level specifications and leaves the
computer to do the final bit of synthesis.]

I'd like to spend some more time thinking about how to tell whether a sub-specification has usefully decomposed the parent specification, or whether it has simply hidden the problem elsewhere. It's tricky. For one, there has to be some notion of termination: if the sub-specification requires solving the overall problem, it probably didn't help (though what if it established a useful precondition?). Then, there's some notion of well-foundedness: you can't turn a specification about lists of length \(n\) into special cases for length 1, length 2, and so on forever (but it can be helpful to have a few). There is a informational challenge: each sub-specification has to have enough information in its precondition to establish its post-condition. What else is there?

I think there is also an interesting challenge in how to represent specifications. Even when high-level specifications are concise, lower-level specifications often include steps like “come up with a data structure to represent X”. Those steps are hard to represent in simple logics.

## Footnotes:

^{1}

Maybe you also go and prove some
facts about the spec, like making sure that `Fib`

is a functional
relation.

^{2}

The induction principle might be a little tricky.

^{3}

It has different asymptotics, but this isn't enough to distinguish, say, the implicit matrix from the number field methods.

^{4}

Done by my very talented colleagues Stuart Pernsteiner, now at Galois, and Calvin Loncaric, now at Oracle, with a lot of the UW PLSE faculty advising them.

^{5}

Let alone “often enough”.

^{6}

Note
that this is the opposite of *sketching*, where the human decomposes the
high-level specification into low-level specifications and leaves the
computer to do the final bit of synthesis.