How Cassius Works

In Cassius, I am making it possible to mechanically reason about the layout of a web page. It's been a lot of work! The hardest part has been writing a logical specification for most of the CSS rendering algorithm; in other words, the hardest part is the part where I write a web browser. But there are some core ideas in Cassius separate from the CSS specification, and to make those clear in this post I'll write a toy version of Cassius from scratch.

Toy CSS

To make toy Cassius somewhat shorter than the 7.4k lines of code in the real thing, I'm going to build my toy Cassius to render toy web pages that use toy CSS. In particular, web pages will look like this:

(elt ()
     (elt ([width 5])
          (square ()))
     (elt ([height 2])
          (elt ([height 3]))))

There are two types of elements here: elt and square, and both have a list of associated properties. Both are laid out vertically, one after another. An elt is as wide as its parent unless its width is set, and as tall as the sum of its children's heights, unless its height is set. A square, on the other hand, is always the width of its parent and is always square, so that its height is equal to its width. The document above, for example, would have a height of 7.

This is a pretty spare toy version of web page rendering, but it keeps a few features that I think are essential. First, it uses a tree for layout, and the rendering algorithm is local.¹ [¹ In toy CSS it is truly local, in real CSS it is only kind-of local, but the point mostly stands.] Trees are tricky for equation solvers, which Cassius uses extensively, so this drives some of Cassius's techniques. Second, the rendering of a page depends on the "browser width", or the default width of the root element. This means that a web page's appearance is a function of some variable, and makes verification an interesting question.² [² Real CSS has many more parameters, like the default font, the default font size, the browser height, the user's color preferences, as well as things like the browser being used and which resources from other sites are available.] Third, there is a dependence of height on width that goes through child elements. So that width input parameter affects the rendering in significant ways.³ [³ In real CSS, width affects a lot more, thanks to text that breaks across lines, floating boxes, and other things like that.]

Note that I am dropping cascading,⁴ [⁴ And style sheets too, by the way.] almost all CSS properties, and even the second dimension, since there's no way to move horizontally across the page. Still, the essential elements of Cassius are here.

Implementing Toy CSS

Before implementing Toy Cassius, let's start by implementing the analog of a web browser for Toy CSS; that is, let's implement a program to find the rendered size and location of every element for some particular browser width. I'll be doing this in Racket, and I'll be using a few common Racket utilities without explaining them.

Rendering a web page means converting a document:

(define-pattern document?
  `(elt ([,(? symbol?) ,(? number?)] ...) ,(? document?) ...)
  `(square ()))

into a rendering:

(define-pattern rendering?
  `([,(? number?) by ,(? number?) at ,(? number?)]
    ,(? rendering?) ...))

These define something like types in Racket.⁵ [⁵ They're called “contracts” and they're checked at runtime.] For documents, I'm not leaving room for toy CSS properties or for children on squares. For renderings, I have a width and a height (that's the ,(? number?) by ,(? number?) bit) and also a y-position (that's at ,(? number?)). I don't have an x-position since Toy CSS has no way to move left or right.

To turn a document into a rendering, we need to know the browser width:⁶ [⁶ Here the <...> is not part of the code but something I'm leaving as I start to fill in the function.]

(define/contract (layout doc #:width [screen-width 1000])
  (-> document? #:width number? rendering?)

  `(<...>))

Now the document is a tree, so layout is going to be recursive. We'll keep track of the current y position y and the current width w as we go down the tree:

(let loop ([elt doc] [y 0] [w screen-width])
  (match elt
    [`(square ())
     <...>]
    [`(elt ([,props ,values] ...) ,children ...)
     <...>]))

Let's render squares first:

[`(square ())
 `([,w by ,w at ,y])]

Remember that w is the parent's width and y is the current y position. Squares don't have any children so there aren't any recursive calls in this variant.

Elements are tougher. First of all, they have toy CSS properties we'll need to extract:

[`(elt ([,props ,vals] ...) ,children ...)
 (define (prop name default) (dict-ref (map cons props vals) name default))
 (define w* (prop 'width w))
 <...>]

Now, before computing the height of this element, we need to render all of its sub-elements and add up their heights. This is a little tricky since it involves tracking the current y-position. I use Racket's for/fold construct:

(define-values (final-y subrenderings)
  (for/fold ([y* y] [subrenders '()]) ([child (in-list children)])
    (define subrendering (loop child y* w*))
    (match-define `(,w-child by ,h-child at ,y-child) (first subrendering))
    (values
     (+ y* h-child)
     (cons subrendering subrenderings))))
<...>

There's a lot going on here, if you're not familiar with Racket, but the short of it is that for each child element ([child (in-list children)]) we recursively render it ((loop child y* w*)) using the current y-position and width. That gives us a whole tree of rendered elements, from which we take the first and extract its h-child, which we add to the current y-position y*. We also add (cons) the rendering of the child onto a list of renderings in general.

Now that we've rendered our element's children, we can produce its rendering:

(define h* (prop 'height (- final-y y)))
`([,w* by ,h* at ,y] ,@(reverse subrenderings))

This layout function can now render the web page from before:

(layout test-page #:width 10) ⇝
  '([10 by 7 at 0]
    ([5 by 5 at 0]
     ([5 by 5 at 0]))
    ([10 by 2 at 5]
     ([10 by 3 at 5])))

This is a rendering of the example web page above and, as expected, it is height 7.

Embedding into SMT

Cassius, and therefore toy Cassius exists to encode that rendering algorithm into an SMT formula. But before we can encode the rendering algorithm proper, we need to set up a framework for representing render trees in SMT.

(define/contract (layout->z3 doc)
  (-> document? (listof smt?))
  <...>)

The core data type in our encoding of toy CSS to SMT is the "box", which is what I'm calling each node in the render tree. Real CSS has different trees of elements and boxes,⁷ [⁷ Obscure fact! A simple example is a <b> that splits over two lines.] but in toy CSS the two trees have the same structure.

A box has a height, a width, and a y position, just like in our layout function; there is also a no-box value, which will come in handy in a moment.

(declare-datatypes () ((Box no-box (box (id Int) (w Real) (h Real) (y Real)))))

Boxes have an identifier. This identifier is very important! We intend for it to be possible for two boxes to have same same width, height, and position without being the same box. So we must add an identifier so that they can be different. Forgetting the identifier leads to very strange bugs.⁸ [⁸ Roughly, if we have two equal boxes, Z3 decides that they must have equal pointers, which usually results in no-box = (box ...), which is false. Figuring out this bug cost me three hours.]

Now that we have a type for boxes, we can define the boxes we expect in our rendering: ⁹ [⁹ Here, for/reap is a helper function I wrote that runs a for loop and returns a list of all things sow was invoked with, and sformat is string formatting (it replaces ~a by id) but it produces a symbol instead of a string.]

,@(for/reap [sow] ([elt (in-tree doc)] [id (in-naturals)])
    (sow `(declare-const ,(sformat "box~a" id) Box))
    (sow `(assert (is-box ,(sformat "box~a" id))))
    (sow `(assert (= (id ,(sformat "box~a" id)) ,id))))

What this does is: define a box for each element to render to; assert that they are actual boxes, no no-box values; and give each their names. For now though, we've only defined the boxes, not organized them into a tree. To do that, we need to add tree pointers:

(declare-fun parent (Box) Box)
(declare-fun next (Box) Box)
(declare-fun previous (Box) Box)
(declare-fun first (Box) Box)
(declare-fun last (Box) Box)

Here first and last are pointers to a box's first and last child; note that by going to the first child and then following previous/next pointers, you can get to any element. There's no way in Z3 to talk about arbitrary-size data structures¹⁰ [¹⁰ Ok, not true—there are arrays and sets—but they're expensive to reason about, so it's best to avoid them.] so this five-pointer scheme is what we use in Cassius to balance ease of access with ease of reasoning.

By the way, in actual Cassius we use a much weirder and more baroque scheme that involves inventing pointers for each box and indirecting through those pointers, but I no longer remember why I did it that way and think it was a mistake. Certainly this way works fine in toy Cassius, and I've been playing with a version in real Cassius that also seems to work.

Now we need to fill in pointers for each of the boxes. It's a bunch of code, but the output is a straightforward blog of equality assertions, so if you can't follow the code, don't sweat it:

(assert (is-box (parent ,(id doc))))
(assert (is-box (previous ,(id doc))))
(assert (is-box (next ,(id doc)))))

,@(for/reap [sow] ([elt (in-tree doc)])
    (match elt
      [`(square ())
       (sow `(assert (= (first ,(id elt)) no-box)))
       (sow `(assert (= (last ,(id elt)) no-box)))]
      [`(elt ([,props ,vals] ...) ,children ...)
       (sow `(assert (= (first ,(id elt))
                        ,(if (null? children) 'no-box (id (first children))))))
       (sow `(assert (= (last ,(id elt))
                        ,(if (null? children) 'no-box (id (last children))))))
       (for ([child children] [previous (cons #f children)]
             [next (cdr (append children (list #f)))])
         (sow `(assert (= (parent ,(id child)) ,(id elt))))
         (sow `(assert (= (previous ,(id child)) ,(if previous (id previous) 'no-box))))
         (sow `(assert (= (next ,(id child)) ,(if next (id next) 'no-box)))))]))

Here every element is responsible for setting the parent, next, and previous links of its children, plus its own first and last links, and that necessitates setting the parent, next, and previous links of the root element, doc, separately. I'm using a simple function called id that maps elements to box12 or whatever; it works like this:

(define id
  (let ([ids (for/hash ([elt (in-tree doc)] [id (in-naturals)])
               (values elt (sformat "box~a" id)))])
    (λ (elt) (dict-ref ids elt))))

I've now constituted the tree of boxes inside Z3. Next we have to assign their toy CSS properties to them. Since property values can be missing, I'm going to be using a custom data type for property values:

(declare-datatypes () ((PValue no-value (value (val Real)))))

We need only two properties, width and height; don't confuse these functions for the w and h fields that boxes have!

(declare-fun width (Box) PValue)
(declare-fun height (Box) PValue)

,@(for/reap [sow] ([elt (in-tree doc)])
    (match elt
        [`(square ())
         (sow `(assert (= (width ,(id elt)) no-value)))
         (sow `(assert (= (height ,(id elt)) no-value)))]
        [`(elt ([,props ,vals] ...) ,children ...)
         (define (prop name)
           (if (set-member? props name)
               `(value ,(dict-ref (map cons props vals) name))
               'no-value))
         (sow `(assert (= (width ,(id elt)) ,(prop 'width))))
         (sow `(assert (= (height ,(id elt)) ,(prop 'height))))]))

Again, it's a big mess of code with a lot of nested quasi-quotes and commas, but it just spits out the right assignments to width and height. By the way, here I'm defining squares to have no-value for width and height; this is a little bit of a hack, since it happens to be that the default behavior for squares is like they don't have a width set, so I will only need to special-case height later on. Sometimes similar hacks happen in real CSS, though, so it's not that unrealistic.

Ok, there's one final bit of information we need to get out of documents and into Z3: we need to know which boxes are squares:

(declare-fun is-square (Box) Bool)
,@(for/reap [sow] ([elt (in-tree doc)])
    (match elt
      [`(square ()) (sow `(assert (is-square ,(id elt))))]
      [`(elt ,_ ,_ ...) (sow `(assert (not (is-square ,(id elt)))))]))

Wow, that's a lot of information to unpack out of the toy web page into an SMT encoding! Luckily, this part of the code doesn't have to change that often. Now we are almost ready to define the rendering function. Before we do, we need to define a parameter for the browser width:

(declare-const view-width Real)
(assert (> view-width 0.0))

I've made the view width strictly positive. Negative view widths cause problems, and I don't see what good comes from a view width of 0.

Now, the rendering function:

(define-fun renders ((b Box)) Bool
  (and
   (= (y b) (ite (is-box (previous b))
                 (+ (y (previous b)) (h (previous b)))
                 (ite (is-box (parent b))
                      (y (parent b))
                      0.0))) ; Root box
   (= (w b) (ite (is-value (width b))
                 (val (width b))
                 (ite (is-box (parent b))
                      (w (parent b))
                      view-width)))
   (= (h b)
       (ite (is-square b)
            (w b)
            (ite (is-value (height b))
                 (val (height b))
                 (ite (is-box (first b))
                      (- (+ (y (last b)) (h (last b))) (y (first b)))
                      0.0))))))

You can see that for y we have three cases (first child, middle child, and root element), for w also three (set width, child, root) and for h four: (squares, set height, parent, and default). Of course in real CSS there are many, many more cases, and there are also lots of intermediate fields involved, but this toy model doesn't have all of that.

Now that we have a rendering function we must call it. Note that renders maps boxes to booleans. Instead of thinking of it as a function, think of it as a property of functions, which of course we now assert:

,@(for/reap [sow] ([elt (in-tree doc)])
    (sow `(assert (renders ,(id elt)))))

This all the encoding we have to do! It's a lot of code, I know (though, you know, this is the toy model). But that's because even basic geometric concepts plus a tree of boxes plus properties is already pretty complicated! It's what I built this toy model to show.

Using this model, you can render web pages by specifying the view-width and calling (check-sat) but I have an idea that I think would be a little bit more fun.

Using the SMT encoding

I thought it would be cool to build a basic verifier for our toy CSS. The basic idea is that if you want to test some property of a web page, you can assert that it is false. Then Z3 will look for a view-width where the assertion is false. If it finds one, that is a counterexample, plus Z3 will give you the complete rendering on which you can verify that the assertion is broken; or, if none are found, then the assertion is true for all view-width values, so it's been verified.

In code:

(define/contract (verify doc property)
  (-> template? boolean?)
  (define-values query (layout->z3 doc))
  (define query*
    `(,@query
      (assert (not ,property))))
  (match (z3-solve query*)
    [(list 'model model)
     false]
    [(list 'core _)
     true]))

Here I am using z3-solve from the Cassius code-base, which runs Z3 and communicates with it over standard in/out.¹¹ [¹¹ Yes, there's a Z3 API; no, I don't have a good reason to use the command-line instead. It was just the fastest way to get up and running.] You can see that I am adding the negated assertion, asking Z3 to check satisfiability, and returning false if a model is found (not verified) a true if a model is found (verified).

It works:

(verify test-page '(= (h box0) 7)) ⇝ #t

This assertion checks that the root box is 7 high, no matter the initial browser width.

From toy to reality

In toy Cassius, there aren't too many interesting assertions to check. However, in real Cassius, there are all sorts of assertions to check, from making sure buttons stay inside their toolbars to making sure elements don't leave a design grid. But the same basic approach—construct the render tree, define the rendering functions, and assert that they run—still applies, and I think that is really the core of my approach. But I wanted to list a few of the differences as well:

Real CSS needs to be matched and cascaded
The element tree, separate from the box tree, must exist and defines inheritance
There are different rendering modes in real CSS, from text to positioning to floats
There are a few special kinds of elements in real CSS, like images and text
Real CSS has various intermediate state at every box, like shrink-to-fit width or margin collapsing state
Real CSS is massive; Cassius supports 42 CSS properties at least partially
Real CSS has a lot of wacky edge cases
Fonts are a big pain in real CSS
There are browser stylesheets and browser defaults in real CSS

All of these require a lot of code and a lot of work to correctly define, and if you want to see more of that, I'd recommend diving into the Cassius source code.

Footnotes:

In toy CSS it is truly local, in real CSS it is only kind-of local, but the point mostly stands.

Real CSS has many more parameters, like the default font, the default font size, the browser height, the user's color preferences, as well as things like the browser being used and which resources from other sites are available.

In real CSS, width affects a lot more, thanks to text that breaks across lines, floating boxes, and other things like that.

⁴

And style sheets too, by the way.

⁵

They're called “contracts” and they're checked at runtime.

⁶

Here the <...> is not part of the code but something I'm leaving as I start to fill in the function.

⁷

Obscure fact! A simple example is a <b> that splits over two lines.

⁸

Roughly, if we have two equal boxes, Z3 decides that they must have equal pointers, which usually results in no-box = (box ...), which is false. Figuring out this bug cost me three hours.

⁹

Here, for/reap is a helper function I wrote that runs a for loop and returns a list of all things sow was invoked with, and sformat is string formatting (it replaces ~a by id) but it produces a symbol instead of a string.

¹⁰

Ok, not true—there are arrays and sets—but they're expensive to reason about, so it's best to avoid them.

¹¹

Yes, there's a Z3 API; no, I don't have a good reason to use the command-line instead. It was just the fastest way to get up and running.

By Pavel Panchekha

04 February 2018