Speeding up Skia using E-graphs
Skia is a fast and widely used 2D graphics library for tasks like rasterization. it's used in Google Chrome and Android, so speeding it up would have large impacts, and I think e-graphs have a lot of potential to drive an interesting research project in that direction.
A bit of Skia background
I don't know much about Skia, but I learned a lot from Chris Harrelson helping him edit a chapter about visual effects for the book on browsers we are writing together. Here's a quick summary of mental model, to give you a handle on what kind of things Skia does.
Skia has three main types of structured data in its API. The first is
shapes. These are things you can draw, stuff like circles, rounded
rectangles, and text (a very important type of shape). The second is
surfaces. A surface is a 2D array of pixels, which is where you draw
shapes. Note that Skia itself talks about surfaces using several
different surface-level API calls, like SaveLayer
and Surface
, but
internally at the end of the day Skia is either creating a 2D array of
pixels or not. Finally, there are paints, which are ways of drawing a
shape to a surface. There's more than one paint, because you can mix
colors in different ways, or clip things in various ways, draw things
rotated, and so on. Again note that Skia has multiple ways to change
paints, either via contexts or explicit paint objects, but again I'm
trying to abstract things into general concepts. There's actually
other stuff too, such as effects like blurs, but those are less
interesting to me at the moment.
The sequence of shapes, surfaces, and paints is like a program. It's not written as source code—it is expressed temporally through Skia API calls—but nonetheless it can be executed, it produces an output (typically a collection of output surfaces), and it has a clear semantics. In this code, we think of each paint as consuming time and each surface as consuming space. (Shapes don't really consume anything; in the Skia API they typically aren't even allocated data structures.)
Optimizing with E-graphs
Shapes, surfaces, and paints have various relationships. For example, rotating (a type of paint) a circle (a shape) has no effect. Or, painting a shape onto surface A and then painting that surface onto another surface B (something you can do) is the same as painting surface A onto surface B and then the shape onto surface B in some cases but not others. (It depends on the paints used!) So if you think of the sequence of shapes, surfaces, and paints as a program, and there being many equivalent programs, then some of those programs will be faster or use less space than others. Which makes optimizing Skia programs tempting. The browser book chapter has some examples of this kind of optimization.
E-graphs seem like a good match for this kind of optimization. E-graphs are good at equational reasoning with purely-functional ASTs, and Skia is kind of like this. Many of the equivalence relationships seem kind of equational-ish, and while the API is not purely functional (since painting to a surface modifies the surface) it's potentially possible to treat those surfaces as linear data structures and to enforce linearity when we extract from the e-graph instead of when synthesizing alternative implementations inside the e-graph itself. Also, while Skia programs are relatively large, they are still much smaller than, say, LLVM IR, and should fit into an e-graph comfortably.
Skia programs can be extracted from actual applications like Chrome. I
am told that code for doing this and generating "SkPicture" files
already exists. Now, what we'd extract would sit at a sort of
uncomfortable intermediate layer. What happens inside Chrome is that
Chrome generates its own representation of the graphical image it
wants to produce (called PaintOps), and then has a specialized pass
that walks this representation and generates a Skia program, applying
some optimization tricks.1 [1 I don't know the details here.] Then,
Skia itself also does optimization between its public API and its
internals, for example deciding whether paints with clipping are
implemented as actual paints or via extra surfaces. (This mostly
happens inside the GPU backend, which translates Skia API calls to
GrOp
types.) So it might be tempting to capture a Skia program at
some other layer (like internally within Skia to identify missing
optimizations that Skia could do, or internally inside Chrome to
identify missing optimizations Chrome could do). But on the other
hand, capturing data using already-existing code interfacing with a
public API is probably the best bet for sanity.
I think typically the goal should be optimizing for space, meaning number of surfaces. You could care about total surface memory or peak surface memory at any one time, but the reason this is the most important optimization goal is that surfaces typically live on the GPU, and GPU memory is limited. Moreover, at a higher level Chrome, the Skia client, will trade space for time (by caching more surfaces), so saving memory will turn into saving time as well.
One tricky bit to optimizing well is going to be handling error. When a shape is drawn to a surface, there is going to be some error due to rasterization. Error here means images don't look as sharp as they could. It's best to ignore this at least in the first attempt at this problem because my intuition is that programs that use fewer surfaces are also going to have less error (they rasterize less often). But you could also try to take this into account, either optimizing it locally (try to guarantee a maximally-sharp image) or trading off against it globally (try to optimize while keeping the image sufficiently sharp).
Impact
I think writing down a semantics for Skia programs, proving some equivalences about it, and building the infrastructure to extract Skia programs from Chrome are all interesting research tasks. They'd enable building an e-graph optimizer which would hopefully reveal some missing optimizations. I think the e-graph optimizer alone would be an interesting and valuable project, pushing the limits of e-graph optimization with things like linear reasoning or possibly in considering error.
If there are any interesting missing optimizations (probably?) they could be communicated to either the Skia or Chrome folks. This would have a large impact, since Android and Chrome together account for about 2.5 billion users. Overall, Skia is about 10% of Chrome runtime,2 [2 I went to Gmail and used the Performace tab in the devtools to record me opening a large email. Then in the "bottom-up" view in the bottom panel I looked for "Painting". In fairness, most of what we could optimize is the "Paint" subitem there, which was 2% in my test.] on a complex web page, so a 10% speedup to Skia would be a 1% speedup to Chrome, pretty valuable overall.
A longer-term impact could be to help Skia folks evaluate how new
primitives would affect optimization opportunities. For example, I
know that browser use cases in particular demanded an implementation
of the ClipRR
paint, which clips a shape to a rounded rectangle as it
is drawn. This paint, I believe, has fast implementations in some
backends but not others, and also while ClipRR
exists clips for other
shapes might not. Could the Skia folks use the e-graph optimizer to
evaluate the benefits of implementing this? For example, they might
add the primitive and see how much better the optimizer's results
would be. Importantly, since the e-graph optimizer requires relatively
little and relatively simple data (just algebraic reasoning) you could
do this before beginning the tricky work of actually implementing fast
GPU rasterization code.
Footnotes:
I don't know the details here.
I went to Gmail and used the Performace tab in the devtools to record me opening a large email. Then in the "bottom-up" view in the bottom panel I looked for "Painting". In fairness, most of what we could optimize is the "Paint" subitem there, which was 2% in my test.