Pavel Panchekha

By

Share under CC-BY-SA.

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.

Programs as Patchwork

One trend I'm seeing in PL research is “programs as patchwork”, where different aspects of one program are written in different DSLs.

Halide

The best example of this kind of work is Halide, a language for writing fast matrix operations (for graphical filters). Halide is originally a product of academia but is now widely used in industry.

For example, a 3×3 blur in Halide would be described by first defining the computation:

blur_x(x, y) = (input(x-1, y) + input(x, y) + input(x+1, y))/3;
blur_y(x, y) = (blur_x(x, y-1) + blur_x(x, y) + blur_x(x, y+1))/3;

Then, separately, you would define a schedule:

blur_y.tile(x, y, xi, yi, 256, 32).vectorize(xi, 8).parallel(y);
blur_x.compute_at(blur_y, x).vectorize(x, 8);

Here, the computation tells you how to compute each individual cell of the output matrix, from the input matrix; the schedule tells you what order to compute those individual cells in. For example, here, the schedule tells us that we should compute the X blur and the Y blur at the same time (using compute_at), and that we should compute the blur in 256×32 tiles.

Note that though Halide is nominally a single language, the computation and DSL sublanguages are effectively disjoint, and though the syntax for schedules uses dots to look like function calls, the calls don't return anything and so are more like declarations. It is a syntactic trick to make the language feel like one C++-like language.

Structured Parallelism

Another example from the research literature is the paper Farms, Pipes, Streams and Reforestation from ICFP’16, by D. Castro, K. Hammond, and S. Sarkar. This paper proposes a simple functional language for defining streaming functions, and a collection of primitives in that language that describe various structured techniques for introducing parallelism: pipelining, parallel farms, and so on.

For example, in their language one can write a simple quicksort algorithm as recursive partitioning and merging:

qsort_multiple = map (hylo merge partition)

Then, you can separately add predicates like farm, ||, and par to make this parallel:

qsort_multiple = par (farm n (fun (ana partition) || fun (cata merge)))

Here, we've indicated that to quicksort multiple lists, we want to set up n independent workers to split the lists among, and then have each worker set up a producer-consumer queue where the producer partitions lists and the consumer merges them.

In this paper, the two languages appear more mixed, but the parallel hints like par, farm, fun, and || have no effect on the value computed, and their validity is handled in the type system.

Venture

My final example is from Venture, a probabilistic programming language from V.K. Mansinghka's group at MIT. Like most probabilistic programming languages, Venture allows you to define a probabilistic computation using some unknown constants, and then infer those constants from a collection of observations. However, Venture also allows you to describe what algorithm that inference should use.

For example, you can define the model and observations:

assume x = normal(0, 1);
assume y = normal(0, 1);
observe normal(x + y, 1.0) = 3;

Then, to infer x and y, you could write:

repeat(1000, {
  infer gradient_ascent(minimal_subproblem([x, y], 0.01))
});
repeat(100, {
  infer resimulation_mh(minimal_subproblem(random_singleton([x, y])))
});

As before, the model and observations introduce the “what” but the inference code describes the “how”.

Why now?

I think a key reason these sort of “programs as patchwork” languages are emerging now due to a combination of two effects.

First, more and more effort is going into specialized programming languages, like for graphics or streaming, where the types of computations are limited (making design of the computation language easy) and secondary characteristics like parallelism are both important and uniform (making it worth designing a language for it).

Second, programming languages folk have gotten pretty good at defining type systems and static analyses, and in these patchwork languages that skill is put to use to make sure that the two language “stitch together”: that the secondary language can apply to the computation given, say. This makes two-language programs much easier to write.

Years ago Jerry Sussman introduced me to this idea under the name “partit” programming, with examples like annotating the precision of floating-point intermediates or the underlying algorithms for matrix operations. It seems like static analysis tools and the increasing interest in specialized programming languages have made this approach increasingly popular. I hope the trend continues.