A DSL for Math Library Implementation

Implementing a mathematical function like sin or exp is tricky! Yet if you want to trade off accuracy and input range, for speed or vectorization, you'll need to do that, and likely once for each architecture you need an implementation for. In the process, you'll need to decide on range reduction strategies, working precisions, polynomial orders, and function-specific tricks. So my student Ian Briggs and I are working to make it easier.

I don't want to get deep into the weeds here, so I'm just going to sketch out the overall approach. Our goal is a safe, verifiable, and productive numerical library workbench.

Numerical Assembler

What makes a library function implementation? This is a tricky question, because we are not looking for an exact implementation: we will always have some rounding error, and often we are willing to be quite inaccurate for the purpose of speed. But arbitrary code shouldn't qualify. This is not an implementation of sin, for example:

float sin(float x) {
    system("rm -rf .")
}

So the the lowest level of our workbench is a "numerical assembly language" with primitives like evaluating polynomials, doing additive and multiplicative reductions and reconstructions, and using lookup tables and branches. Despite the name, numerical assembly lives at roughly the same level as C, except that it has no access to wild pointers or system calls. The safe assembler looks something like this:

impl sin(float x) {
    split_float x -> s e m
    mk_float 1 e m -> y
    poly[1: 1.0, 3: 0.167] y -> z
    mul z s -> r
    ret r
}

This code snippet removes the sign bit from x, evaluates a polynomial at the resulting absolute value, and multiplies again by the sign bit, leveraging the fact that sin is an odd function. The numerical assembler compiles to C, so redundant operations are optimized out and so we don't need to concern ourselves with instruction selection.

However, you can still write nonsense in this numerical assembler. For example, this isn't an implementation of sin either:

float sin(float x) {
    int t = * (int *) (float *) &x;
    int e = (t & 0x7f800000 >> 23) - 127;
    float z = (float) (t ^ (e << 23) | (0x7f << 23)) - 1.0f;
    float pz = z * (1.0 - ...);
    return pz + e * 0.693...;
}

One sense in which it's clearly not an implementation of sin is that it's an implementation of log… but if it weren't it wouldn't necessarily be any more of a sin implementation! And though it's true that this is not a particularly accurate implementation of sine, it's accurate on at least a few points, and besides that it's not like this is a very inaccurate implementation of sin. It just isn't an implementation at all.

Numerical Lambda Calculus

So the next step of our workbench is a "numerical lambda calculus". The terms in this calculus describe implementations of math functions, and are typed to ensure that they are sensible. Specifically, terms have types like Impl<f, a, b>, which describes implementations of the mathematical function f for inputs between a and b inclusive. These implementations can be as accurate or inaccrate as you want—but the type discipline constrains what you can do. Let me give you some examples.

The simplest term of the lambda calculus refers to existing math libraries. For example, the term GLibC.sinf has type Impl<sin, -inf, +inf>; in other words, GLibC's sinf function implements the mathematical sin function over the complete real line. Another term of the lambda calculus is HornerPolynomial<f, a, b>([a0, a1, ...]), which has type Impl<f, a, b>. In other words, any particular function can be implemented by directly giving its polynomial.

Other terms might combine implementations. For example, consider the term DoubleAngle<a, b>(i1, i2), which has type:

Impl<sin, a, b> * Impl<cos, a, b>
->
Impl<sin, 2a, 2b> * Impl<cos, 2a, 2b>

In other words, the double angle formula can be used on implementations of sine and cosine to double their range. This is valid:

let sin_impl = HornerPolynomial([ ... ]) in
let cos_impl = HornerPolynomial([ ... ]) in
fst(DoubleAngle<0, pi/4>(sin_impl, cos_impl))

However, this is not:

fst(DoubleAngle<0, pi/4>(GLibC.expf, GLibC.y0))

Even though that could be compiled to C code and would compute floating-point ouputs, it's nonsensical to describe this as implementing the double angle identity! It just computes some random stuff.

What's most interesting about tracking what function you are implementing is the ability to leverage and check mathematical identities. For example, for odd functions, it's common to implement a math function using the absolute value trick demonstrated above. That's represented by the AbsReduction<f, b> term, which has type:

∀ x, -b ≤ x ≤ b → f(-x) = -f(x)
-------------------------------
Impl<f, 0, b> -> Impl<f, -b, b>

Here the type contains a mathematical identity: for an invocation of AbsReduction to be valid, a certain mathematical fact has to be true about the underlying function. This constrains the sorts of reductions and operations that are allowed, and guarantees that library function implementations are not only "safe" but also "sensible".

Type Checking and Soundness

The type system this requires is simple but pretty unusual. On the one hand, the langauge does not seem to need complicated type dependencies or generics, which is a huge relief. However, it does have a few nontrivial equality and subtyping rules. First, the range of validity creates subtypes:

     a ≤ c          d ≤ b
------------------------------
Impl<f, a, b> ◁ Impl<f, c, d>

Secondly, mathematical identities induce type equalities:

∀ x, -a ≤ x ≤ b → f(x) = g(x)
------------------------------
Impl<f, a, b> = Impl<g, a, b>

This rule lets you implement sin(x) via cos(pi/2 - x).

To ensure soundness, each type Impl<f, a, b> corresponds to a pointed set of functions \(\mathbb{R} \to \mathbb{R}\), with the mathematical function \(f\) as the distinguished point. Most of these functions are not implementable in floating point but that lets us state our soundness property.

Soundness requires that functions correspond to maps between pointed sets. In less abstract terms, it requires function Impl<f, a, b> -> Impl<g, c, d> must map the true function \(f\) to the true function \(g\).

Writing out type parameters everywhere to explain what function you're implementing, and on what range, is annoying, so we'll need a type inference procedure. I'm hoping that there is a principle type theorem for the type system, so that it's at least doable. Without generics that might be true!

Both checking and inference are going to require proving mathematical identities. We'll probably use something like Mathematica here. Or we might even just use a hard-coded list of identities. Or e-graphs! It seems like an orthogonal problem and it's not clear that this part needs to be very good for the overall tool to work well.

Verification

We're planning pluggable verifiers that bound the worst case accuracy of a given function implementation. Tools like FPTaylor can bound worst case error, and we'd apply them on smaller or larger chunks of the expression, like in Satire. What's difficult here is that math libraries often need to do mixed integer/float computations, and existing FP verification tools aren't made for that. But we could try to turn a single integer/float verification problem into some large number of float-only queries to FPTaylor.

Synthesis

Now that we have a language where we can only express sensible implementations of various math functions, we can directly enumerate all possible valid terms in the language, and that would give us a huge number of implementations of whatever function the user is interested in. So instead of having to hand-write an implementation, the user could generate a lot of implementations, and then select the one with the best speed / range / accuracy / whatever tradeoff in their particular use case. If multiple architectures are in play, that would be even more attractive.

Of course directly enumerating polynomial coefficients isn't a good idea. Our synthesizer will also build in tools like Sollya or the new RLibM techniques to generate polynomial terms, and similar for generating lookup tables or whatnot.

By Pavel Panchekha

28 January 2021