Pavel Panchekha


Share under CC-BY-SA.

Taylor Expansions of Taylor Expansions

One of techniques Herbie uses to improve the accuracy of floating-point programs is to use a polynomial approximation to an expression when evaluating that expression directly would introduce too much rounding error.

For now, the polynomial approximations are Taylor expansions, which means that Herbie needs to know how to take Taylor expansions of symbolic expressions. The traditional way to do that would be to write a symbolic differentiator, but I went a different route with Herbie, and the results turned out to be mathematically very pretty.

Taylor Expansions

As far as Herbie is concerned, a Taylor expansion for a function \(f(x)\) is a stream of coefficients \([a_0, a_1, \dotsc]\), such that \[f(x) = a_0 + a_1 x + a_2 x^2 + \dotsb.\] Whether or not this series converges isn’t really important to Herbie, so we’re going to ignore that question.

The problem is to find Taylor series of functions. For simple functions, like square roots or exponentials, Taylor expansions are well-known and can just be programmed in. But for composed functions (like \(\exp(\exp(x))\) or \(\sin(x - x^2)\)), formulas generally aren’t known. I needed some algorithm to find these Taylor expansions automatically.

Things that don’t work

Mathematically, finding a Taylor expansion can be done by computing derivatives: \(a_i = D^i(f)(0) / n!\). This is easy to do by hand for simple functions like square roots or exponentials, but for compositions it is much more tedious. This tedium translates to slow programs: taking Taylor expansions on the computer by taking derivatives is exponential in many cases where all the functions involved are well-behaved.

Another approach might be to take single-variable Taylor expansions, and then substitute in the results. For example, to take the Taylor expansion of \(\exp(\exp(x))\), we first recall that \[ \exp(x) = 1 + x + \frac12 x^2 + \frac16 x^3 + \dotsb; \] we then substitute that in for \(x\) to get

\begin{align*} \exp(\exp(x)) &= 1 \\ &+\, (1 + x + \frac12 x^2 + \dotsb) \\ &+ \frac12 (1 + x + \frac12 x^2 + \dotsb)^2 \\ &+\, \vdots. \end{align*}

Unfortunately, this doesn’t give us a Taylor expansion. We first need to expand every term of the form \((1 + x + \frac12 x^2 + \dotsb)^n\), and then regroup terms by their power of \(x\). Worse, each of the infinitely many expansions will yield a power of \(x\), meaning that each coefficient of the Taylor series involves summing an infinite series. I’m not going to program Herbie to do that.

Things that work

Instead of doing either of these things, it’s better to use function-specific tricks. Instead of a general Taylor-series-composition mechanism, Herbie special-cases taking the Taylor series of arithmetic, exponential, or trigonometric functions, given Taylor series for their arguments. It then recursively constructs the Taylor series of a whole expression by working up from the leaves.

The arithmetic functions are easiest. Given \(f = a_0 + a_1 x + a_2 x^2 + \dotsb\) and \(g = b_0 + b_1 x + b_2 x^2 + \dotsb\), what is \(f + g\)? Ignore issues of convergence and pretend we’re working with formal power series; then we can just rearrange terms to get

\begin{align*} f + g &= (a_0 + a_1 x + a_2 x^2 + \dotsb) + (b_0 + b_1 x + b_2 x^2 + \dotsb) \\ &= (a_0 + b_0) + (a_1 + b_1) x + (a_2 + b_2) x^2 + \dotsb. \end{align*}

Multiplication (\(f \cdot g\)) is a bit more complex, since we can get an \(x^n\) term in the resulting power series by multiplying \(a_0 \cdot b_n x^n\), or \(a_1 x \cdot b_{n-1} x^{n-1}\), or any other combination of terms whose powers sum to \(n\). Division involves finding \(h = f / g\), which is best done by instead solving the linear equations of \(h \cdot g = f\).

Exponential functions

But the nicest trick for taking Taylor expansions is the way Herbie handles exponentials and trigonometric functions. Suppose you want to find the Taylor expansion of \[ h = e^{a_0 + a_1 x + a_2 x^2 + \dotsb}. \] Rewrite this into an infinite product: \[ h = e^{a_0} e^{a_1 x} e^{a_2 x^2} \dotsb. \] Each of the terms in this product have easy Taylor expansions, since the expansion for \(e^x\) is well-known. Substituting in those expansions gives

\begin{align*} h &= e^{a_0} \\ & \times (1 + a_1 x + \frac12 a_1^2 x^2 + \dotsb) \\ & \times (1 + a_2 x^2 + \frac12 a_2^2 x^4 + \dotsb) \\ & \vdots \end{align*}

This is still an infinite product, so it might not look like any progress has been made, but in fact we’re almost done. The key is that for any coefficient you want, say \(x^i\), only the first \(i\) terms of this infinite product can contribute any terms. Later series don’t have any non-constant terms with powers less than \(x^{i+1}\), so they must contribute only the constant \(1\) term to the product, and can be ignored.

What do those first \(i\) terms contribute? Looking at the first few cases suggests a pattern:

\begin{align*} h &= e^{a_0} (1 + a_1 x + \left(\frac12 a_1^2 + a_2\right) x^2 + \left(\frac16 a_1^3 + a_1 a_2\right) x^3 \\ &+\, \left(\frac1{24} a_1^4 + \frac12 a_1^2 \frac12 a_2^2 + a_1 a_3 + \frac12 a_2^2\right) x^4 + \dotsb \end{align*}

Each \(x^n\) coefficient is sum of terms, which are each products of \(a_l^k / k!\). For each product, the pairs of \(l\) and \(k\) form a partition of \(n\), a way of writing \(n\) as the sum of products \(k_1 l_1 + k_2 l_2 + \dotsb\), where each \(l\) is distinct.

This gives us our final formula: \[ \exp(a_0 + a_1 x + a_2 x^2 + \dotsb) = \sum_n e^{a_0} x^n \sum_{\lambda \vdash n} \prod_{k \cdot l \in \lambda} \frac{a_l^k}{k!}, \] where the notation \(\lambda \vdash n\) means that \(\lambda\) is a partition of \(n\). Finding the partitions of a number is easy to do recursively, and it can be done once and cached, so this formula allows quickly taking Taylor series of exponentials.

Taylor series of trigonometric functions

This trick for exponentials can be leveraged to find Taylor expansions for sines and cosines. Recall that \[\cos(x) = \Re(e^{ix}) \text{ and } \sin(x) = \Im(e^{ix}). \] Then since we can take Taylor expansions of exponentials, we can do the same for sines and cosines.

Let’s look at cosines, since sines are largely identical. \(e^{ix}\), according to the formula we just derived, is \[ \exp(ia_0 + ia_1 x + ia_2 x^2 + \dotsb) = \sum_n e^{ia_0} x^n \sum_{\lambda \vdash n} \prod_{k \cdot l \in \lambda} i^k \frac{a_l^k}{k!}. \] Getting the real part of this series is easy if \(e^{i a_0}\) is real, so let’s assume for now that \(a_0 = 0\). Then we just ignore the terms in that inner sum where the sum of all the \(k\) is odd, and flip the signs of terms where the same sum of the \(k\) is divisible by 2 but not 4.

One final trick is to avoid the assumption \(a_0 = 0\). Recall the identity \(\cos(a_0 + y) = \cos(a_0) \cos(y) - \sin(a_0) \sin(y)\); just stick the rest of the Taylor series in for \(y\). Both \(\cos(y)\) and \(\sin(y)\) are now easy to take, since there is no constant term, and \(\cos(a_0)\) and \(\sin(a_0)\) are constants. So this lets you compute Taylor expansions of arbitrary combinations of arithmetic, trigonometric, and exponential functions.

Still open

I’m struggling to find a similar trick for logarithms, which would also let me do powers using the formula \(a^b = \exp(b \log(a))\). Looks like I’ll implement a very special-purpose symbolic differentiator to handle this case. If you have any ideas, shoot.

I’m really happy with this trick for exponential and trigonometric functions, not just because it let me implement a more efficient algorithm in Herbie, but also because it’s a neat trick. It’s often hard to explain to people without mathematical training why math is so much fun. And I’ve found that math is a lot like reading a good book. As you go, you see broad parallels, deep and subtle patterns, and analogies between different characters; but also these small tricks, these neat problems, these fantastic turns of phrase. Since I’m not a mathematician, I don’t really have the time to get deeply into the second—but my how wonderful those turns of phrase are. This trick isn’t widely applicable outside this problem. But who cares? There it is, and it is beautiful.