# Quadrature in Bash

A long time ago, I was taking numerical analysis class where the instructor allowed us to use any language. Mine were all over the map^{1}^{1} looks like I used C, Go, JavaScript, OCaml, Python, Perl, Factor, Julia, J, SmallTalk, Fortran 2003, and Bash; the selection was limited to languages that supported floating point and had a collection of mathematical functions implemented., but my favorite was the quadrature algorithm I wrote in Bash.

## Quadrature

Quadrature is what numerical analysis call integrating a function.^{2}^{2} Integration is their word for solving a differential equation. There are many algorithms for quadrature,^{3}^{3} I would implement Clenshaw-Curtis quadrature if I had to pick. but the point of this assignment was to implement Newton-Cotes quadrature. This involve approximating the integral by:

- Splitting the domain of integration into \(N\) pieces,
- Fitting an $n$-th order polynomial to each piece, and
- Summing the analytic integral of each piece.

In practice, to evaluate \(\int f\), these steps evaluate \(f\) at evenly-spaced points and sum the results multiplied by weights. For example, Simpson's rule quadrature samples, for each interval \([0, 1]\) that the domain of integration is split into, evaluates \(\frac16 f(0) + \frac23 f(\frac12) + \frac16 f(1)\),^{4}^{4} Note that each interval shares endpoints with the next and previous intervals, so in practice you evaluate \[\frac16 f(0) + \frac23 f(\frac12) + \frac13 f(1) + \frac23 f(\frac32) + \frac13 f(2) + \dotsb\] and the points evaluated and the weights differ for different quadrature algorithms. So an integration algorithm naturally decomposes into a few steps:

- Generate the points to evaluate \(f\) on, and evaluate \(f\)
- Generate the weights and multiply
- Sum the results

## Multiple rules in Bash

One of my goals was to support multiple Newton-Cotes rules in one program.^{5}^{5} Because the assignment asked for results from several rules. That is, I want to source my Bash file and run:

integrate "sin(X*X) / sqrt(X*X + 1)" -n 10001 -r trap -s recursive --rhomberg

This integrates \(\sin x^2 / \sqrt{x^2 + 1}\) using 10001 points with the trapezoid rule, using pairwise summation and rhomberg integration to improve accuracy.

The first step here is to support multiple Newton-Cotes rules. My implementation had three: rectangular (order 1), trapezoidal (order 2), and Simpson's rule (order 4). I went with an "object-oriented" approach for defining these in Bash. For example, for Simpson's rule I defined `simp_points`

, `simp_weights`

, and `simp_order`

, where `simp_points`

picked out points to evaluate \(f\) on, `simp_weights`

chose the weights for each point, and `simp_order`

was a constant for the order of the rule. If the bash variable `$RULE`

stored the name of the rule being used, I could use `${RULE}_points`

and similar to call the appropriate function.

For the rectangular rule, which evaluates on each interval \(f(0)\), these definitions are:

rect_order=1 rect_points() { seq 0 `calc "1/$1"` 1 | head -n-1 } rect_weights() { WEIGHT=`calc 1/$1` for i in `seq 1 $1`; do echo $WEIGHT done }

The rectangular rule is first-order method, hence `rect_order`

. The domain of integration is fixed to \([0, 1]\) in my program, so `rect_points`

takes \(N\) as an argument and outputs that many points between 0 and 1. Note that, for example, `seq 0 0.25 1`

prints out five numbers, so I threw in a `head`

call to strip off the last one; a call to `head`

with `-n-$X`

prints all but the last `$X`

lines.

Finally, for outputting the weights, I want to output \(1/N\) for each point, corresponding to each of \(N\) intervals having a weight of \(1\) on their point. I first compute the weight \(1/N\) and store it in a variable, then print it \(N\) times. The `calc`

command I use to compute \(1/N\) is

calc() { echo "$@; exit" | bc -lq | head -n1 }

This little thing passes its arguments to `bc`

to evaluate, then reads out the answer.

The code for the trapezoidal and Simpson's rules is similar, but gets a little tricker as there are multiple weights. For example, the program that outputs weights for Simpson's rule is

N=$1 W1=`calc 1/6/$N` W2=`calc 2/6/$N` W4=`calc 4/6/$N` echo $W1 for i in `seq 2 2 $[N-2]`; do echo $W4 echo $W2 done echo $W4 echo $W1

## Basic Quadrature

Now that I have various rules implemented, I can do a first cut of integration. I have to generate points and weights, evaluate the program on the points and multiply by weights, and sum.

To generate points and weights,

paste <(${RULE}_weights $N) <(${RULE}_points $N)

I'm using the "object-orientation" trick from above, the `paste`

command (which takes two files and concatenates the first line from each, then the second lines from each, and so on), and Bash's pipe redirection syntax: `<(cmd)`

as an argument creates a named pipe, pipes `cmd`

into it, and gives the name of that pipe as an argument. You can think of the above as

${RULE}_weights $N > /tmp/a ${RULE}_points $N > /tmp/b paste a b

But the pipe redirection trick doesn't involve generating your own names, or remembering to clean them up, and lets you use the streaming features of pipes.

Once the points and weights are generated, I have to evaluate \(f\) on the points and multiply by the weights. I'll do that by running `awk`

to process each line by itself. My integrator takes an expression in `X`

as an argument; I want to turn that into `awk`

code.

```
awk -e "{ printf \"%0.20f\\n\", (\$1 * ${EXPR//X/\$2}) }"
```

`${EXPR//X/\$2}`

replaces each instance of `X`

by `$2`

(which will be bound to the point). Running this `awk`

program on the output of the `paste`

command evaluates the expression on `$2`

, the point, and multiplies by `$1`

, the weight, turning each "weight, point" line into a line with the single float answer.

Finally, these values need to be summed. The easiest way to do this is a simple `awk`

script:

```
awk -e 'BEGIN {sum = 0} { sum += $1 } END { printf "%.20f\n", sum }'
```

Gluing these steps together yields a very slow but pretty accurate quadrature algorithm.

## Recursive summation

However, there's a pernicious source of error that I needed to eliminate. Adding floating point numbers one after another, like I did above, can be inaccurate when summing a lot of numbers. That blog post advocates Kahan summation, which is really good for being very accurate, but my assignment suggested pairwise summation, which is where to sum \(N\) numbers you add every adjacent pair (reducing the original list to \(N/2\) numbers), and then recursively do the same until you have one number left.

Doing this recursively in Bash is actually quite a pain! To start, I had to do the body of the loop: pair numbers and sum them.

pairer() { sed '$!N;s/\n/ /' } sum_pairs() { pairer | awk -e '{ printf "%0.20f\n", ($1 + $2) }' }

The `sed`

script pairs numbers by replacing every other newline with a space. At least, I think that's what it does. To be honest, I can no longer read and understand that line of code. The `sum_pairs`

function just uses an `awk`

script to sum each pair.
Pairwise summation is close at hand!

Proper pairwise summation is a recursive algorithm: if you have one number, it is your sum; otherwise, sum pairs and recurse. Doing this test in Bash is a little hard: you need to read the pipe, and recurse only if there are at least two lines. I'd wanted to get away without using any temporary files, but I didn't find a way to do that, so the code looks like

TMP=`mktemp` cat >$TMP if [ `wc -l < $TMP` -eq 1 ]; then cat $TMP rm $TMP else cat $TMP | sum_pairs | recursive_summer rm $TMP fi

I write the input lines to a temporary file and count the lines. If there's one line, I just print it; otherwise I print the file, sum pairs, and recurse.

## Romberg Integration

Romberg integration is improves the accuracy of a quadrature routine by evaluating the same rule with \(N\) and \(2N\) points, and extrapolating the results. If an integration rule produces answer \(s(N) = a + c N^{-k} + \dotsb\), where \(a\) is the integral's true value, then \((s(N) - 2^k s(2N)) / (1 - 2^k)\) cancels the two \(c N^{-k}\) terms leaving just terms of lower order. So, it gives a more accurate integration rule. I implement this with

VAR=${RULE}_order O=${!VAR} r1=`integrate $[N / 3] $problem $rule $summer` r2=`integrate $[2 * N / 3] $problem $rule $summer` calc "($r1 - 2^$O * $r2) / (1 - 2^$O)"

First, I dereference the variable `${RULE}_order`

, then integrate with \(N/3\) and \(2N/3\) points (so that in total I evaluate the function \(N\) times), and finally do the extrapolation.

^{1}

looks like I used C, Go, JavaScript, OCaml, Python, Perl, Factor, Julia, J, SmallTalk, Fortran 2003, and Bash; the selection was limited to languages that supported floating point and had a collection of mathematical functions implemented.

^{2}

Integration is their word for solving a differential equation.

^{3}

I would implement Clenshaw-Curtis quadrature if I had to pick.

^{4}

Note that each interval shares endpoints with the next and previous intervals, so in practice you evaluate \[\frac16 f(0) + \frac23 f(\frac12) + \frac13 f(1) + \frac23 f(\frac32) + \frac13 f(2) + \dotsb\]

^{5}

Because the assignment asked for results from several rules.