Pavel Panchekha

By

Share under CC-BY-SA.

Sequences, Series, and Recursion

This page follows a lecture given to the AAST math team in January 2010. The original PDFs are available:

This version follows slightly nicer notation, and has otherwise been proofread.

Table of Contents

Note

Before we begin, I'd like to extend thanks to Ben Alpert and Ben Kraft for reading and revising drafts of this talk.

Formalisms

Sequences are a common topic of math competition questions and, in general, are something you should know about. Formally, sequences \(a\) or \((a)_n\) is an ordered sequences of numbers \(a_1, a_2, a_3, \dots\), which may be infinite (we generally assume that it is). Depending on whom you ask, a sequence starts at either \(a_0\) or \(a_1\); the common mathematical convention is \(a_1\), so that's what we'll use here. Often, elements of a sequence are integral, but that isn't necessary. Usually, a sequence is generated by some rule, but neither is that necessary. This is stuff all of you already know.

We can also define the sum of the first \(n\) elements of a sequences as \(a_1 + a_2 + a_3 + \dots + a_n\), also denoted by \(\sum_{k=1}^n a_k\). We can also define the finite sum \(\sum_{k=1}^\infty a_k = \sum_k a_k\) as the limit of these "partial sums".

Basic Sequences Types

The most basic sequence types are arithmetic and geometric sequences. An arithmetic sequence is given recursively by \(a_n = a_{n-1} + d\), and a geometric sequence by \(a_n = a_{n-1} r\), with \(a_1\) and \(r\) or \(d\) given. It's pretty easy to see the closed-form (non-recursive) way to represent these sequences: in an arithmetic sequence we add \(d\) \(n-1\) times to get \(a_n = a_1 + (n-1) d\), and in a geometric sequence we instead multiply that many times to arrive at \(a_n = a_1 r^{n-1}\).

Now, what about sums of these sequences? Well, for an arithmetic sequence, we can take the elements two at a time: first and last, second and second-to-last, and so on. Notice that all of these pairs have the same sum. Now, if we have an even number of elements, we'll have a whole number of pairs, with each pair having sum \((a_1 + k d) + (a_1 + (n-1-k)d) = 2a_1 + (n-1)d\). Since there are \(\frac{n}2\) such pairs, the total sum is \[a_1 n + d \frac{n (n-1)}2.\]

On the other hand, if we have an odd number of terms, we have \((n-1)/2\) pairs of size \(2 a_1 + (n - 1) d\) and a singleton of half that size, so the final result is \[\frac{n-1}2 (2 a_1 + (n-1)d) + \frac12(2a_1 + (n-1)d) = a_1 n + d \frac{n (n-1)}2.\]

That's the same as the other case, which is nice.

Now for geometric sequences. We're going to start with an infinite sum. Take \(a_1 + a_1r + a_1r^2 + \dots\), and let it equal \(S\). Then \(r S = S-a_1\). Some quick algebraic manipulation gives \(S = a_1 / (1 - r)\). Now, to get the $n$-th partial sum, we just take \(S - r^n S\), so the $n$-th partial sum of any geometric sequence must be \(a_1 (r^n-1)/(r-1)\).

But you knew all of this too. Let's move on to something more fun!

Finite Differences

A good tool to have when attacking sequences is finite differences. This approach is much simpler than characteristic polynomials and works if you just want the next few terms in a sequence. The finite differences of a sequence \((a)_n\) is the sequence \(b_n = a_{n+1} - a_n\). for example, if you have the sequence \(1, 4, 9, 16, 25, \dots\), the finite differences are the sequence \(3, 5, 7, 9, \dots\) (prove it!). The finite differences of this sequence are \(2, 2, 2, \dots\), and so on.

Now, some properties are obvious; the finite differences of a linear sequence are constant, and the finite differences of a geometric sequence are \(a_1 (r - 1) r^n\)

Problem Section 1

  1. Above, we saw that \(n^2\), a quadratic, had a linear sequence as its finite differences. Prove this for the general quadratic.
  2. What would be the corresponding theorem for cubics? Prove it.
  3. Extend the above to general $n$-th order polynomials.

Recursively Defined Sequences

In the very general case, recursively defined sequences are those of the form \[a_n = f(a_1, a_2, a_3, \dots, a_{n-1}) + g(n);\] it's called homogeneous if the \(g(n)\) term is zero, and we call it $k$-th order if \(f\) is independent of all but the last \(k\) terms. This form is much too general to do anything useful with, so we're going to worry here only about recursively defined sequences where \(f\) is linear and of finite order. And for now, we'll assume everything's homogeneous. So we're worrying about sequences of the form \[a_n = b_1 a_{n-1} + b_2 a_{n-2} + \dots + b_k a_{n-k.}\]

Now, Richard Feynman once said,

Classification of mathematical problems as linear and nonlinear is like classification of the universe as bananas and non-bananas.

Alas, it's true. But linear problems are usually the only solvable ones, so we just have to deal1 [1 Even very simple non-linear recurrences can have crazy behavior. For example, \(a_{n+1} = 3.5 a_n (1 - a_n)\) has chaotic behavior — try it with \(a_1 = 1/3\) and \(a_1 = 1.0000001/3\) and watch the solutions diverge completely. This is the so-called butterfly effect: errors grow unboundedly in most non-linear systems, an effect which does not occur in linear systems.].

Let's look at a problem that has to do with linear recurrence relations of this sort. We'll use a rather old problem (party like it's 1202!), but it's still a good example of the techniques we'll be using.

You have the following model of rabbit growth, simplified so that rabbits don't die and reproduce asexually. At first, you have one immature rabbit. Each month, all immature rabbits mature and all mature rabbits give birth to an immature rabbit. It's not that hard to see that the number of months any given month is the number of rabbits last two months ago (the now-mature ones) plus the number of rabbits one month ago (the newly-born ones). Thus \(R_{n+2} = R_{n+1} + R_n\). This is a cool equation, but our goal is to get a closed form.

What we'll do is make a guess: we'll suppose that \(R_n = \lambda^n\). In actuality this is false, since that would imply that \(1 = R_1 = \lambda\), but bear with me — we're making this assumption in the hope that it will lead to insight into the problem2 [2 The technical, German, term is "ansatz".]. Let's substitute our guess in.

We find that \(\lambda^n = \lambda^{n-1} + \lambda^{n-2}\), so \(\lambda^2 - \lambda - 1 = 0\). This is the characteristic polynomial of our recurrence relation. In this case, it's a quadratic, with two solutions: \(\frac12(1 \pm \sqrt5)\). Now here comes the big idea: if some sequences \((a)_n\) and \((b)_n\) satisfy our recurrence, then \(\alpha (a)_n + \beta (b)_n\) clearly also does (check!). So, calling our two roots there \(\phi\) and \(\phi'\)3 [3 As it turns out (check!), this is \(-1/\phi\).], we can say that \(A\phi^n + B\phi'^n\). We now need it to satisfy the starting values, so we plug in \(R_1 = R_2 = 1\) and solve for \(A\) and \(B\):

\begin{align*} A \phi + B \phi' = A \phi + B (1 - \phi) & = 1 \\ A \phi^2 + B \phi'^2 = A (1 + \phi) + B (2 - \phi) & = 1 \\ B + (A - B) \phi & = 1 \\ A + 2B + (A - B)\phi & = 1 \\ A + B & = 0 \\ B - 2B \phi & = 1 \\ B = 1 / (1-2\phi) & = 1/\sqrt{5} \end{align*}
Note

In general, if \(a_0 = 0\) and \(a_1 = 1\), both coefficients \(A\) and \(B\) are the same and are equal to \(1 / \sqrt{D}\), where \(D\) is the discriminant of the characteristic polynomial. In this case we did not have an \(a_0\) term, but we can work out from \(a_0 + a_1 = a_2\) that it would have to be \(0\) if we did.

Thus we find that \[R_n = \frac1{\sqrt5} (\phi^n - (-\phi)^{-n}).\]

Note

In fact, since \(\phi'\) is less than \(\frac12\) for \(n \ge 2\), we can just ignore it, instead using the simpler formula \(R_n \approx \phi^n / \sqrt5\). Since this is never off by more than \(\frac12\), we can just round the result to get the exact answer.

In general, we write the characteristic polynomial by changing all of the \(a_{n+k}\) terms of a linear recurrence relation to \(\lambda^k\). We can then find the roots \(r_0, r_1, \dots, r_k\) and write the recurrence relation in closed form as \(a_n = C_0 r_0^n + C_1 r_1^n + \dots + C_k r_k^n\), where we solve for the constants \(C_i\) from the initial conditions. Of course, some of the roots might be complex, but you just go ahead and everything will work out (promise!). An example of this is the sequence \(a_{n+2} = -a_n\), whose characteristic polynomial \(\lambda^2 + 1\) clearly has no root but which can nonetheless be solved as \(a_n = \frac12 i(i^{-n} - i^n)\).

Problem Section 2

  1. The "Tribonacci" sequence is defined by \(T_{n+3} = T_{n+2} + T_{n+1} + T_n\) and starting values \(T_1 = T_2 = T_3 = 1\). What is the smallest \(n\) for which \(T_n\) is over 9000? Over \(10^{10}\)? A calculator might be helpful, but don't just brute force the answer.
  2. Given three initial values, what does the sequence \(a_{n+3} = 3a_{n+2} - 3a_{n+1} + a_n\) represent?

Multiple Roots and Polynomial Approximations

What happens if you go through the procedure above and get multiple roots? Well, and this is a claim that I won't prove4 [4 You can prove it using the matrix idea at the end of this page and knowledge of matrix exponentiation.], what happens is that the coefficient, instead of being a constant, is instead a polynomial, with however many terms as necessary to give you the right number of free variables.

For example, let's talk about the sequence \(a_{n+2} = 2 a_{n+1} - a_n\). If we try a few different starting values, we find examples like \(2, 7, 12, 17, \dots\) and \(3, 5, 7, 9, \dots\). It seems clear that this recurrence relation generates arithmetic sequences. Why? Well, the characteristic polynomial is \(\lambda^2 - \lambda + 1\). This has a double root at \(1\), so the closed form is \(a_n = (A + B n) 1^n\), which simplifies to \(a_n = A + B n\). Thus this recurrence indeed does produce arithmetic progressions. Another way to think about it is that it does linear extrapolation.

Let's generalize this idea! Suppose we want to do a cubic extrapolation of \(1, 2, 5, 3\). Well, we form the recurrence relation \(a_{n+4} = 4 a_{n+3} - 6 a_{n+2} + 4 a_{n+1} - a_n\), which has a quadruple root at \(1\). Plugging in the values we already have gives \(4 \cdot 3 - 6 \cdot 5 + 4 \cdot 2 - 1 \cdot 1 = 12 - 30 + 8 - 1 = -11\).

This also proves the somewhat nontrivial property that if an $n$-th order polynomial is integral at \(n+1\) consecutive points, it is integral for any integer argument.

Note

Most of the applications of this method can be rephrased as using finite differences. But this is a powerful general technique, so it's good to know.

Problem Section 3

  1. Find a recursive definition for the sequence whose closed form is \(a_n = (n^2 + 1) 2^n + 1\).
  2. A 3rd-order polynomial \(P\) has the property that \(P(1) = 1\), \(P(2) = 18\), \(P(4) = 17\), and \(P(5) = 23\). Find \(P(3)\).
  3. Does there exist a quintic \(P\) such that \(P(0) = 0\), \(P(1) = 1\), \(P(2) = -2\), \(P(3) = 3\), \(P(4) = -4\), \(P(5) = 5\), and \(P(6) = -3\)?

Some Characteristic Polynomial Problems

Here's a cool application of characteristic polynomials. Let's say we have a sequence which is periodic with period \(p\). Our recurrence relation is \(a_{n+p} = a_n\), and this has characteristic polynomial \(\lambda^p - 1 = 0\). The roots of this are the $p$th roots of unity (honestly, that's pretty cool by itself), and that gives a really neat characterization of periodic functions: linear combinations of roots of unity. This has cool connections to number theory and group theory, but let's not stray that far from the main topic of this talk.

How about the following problem. You're given the sequence \(1, 2, 4, 8, 16, \dots, 2^n\). You're asked for the next element in the sequence, but for whatever reason (we all have our days), you don't realize that the sequence is powers of two. Instead, you choose to do a polynomial approximation. How far off are you?

Let's do some examples. We can call \(P(n)\) the value we're trying to find, and apply the extrapolation formula from last section:

  1. \(P(1)= 1\)
  2. \(P(2) = 2 \cdot 2 - 1 = 3\)
  3. \(P(3) = 3 \cdot 4 - 3 \cdot 2 + 1 = 7\)
  4. \(P(4) = 4 \cdot 8 - 6 \cdot 4 + 4 \cdot 2 - 1 = 15\)

In each case, \(P(n)\) is exactly one less than \(2^n\), so let's try to prove this in general. We have this shiny new hammer (ooh, characteristic polynomials… shiny!), so let's try to hit this problem over the head with it.

We know, from that polynomial extrapolation trick, that the value \(P(n)\) that we want is \[\sum_{k=0}^n (-1)^n \binom{n+1}{n-k} 2^{n-k}.\] If we subtract this from \(2^n\) we get \[2^{n+1} - \sum_{k=0}^{n+1} (-1)^k \binom{n+1}{k} 2^{n+1-k}.\] This (hopefully) looks like the binomial theorem; it is then equal to \((-1 + 2)^k = 1\), so we are always \(1\) under \(2^n\), exactly what we wanted to prove. It's a pretty slick, though opaque, proof.

Note

Let's see if there isn't a better way of doing that problem. Instead of using that extrapolation machinery from before, let's use finite differences. What are the finite differences of \(1, 2, 4, \dots, 2^n, 2^{n+1}-1\)? Why, they're \(1, 2, 4, \dots, 2^n-1\) (check!). So we can just use a very simple induction to prove that this is always correct, with our induction based on the trivial case of doing polynomial extrapolation of a single term.

Inhomogeneous Recurrence Relations

You know how you're only supposed to buy milk that's been homogenized? Well, the same applies to linear recurrence relations. Only solve homogeneous ones.

Well, sort of. Because there's actually a lot of cool things you can do with inhomogeneous ones as well. To start with, here's an example: \(a_{n+1} = a_n + d\). Now, we've seen this before. In fact, we solved it in the very beginning of the talk: it's an arithmetic series. But our shiny cool new toys can't solve this really trivial stupid problem.

What to do? Don't worry, There's a trick involved here! If \(a_{n+1} = a_n + d\), then \(a_{n+2} = a_{n+1} + d\), right? And that means that \(a_{n+2} - a_{n+1} = a_{n+1} - a_n\). We rearrange this and find the characteristic polynomial, which is \((\lambda-1)^2\). Then, we can just solve our recurrence the usual way. We know that \(a_n = An + B\), and from \(a_1 = a\), \(a_2 = a + d\), we see that \(A=a\), \(B=d\). Now, note that \(a_2 = a + d\) was not one of the initial conditions; we needed to use our recurrence to get that. In general, when solving inhomogeneous relations, you'll need to apply your recurrence several times to get extra "initial conditions".

What about a harder problem? Like this one: \[a_{n+2} = a_{n+1} + 3a_n + F_n\] where \(F_n\) is the $n$-th Fibonacci number. Well, how are we going to solve this? We can use a trick similar to the one we just used. Take the three relations

  • \(a_{n+4} = a_{n+3} + 3a_{n+2} + F_{n+2}\)
  • \(a_{n+3} = a_{n+2} + 3a_{n+1} + F_{n+1}\)
  • \(a_{n+2} = a_{n+1} + 3a_n + F_n\)

Now we can subtract the second and third equations from the first, canceling the Fibonacci terms (due to their recursive relation) and leaving us with \[a_{n+4} - 2a_{n+3} - 3a_{n+2} + 4a_{n+1} + 3a_n = 0,\] which we can solve in the usual way (the characteristic polynomial: \((\lambda^2 - \lambda - 1)(\lambda^2 - \lambda - 3)\)) I won't go through the actual solution here, it's ugly; suffice it to say that the roots are approximately \(1.6\), \(-.6\), \(-1.3\), and \(2.3\).

Now it's time to generalize! Let's say that the inhomogenizing term is some arbitrary thing, but it has a recurrence relation. So, basically, we have \[c_k a_{n+k} + c_{k-1} a_{n+k-1} + \dots + c_0 a_n = f(n),\] where \[d_m f(n+m) + d_{m-1} f(n+m) + \dots + d_0 f(n) = 0.\]

What do we do now? Well, we want to cancel all of the \(f\) terms, so we can just add the first equation times \(d_m\) and shifted over \(m\), times \(d_{m-1}\) shifted over \(m-1\), and so on. We get:

\begin{align} &d_m &(c_k a_{n+k+m} + c_{k-1} a_{n+k+m-1} + \dots + c_1 a_{n+m+1} + c_0 a_{n+m}) \\ +&d_{m-1} &(c_k a_{n+k+m-1} + c_{k-1} a_{n+k+m-2} + \dots + c_1 a_{n+m} + c_0 a_{n+m-1}) \\ +&\dots& \\ +&d_{1} &(c_k a_{n+k+1} + c_{k-1} a_{n+k} + \dots + c_1 a_{n+2} + c_0 a_{n+1}) \\ +&d_{0} &(c_k a_{n+k} + c_{k-1} a_{n+k-1} + \dots + c_1 a_{n+1} + c_0 a_{n}) \end{align}

Now, we regroup this by the \(a_n\) terms:

\begin{align*} &(d_m c_k)&a_{n+k+m} \\ +&(d_{m-1} c_k + d_m c_{k-1})&a_{n+k+m-1} \\ +&(d_{m-2} c_k + d_{m-1} c_{k-1} + d_m c_{k-2})&a_{n+k+m-2} \\ +&\dots&\\ +&(d_1 c_0 + d_0 c_1)&a_{n+1} \\ +&(d_0 c_0)&a_n \\ =&0& \end{align*}

Form the characteristic polynomial:

\[(d_m c_k) \lambda^{k+m} + (d_{m-1} c_k + d_m c_{k-1}) \lambda^{k+m-1} + \dots + (d_1 c_0 + d_0 c_1) \lambda + (d_0 c_0) = 0\]

Now, how is this related to the original recurrence relations? Ah, it's the product of the two smaller polynomials! Of, course, this shows up in the past examples as well, but it means we can actually solve inhomogeneous linear recurrence relations very easily (as long as the inhomogenizing term is also from a linear recurrence relation).

Problem Section 4

  1. Verify the formulae for sums of arithmetic and geometric series with inhomogeneous recurrence relations.
  2. What happens to the asymptotic growth rate of a sequence if into its recurrence relation is inserted itself as an inhomogenizing term?

Hmm… This Needs More Calculus

Doesn't everything? Sequences definitely do. Let's get some definitions down.

Now, if we have a function \(f(n)\), we can define the finite derivative \(\frac{\Delta f}{\Delta n}\) to be \(f(n+1)-f(n)\), that is, the finite differences of the sequence \(a_n = f(n)\). You'll note that this definition is somewhat similar to the calculus definition: \[\frac{df}{dx} = \lim_{h\to0}\frac{f(x+h) - f(x)}{h},\] but when we say \(\lim_{a \to b}\), we mean "get \(a\) as close as you can to \(b\) without getting there". And in the integers, the closest you can get is one away, so the normal derivative becomes the finite derivative.

Why is this useful? Well, there's a lot more to say on this subject than I will5 [5 See: http://www.stanford.edu/~dgleich/publications/finite-calculus.pdf], but in particular we want to use applications of this idea on our problem above, to actually explain why the answer is \(2^{n+1}-1\).

What is \(\frac{\Delta}{\Delta n} n\)? \(1\). That's good, that's just like regular calculus.

What about \(n^2\)? We have \((n+1)^2 - n^2 = 2n + 1\). Hmm. We'd like it to be \(2n\), so that it's similar to regular calculus. Note that instead, \(n^2 - n\)'s finite derivative is \(2n\). What about \(n^3\)? What's its replacement? Do we have to get these experimentally? Is there a general rule? There is. Consider \(n^{'k}\) = \(n (n-1) (n-2) (n-3) \dots (n-k+1)\). Prove for yourself that \(\frac{\Delta}{\Delta n} n^{'k} = k n^{k-1}\).

Now, I really want to get back to resolving our problem, but first we need another fact from finite calculus. I'll make an unqualified statement, since you should really read that paper I linked to before I give a proof: Taylor's theorem works in finite calculus (the proof is straightforward if you know the proof in regular calculus; it is, however, tedious). So, what does that mean? It means there's an operator \(T_k\), where \(T_k f(n)\) is defined as \[T_k f(n) = \sum_{l=0}^{\infty} \frac{(\frac{\Delta}{\Delta n} f)(k)}{l!} (n-k)^{'l},\] and that (provided certain conditions are met) \(T_k f(n)\) is equal to \(f(n)\) (at least, in a region near \(k\)). Now, simplifying the above a bit and setting \(k\) to be 0, we get \[T_k f(n) = \sum_{l=0}^\infty \binom{n}{l} \left(\frac{\Delta}{\Delta n} f\right)(0).\]

Isn't it cool that the binomial coefficients just popped out of nowhere? It's also crucial to our problem. As in regular Taylor's theorem, there's a sense in which \(T f(n)\), if you cut it down to the first \(k+1\) terms, is the best $k$th order polynomial approximation to \(f(n)\). So, to solve the above problem, we just have to consider the Taylor expansion of \(2^{n+1}\), with the $(n+2)$-th term chopped off.

What's the Taylor expansion of \(2^{n+1}\)? Since \(\frac{\Delta}{\Delta n} 2^n = 2^n\), and \(2^0 = 1\), it's \(\binom{n}{0} + \binom{n}{1} + \binom{n}{2} + \ldots\), which is indeed equal to \(2^n\). Now, this series cuts itself off at some point (at \(n\) terms), which is a very nice feature – you don't have to chop off infinitely many terms. Just one will do.

So, we have our polynomial approximation to \(2^{n+1}\): \[2^{n+1} \approx \sum_{k=0}^n \binom{n+1}{k}.\] Note that we're not adding in the last term here. But what is that last term? \(\binom{n+1}{n+1} = 1\), which is exactly why we're \(1\) off in our approximation. Now we, in a sense, know the reason for the theorem we've been proving. This actually also makes it very easy to extend our solution. What if we extend the approximation two terms ahead? Well, we're chopping off \(\binom{n+1}{n} + \binom{n+1}{n+1}\), or \(n+2\), so we'll be that far off. Extend \(1, 2, 4, 8, 16, 32\) two terms out, and you're going to get \(63\), then \(121\). Isn't that cool?

Deleted Footage, Blooper Reals, and Bonus Material

Note

This was material I didn't get to in the lecture but that was present in the lecture notes.

So, some of you may protest that though the math works, the reason that inhomogenous relations work out the way they do is mystical. So let's explain what really happens. Since your recurrence is linear, we can consider it the application of a matrix, like so: \[ \left[ \begin{array}{ccccc} -\frac{c_{k-1}}{c_k} & -\frac{c_{k-2}}{c_k} & \cdots & -\frac{c_{1}}{c_k} & -\frac{c_{0}}{c_k} \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots && \ddots && \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{array} \right] \left[ \begin{array}{c} a_{n+k-1} \\ a_{n+k-2} \\ a_{n+k-3} \\ \vdots \\ a_{n+1} \\ a_{n} \\ \end{array} \right] = \left[ \begin{array}{c} a_{n+k} \\ a_{n+k-1} \\ a_{n+k-2} \\ \vdots \\ a_{n+2} \\ a_{n+1} \\ \end{array} \right] \] Now, the basic assertion that we made here was that over the complex numbers (since our roots can be complex) this matrix is diagonalizable. Its eigenvalues are just the roots of our characteristic polynomial (convince yourself!). Now, what if we have our matrix multiplication, but then a matrix addition? That's what you get for inhomogeneous polynomials. Well, if the matrix you're adding is a linear recurrence itself, you can represent that by the recurrence's matrix operating in higher dimensions and then just offloading its results into the part that we're adding. And what would the eigenvalues of this big matrix be? Obviously, just the union of the eigenvalues (with associated multiplicities). Which gives you the overall polynomial being the product of the two smaller ones.

Now, no lecture would be complete without an open problem. In this case, the open problem is: say you have a linear recurrence \(c_k a_{n+k} + \dots + c_0 a_n = 0\). Is there an \(n\) such that \(a_n = 0\)? What we want is a decision procedure. Well, using the material we learned here, we can see that we reduce the problem to: given a linear combination of exponentials over the polynomials, can we determine whether the equation has a root? Go forth, solve it, and win the Fields medal. I dare you.

Now, if that's an open problem, I hope you'll agree that non-linear recurrences are just about impossible. Here's an example: the famous \(3n+1\) problem. Take a number. I'll use 17 here, for demonstration. Now, if this number is odd (it is) multiply it by 3 and add 1. OK, now I have 52. If the number is even (now it is) divide it by 2. OK, now 26. Then 13. Then 40. Then 20, 10, 5. Then 16, 8, 4, 2, 1, 4, 2, 1, 4, 2, 1, … The question is, do all numbers eventually reach that 4, 2, 1 cycle? Now, this recurrence is even almost linear — each of the two options at any point is linear, it's just that the thing overall is highly non-linear.

Where to Find More

For more on finite calculus, including a great tutorial, I'm again going to suggest Gleich's Tutorial.

On the subject of characteristic polynomials, there's a wonderful compilation of good problems (and much of the same material as here) available from the Berkeley math circle.

Wikipedia is, as always, your friend. Its article on recurrences is pretty good.

Solutions

Problem 1.1: Above, we saw that \(n^2\), a quadratic, had a linear sequence as its finite differences. Prove this for the general quadratic.

If the sequence is \(A n^2 + B n + C\), the finite differences are of the form \(A[(n+1)^2 - n^2] + B[(n+1) - n] + C[1 - 1] = (2A)n + (A + B)\), which is linear

Problem 1.2: What would be the corresponding theorem for cubics? Prove it.

We know that the cubic is of the form \(A n^3 + q(n)\), where \(q(n)\) is quadratic. Our finite differences are \(A [(n+1)^3 - n^{3}] + [q(n+1) - q(n)]\). The finite differences of the second term are linear and of the first are quadratic, so the overall differences are quadratic.

Problem 1.3: Extend the above to general $n$-th order polynomials.

Induct. When you expand \((n+1)^k - n^k\), we get a polynomial of degree \(k-1\); induct downward on \(k\).

Problem 2.1: The "Tribonacci" sequence is defined by \(T_{n+3} = T_{n+2} + T_{n+1} + T_n\) and starting values \(T_1 = T_2 = T_3 = 1\). What is the smallest \(n\) for which \(T_n\) is over 9000? Over \(10^{10}\)? A calculator might be helpful, but don't just brute force the answer.

I warn you, this problem is somewhat hard. Let's begin. We first find the characteristic polynomial, \(P(\lambda) = \lambda^3 - \lambda^2 - \lambda - 1\). Now we need to find roots, but this polynomial does not factor in any nice way. What you can do, however, is estimate. \(P(1) = 1 - 1 - 1 - 1 = -2\), and \(P(2) = 8 - 4 - 2 - 1 = 1\), so we know that our answer is close to 2. We can check \(1.8\) or so, and we find that \(P(1.8)\approx-.3\). Whatever, that's close enough. Now, this \(1.8^n\) term has some coefficient in front of it. I wonder what it is… The sequence starts out 1, 1, 1, 3, 5, 9, 17, 31, so \(A 1.8^8 = 31\). This implies that \(A\approx\frac18\), or that we're about 3 terms behind. Now we just need to find \(n\) such that \(1.8^n=9000\). Well, we can take logs in our head (right?), and we know that \(\log 2 = .693\), \(\log 3 = 1.1\), \(\log 10 = 2.3\), so we know that \(\log 9000 = 6.9 + 1.1 + 1.1 = 9.1\) and that \(\log 1.8 = 2*1.1 + .693 - 2.4 = .59\). We divide to get \(n \approx 9.1/.59 = 91/5.9 \approx 15\). Now, we add the extra three terms to get 18, which is our final answer.

For \(10^{10}\), you can calculate an answer of 41 with the same method. What if I asked for \(10^{100}\)? Well, you'd have to be a bit careful about your math, but you can calculate it to be the 381th term.

Problem 2.2: Given three initial values, what does the sequence \(a_{n+3} = 3a_{n+2} - 3a_{n+1} + a_n\) represent?

Quadratic approximation is the answer. Proof of this follows in the next section.

Problem 3.1: Find a recursive definition for the sequence whose closed form is \(a_n = (n^2 + 1) 2^n + 1\).

Let's abstract that closed form a bit: \((An^2 + Bn + C)2^n + D 1^n\). You should recognize this as being the result of a characteristic polynomial: \((\lambda - 2)^3(\lambda - 1)\). You can multiply this out to get \(\lambda^4 - 7\lambda^3 + 18\lambda^2 - 20\lambda + 1\). Finally, we can get from this the actual recurrence: \(a_{n+4} = 7a_{n+3} - 18a_{n+2} + 20a_{n+1} + a_n\). Note that the exponents in our characteristic polynomial could be bigger — there would be associated coefficients in our closed form, but we'd just set them to 0. In other words, every sequence satisfies infinitely many recurrence relations.

Problem 3.2: A 3rd-order polynomial \(P\) has the property that \(P(1) = 1\), \(P(2) = 18\), \(P(4) = 17\), and \(P(5) = 23\). Find \(P(3)\).

Let's say that \(P(3) = x\). Then we can use our cool polynomial extrapolation formula: \(23 = 4 \cdot 17 - 6x + 4 \cdot 18 - 1\), or \(116 = 6x\), giving you the final answer of \(x = \frac{58}{3}\).

Problem 3.3: Does there exist a quintic \(P\) such that \(P(0) = 0\), \(P(1) = 1\), \(P(2) = -2\), \(P(3) = 3\), \(P(4) = -4\), \(P(5) = 5\), and \(P(6) = -3\)?

Using our polynomial extrapolation formula, we see that \(-3 =6\cdot5 + 15\cdot4 + 20\cdot3 + 15\cdot2 + 6\cdot1 - 1\cdot0\), clearly impossible. Or, you could note that the intermediate value theorem would require our polynomial to have 6 roots, clearly impossible if it were quintic.

Problem 4.1: Verify the formulae for sums of arithmetic and geometric series with inhomogeneous recurrence relations.

The crucial idea is to consider the recurrence relation \(a_{n+1} = a_n + f(n)\). Here \((a)_n\) acts as a sort of accumulator for the function \(f(n)\). And if \(f(n)\) is recursively defined (both arithmetic and geometric sequences are, as shown above), this gives us a way of find the characteristic polynomial for the sequence of sums: multiply by \(\lambda-1\). That gives us the overall formula \(An^2 + Bn + C\), with \(a_0 = a\), \(a_1 = a + (a + d) = 2a + d\), and \(a_2 = 2a + d + (a + 2d) = 3a + 3d\) (note: we've set our sequence to start at 0 here. This simplifies things). Now we have that \(C = a\), \(A + B + C = 2a + d\), \(4A + 2B + C = 3a + 3d\), which gives \(2A = d\), \(B = a + \frac12 d\). This leaves you with \(a_n = a n + d\frac{n(n+1)}2\). I leave it to you to check that this is the same as we derived at the start of the talk.

For the geometric series, we have \(a_{n+1} = a_n + f(n)\), where \(f(n) = a r^n\). Now, the characteristic polynomial for \(f(n)\) is \(\lambda - r\). If we assume that \(r\) is not equal to 1 (otherwise, we're looking at an arithmetic sequence), the characteristic polynomial for \(a_n\) must be \((\lambda - 1)(\lambda - r)\), and so your sequence is \(A r^n + B\) (the \(r=1\) case must be special cased, for then we'd need to \(Ar^n + B\) but \(Anr^n + B\)). Well, \(a_0 = a\) and \(a_1 = ar\), so we have \(A + B = a\), \(Ar + B = a + ar\), giving \(A(r-1) = ar\) and \(B(r-1)=-a\), thus leaving us with \(a \frac{r^{n+1}-1}{r-1}\), which we did indeed have before.

Problem 4.2: What happens to the asymptotic growth rate of a sequence if into its recurrence relation is inserted itself as an inhomogenizing term?

Well, every term asymptotically equal to \(n^k \lambda^n\) gets twice the roots, giving \(n^{2k+1} \lambda^n\). So in general it doubles the polynomial power in front of the largest exponent by two and then multiplies by \(n\)

Footnotes:

1

Even very simple non-linear recurrences can have crazy behavior. For example, \(a_{n+1} = 3.5 a_n (1 - a_n)\) has chaotic behavior — try it with \(a_1 = 1/3\) and \(a_1 = 1.0000001/3\) and watch the solutions diverge completely. This is the so-called butterfly effect: errors grow unboundedly in most non-linear systems, an effect which does not occur in linear systems.

2

The technical, German, term is "ansatz".

3

As it turns out (check!), this is \(-1/\phi\).

4

You can prove it using the matrix idea at the end of this page and knowledge of matrix exponentiation.