Pavel Panchekha


Share under CC-BY-SA.

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.

Set Theory (or, Mathematics Reloaded)


This is just a quick sketch of the class, and will be filled in likely only if the class it taught again.

Logic; Why Axiomatize?

Paradoxes and the Problems of Formal Systems

Intuitive Set Theory

We want to axiomatize this notion of "set" that encompasses all of the properties we want these objects to have. So our first go at this might look like the following:

  1. Two sets are the same if an element is in one if and only if it is in the other.
  2. There exists a set with no elements, and for any object \(x\) there exist the sets \(\{x\}\) and \(\{x, y\}\).
  3. For any predicate \(\phi(x)\), there exists the set consisting of all objects that satisfy that predicate.
  4. There exists the power set of any set.
  5. There exists the union of any set of sets.
  6. For any set \(A\), there exists a set containing exactly one element of each of its subsets. (That is, we can pick one element each out of any number of sets)

This is quite a good theory of sets, but suffers from a few problems. First of all, we still fall prey to Russel's paradox, for there is nothing in our theory preventing its application. So the first thing we have to do is to fix that. If you think about the paradox, the problem is that we are testing our predicate, \(x \not\in x\), on sets that we do not yet know exist. So what we're going to do is only allow us to test our predicate in Axiom 3 on sets we know exist. Here's one way of doing that:

  1. For any predicate \(\phi(x)\) and set \(S\), there exists the subset of \(S\) consisting of all objects that satisfy that predicate.

Thus, when we try to apply our Russel's predicate to all sets, we fail, because we cannot prove the existence of the set of all sets first (in fact, we can disprove it).

So now we have a nice and consistent theory (there is a beautiful theorem stating that we cannot use this system to prove its own consistency, unless of course it is not consistent and everything is provable). But there is a glaring flaw when we try to use this system for all of math! The thing is, any proof in mathematics has to be finite — it's a bit cheap, after all, to prove something in an infinite number of steps. But the only methods we have of creating bigger sets are by taking power sets and unions. We start with sets of size 0, 1, and 2, and both power sets and unions take finite sets to finite sets. So our theory cannot ever construct an infinite sets. Not, of course, that infinite sets cannot exist. But neither can we prove that they do. And how can we have a mathematics if we can only ever have a finite number of elements?! So we want to make a further improvement to our theory, another axiom, giving us an infinite set:

  1. There exists a set, \(I\), that has the null set as an element and also contains, for any element \(x\), the element \(x \cup \{x\}\).

This is the classic recursive definition: we have a base case (the null set) and the inductive step, wherein we form, from the largest element, a yet larger element. Let's prove that \(I\) contains infinitely many elements. Suppose it did not, and instead it has elements \(\emptyset, x_1, x_2, \dots\), and let's use the abbreviation \(S x\) for \(x \cup \{x\}\). Consider \(S \emptyset, S x_1, S x_2, \dots\). All of these, by our definition of \(I\), are elements of \(I\), and there are as many of these as in our set. Thus, each \(x_i\) must be some \(S x_j\). But this is false, since \(x_j\cup\{x_j\}\) contains at least one element (\(x_j\)), and thus is not the empty set. Thus, we cannot have a finite number of elements.

There's a few more aesthetic elements that we'd like to fix in our set theory. One is that, thanks to Russel's paradox, we're a bit scared of sets that contain themselves. So, we're just going to ban it. We're going to slightly generalize that a bit:

  1. Every (nonempty) set has an element disjoint from it.

You can see how this forces there not be sets that are members of themselves: suppose \(A\) were a member of itself. Then \(B = \{A\}\) would have an element disjoint from it. But its only element is \(A\), and since \(A\) is its own element, it is not disjoint from \(B\) and our axiom fails.

Now, with this axiom, we can go back to the point about the set of all sets. That set would (obviously) contain itself, so it cannot exist. But then how do we talk about "all sets"? Later, if we have time, we'll talk about "classes", which allow us to answer this question.

Finally, we want to talk functions. But as we saw before, we can't just claim to make a set by gathering a bunch of things; after all, we gathered all sets together, but it turned out not to be a set. This upsets our intuitive notion of what a set is, but it is necessary to avoid Russel's paradox (and friends). So to talk about functions, we're going to add another axiom:

  1. Given a "function" \(f\) and a domain set \(A\), there exists a set containing \(f x\) for all \(x\) in \(A\).

Note that this set is not the image, but perhaps its superset. We can use our axiom 3 to restrict it to only the image if we need too.

Finally, we are done. We have constructed a set theory just how we like it, sufficient for all of math. This is called Zermelo-Fraenkel set theory; the axioms we'll list in a second are somewhat different from what we made above, but they are how the axioms were originally stated.

Zermelo-Fraenkel Set Theory

Zermelo-Fraenkel Set Theory (often abbreviated ZFC) is the standard axiomatization of set theory in mathematics. Its axioms:

Two sets are equal if they contain the same elements
Every nonempty set has an element disjoint from it.
Given a predicate and a set, there exists the subset of elements that satisfy that predicate.
Given two sets, there exists the set containing them.
There exists the union of any set of sets.
For any set and a function on it, the image of the set under that function exists.
Let \(S x\) abbreviate \(x \cup \{x\}\). Then there exists a set containing the null set and, for any element, its successor \(S x\).
Power Set
There exists the power set of any set.
There exists some "order" for any set. That is, there exists some predicate \(R\) that orders our set (there is a very simple definition of order) and that gives any subset a minimal element. This is equivalent to our axiom 6 above.

Now, these may seem simple and obvious, but they all contain some interesting subtleties. Look, for example, at the axiom of Extensionality. Simple, right? Just defines equality. But it tells us something crucial: everything is a set. For suppose we had something that was not a set, some \(x\). Then it contains no elements, so it contains the same elements as the null set; and thus it is the null set. In this way, we know that everything we talk about must be a set.

Or look at the axiom of Pairing. Not that, unlike in our system, we didn't axiomatically require the existence of the set \(\{a\}\). Why not? Because the axiom of extensionality tells us that the pair \(\{a, a\}\) is the same thing. Or consider ordered pairs. How can we represent \((a, b)\)? Well, we can't just use \(\{a, b\}\), because that doesn't give us order. So what we do is, we make one set \(\{a, b\}\), and another to tell use which is first: \(\{a\}\). So our final representation is \(\{\{a\}, \{a, b\}\}\) (note: the axiom of regularity actually allows us to simplify this to \(\{a, \{a, b\}\}\). Think about why.).

There are also so many cool things in the well-ordering, or power-set, or infinity axioms. The axioms look trivial. They are not.

There's another thing to note, that the axioms of replacement and specification are given in terms of a predicate. This makes them statements of "second order logic". This is generally bad (at least, aesthetically unpleasing), so you will hear them called the axiom schema of replacement and specification — we think of them as a template for an infinite number of axioms, one for each predicate.

More on the Axiom of Choice or Well-Ordering

The Axiom of Choice is what we called axiom 6 in our set theory that we made. ZFC is commonly stated as requiring a different axiom, the Well-ordering axiom; the two are equivalent, as we will prove later.

Some Basic Proofs

First, can we show that the intersection of two sets is a set? Sure! Suppose we have two sets, \(A\) and \(B\). Then, the predicate \(\phi x = x \in B\) can be used in the axiom of specification on \(A\). This will give all elements of \(A\) that are also elements of \(B\), which is what we want.

What about that whole well-ordering being equivalent to our axiom 6 above? Well, if we can well-order any set, then we can well-order the union of all of the sets we want to draw elements from. And then we can, from each set, pick the minimal element (according to our order). So this shows that well-ordering implies our axiom 6. To get the other direction, consider a set where the first element is our original set, and then for every set, that set minus the element we choose from it is the next element (of course, the elements are unordered, the words "next" and "first" are there simply to aid intuition). Clearly, our choices will define an order on the elements, and we can always find a minimal element by taking the union of the sets each element was chosen from, and then the element chosen from that. So we have that well-ordering is equivalent to our axiom 6.

Now for a very beautiful results that is based on the axiom of the Power Set: Cantor's diagonalization. We should note that there are three theorems which go by this name: this one, this one but less general, and then another proof also attributable to Cantor which has nothing to do with these two. So we warned when someone is talking about Cantor's daigonalization that they may mean something entirely different.

So, the basic question Cantor asked is, what does it mean for two sets to have the same size? I mean, we can count finite sets, sure, you know, a set of seven elements is the same size as a set of seven elements. No problem. But Cantor was brilliant enough to see that this silly question actually needed asking, and actually contained beautiful, beautiful mathematics behind it. His attempt at an answer was, two sets are the same size if they can be paired up. So you know how you lay out forks and knives at dinner; how do you make sure you didn't miss one? It's probably not by counting both and ensuring that there are the same number; instead, you just check that each fork has an associated knife nearby.

Now, this pretty clearly is the same as our idea for finite sets, but Cantor had yet another moment of genius: what happens if we apply this to infinite sets. At this point things get crazy, because if you consider the even integers, and all of the integers, you can pair them one to one: pair \(n\) with \(2n\). So there are as many integers as even integers… But this is clearly bollocks, as there are obviously more integers than just the even ones! This is actually a defining characteristic of infinite sets: they are the same size as some of their subsets. So now Cantor set about to prove that all infinite sets are the same size, which would be rather pretty, since that would give us a very nice set of "sizes" for sets: 0, 1, etc., infinity. And this is where the plot thickens, because as it turned out, there are different sizes of infinite sets! For example, it is provable that the integers and the reals have different "cardinality", or size! We'll prove a more general variant of that now.

Theorem: A set and its power set are never of the same cardinality.

This is clearly true for finite sets, since the power set of a set of \(n\) elements has \(2^n\) elements, and that is always bigger. But for an infinite set? Well, what we want to do is to suppose that they are of the same size. Then, we can define some mapping \(f : S \to 2^S\) (where \(2^S\) represents the power set of the set \(S\)) which establishes this pairing; we claim that \(f(S) = 2^S\), that we don't "miss" any element. What we will do now is to construct an element that \(f(S)\) cannot contain that is in \(2^S\); this would be an element that we "miss", demonstrating that we have not actually paired off the elements. That element will be constructed very simply. Remember that this is an element of \(2^S\), so we have to pick which elements of \(S\) to include. What we will do is, we will include an element \(x\) if and only if it is not included in \(f(x)\). This new set \(y\) we constructed thus differs from any \(f(x)\) in that the two disagree on the inclusion of \(x\)! This didn't use any knowledge of the mapping though! This works for any mapping! So that means that no mapping can pair \(S\) and \(2^S\), and so we must have different sizes of infinity.

And now to specialize to the case of the natural numbers and the real numbers, we can consider a subset of the integers to be a real number just by considering it to represent which binary bits are "on". So \(\{1, 4, 7\}\) would correspond to \(\frac12 + \frac1{16} + \frac1{128}\). This shows that there are more reals between 0 and 1 than integers; you can show (pretty easily; do it) that there are as many reals as there are reals between 0 and 1.


If you want more on Cantor's theory, cardinality, and things of that nature, hit up your friendly neighborhood wikipedia for topics like "Cantor's diagonalization", "Cardinal numbers", or "Ordinal numbers" and follow the links from there.

Constructing the Rest of Math, Natural Numbers First

We originally came here to construct all of math out of set theory. So let's begin. First, we will construct the natural numbers. Now, the natural numbers are defined by this set of axioms called the Peano axioms; we will have to find a way of calling some sets "numbers" so that these sets satisfy these axioms. The axioms:

  1. There exists an equivalence relation, \(a = b\), on numbers.
  2. There exists a successor function, \(S n\), so that for any number \(n\), \(S n\) is also a number.
  3. There is a number \(0\), and it is not the successor of any number; every number will be it's $n$-th successor, for some \(n\).
  4. The successor function is one-to-one; that is, if \(S n = S m\), then \(n = m\).
  5. Finally, the most crucial of the properties: any set \(K\) that contains \(0\) and contains, for every number \(n\), \(S n\), contains all numbers.

The last axiom is so crucial because it allows us to do proofs by induction; in fact, it is usually called the inductive axiom.

So, how are we going to construct these numbers? Well, since we want an infinite set, we know that we are going to use the axiom of infinity, so let's adopt the same function \(S\) as there: \(S n = n \cup \{n\}\), and \(0 = \emptyset\). Since we want the successor function to correspond to addition of one, we have \(0 = \{\}\), \(1 = \{0\}\), \(2 = \{0, 1\}\), and so one, each integer being the set of smaller integers.

Now, let's go through the axioms. The first, of equality, is satisfied by that observation above: a number is the set of smaller numbers. This means that we can use normal set equality to compare numbers.

The axiom of infinity gives us a set that is a superset of the integers; but some subset of the set it constructs will have the properties of axioms 2 and 3.

Axiom 4 is satisfied, because if \(S n = S m\) for different \(m\) and \(n\), \(S n\) would have to contain both \(m\) and \(n\) as elements; thus, both \(m \subseteq n\) and \(n \subseteq m\), and the sets have to be equal.

Finally, the induction axiom. To prove this, we need to get a bit more hands on with the axiom of infinity. First, we'll call a set "inductive" if it contains the null set and the successor \(S x\) for every member \(x\). Now, an inductive set need not just be the integers; it could also contain other "garbage" along with the integers. What we want is the intersection of all inductive sets (since they all must contain the integers). We will write this, to stay entirely within our system, as "a set \(n\) is in the set of natural numbers if there does not exist an inductive set of which it is not a member". There can only be one of these sets, since if there were two, each would have to be a subset of the other (and thus they would be equal). Finally, this gives us induction. The set \(K\) the axiom requires is a subset of the natural numbers, but the naturals are very clearly a subset of them, so the two sets must be the same.

The Integers

The Rationals

The Reals

The Complex

Our final bootstrap into a nice, complete mathematical field is to create the complex numbers; this is done rather simply, defining a complex number \(z\) to be the ordered pair \((a, b)\), where addition is defined by \((a, b) + (a', b') = (a + a', b + b')\) and multiplication by \((a, b)(a', b') = (aa' - bb', ab' + a'b)\). It is tedious but completely trivial to check that all of the properties we would like of these operations to have (commutativity, associativity, having inverses, distributing) they indeed do. To show that we can solve any polynomial equation (which is our ultimate goal for complex numbers) can be done in many ways; this is the Fundamental Theorem of Algebra. The proof of this on Wikipedia is actually rather technical (and beautiful. But technical). If you want a better one, email me and I'd be glad to provide one, but I don't know what the interest is.


Now we are done! We have our mathematics! Well… sort of. We certainly have enough math to start actually discovering new and exciting things: analysis lays just at our fingertips. With the same techniques as demonstrated here, we can just as easily continue to modern group theory, algebra in general, topology, or any of an infinite number of other field. Just about all of math is definable with the axioms made here. (Some fields, like category theory, require a notion called a "class", which informally is a set that's too big to be a set; we'll discuss those below, and they require a different, but very very similar, axiom system such as NBG.) So what follows are some other interesting set-theoretic topics.

Mathematical Logic