What is a Model?
I’ve recently become interested in model theory, and as I’ve started reading about it, one question in particular has been eating on me: why do we need to do model theory at all? This series of posts will answer that question, hopefully saving others the trouble of figuring it out for themselves. This post describes the motivation for model theory and the language these posts will model.
What is logic?
In mathematics we often talk about logic as being a set of “obviously true” rules for how to reason. But if we want to talk about logic as a mathematical object, we’ll need a better definition. Let’s discard any philosophy of math, and go for what appears to be an unsatisfying definition: logic is a set of rules for manipulating symbols. Traditionally, these are separated into two parts: the signature, which describes the symbols; and the axioms, which describe the rules.
Now, this definition might be a bit unsatisfying, since it sounds like we have broad latitude in picking the symbols and the rules, and yet most things we call logic tend to work in more-or-less the same way. And, of course, it doesn’t match the way we think about logic: we don’t treat logic as symbol-manipulation, but as a way to get at something like “truth”. But logic as mindless symbol manipulation is a useful fiction; we will add back notions of truth by talking about models in just a bit.
Here’s an example of a logic; it’ll be the running example throughout these four posts. It contains strings of symbols such as
∀ x, (x → ⊤) (∀ x, x) → ⊥ (∀ x, (y → x)) ∧ y → ⊥ ⊤ → ⊥ (x → ⊥) ∨ (⊤ → x)
Note that some of these are “true” and some are “false”. We aren’t actually thinking about truth values. Instead, we are looking at these terms as just strings of symbols.
We can also transform one string of symbols into another. For example, we allow ourselves to transform x
into x ∨ ⊥
, or transform ⊤
into x → x
. It’s tempting to think of these rules as “axioms” or “true inference rules”, but it’s best to avoid that here and think about them as completely formal symbol manipulation.
What makes a logic useful?
The current definition of a logic is unsatisfying because it’s missing some element of “meaning” or “truth” behind the formal symbols. After all, if logic really were nothing more than a formal game, you’d expect people to do logic only for fun; yet somehow they seem to do it because it’s useful for some other purpose. So what can make a logic (this set of strings and string transformations) useful?
Model theory takes the following philosophical stance: logic is useful when it describes “real” objects. “Real” is in quotes, because often these objects are just mathematical abstractions; they’re real only in that they have more inherent substance than symbol manipulation. The object is called a model, though sometimes you think of the logic as modeling the model, not the model modeling the logic.
But if we’re just trying to describe a real-world object, why not talk about it directly instead of trying to go indirectly through a logic? One answer is that the logic is often simpler to talk about than the object itself, but another and truer reason is that a logic allows us to reason abstractly about many real-world situations at once. That is, a logic can have multiple models, and the logic then abstracts over their differences.
For an example, you can think of arithmetic as a set of meaningless symbol-manipulation. This symbol-manipulation involves strings like 3 + 5
, and rules that allow us to transform 3 + 5
into 8
. This symbol-manipulation is useful because those symbol manipulations somehow describe the behavior of real-world objects, like apples and dollars and groups of people. Every word problem is an exercise in model theory!
But by abstracting all of the various examples of combinations of three things and describing them by the symbol 3
, we can talk about all examples of three-ness at once. So all examples anywhere in the world of three-ness, five-ness, and addition, which satisfy the rules we have adopted for transforming strings, will also validate the claim that the three-thing and the five-thing added together are an eight-thing (whatever that may be).
A good recent example of inventing a logic to abstract over a model is Homotopy Type Theory, which describes a way of talking about mathematical structures called homotopy types using a logic that looks a lot like dependent type theory. Using the logic is much simpler (in many cases) that talking about homotopy types directly, since homotopy types are complex and abstract, while the logic is much more concrete.
Models for studying logics
It sounds like logics are meant for describing models. But often we instead use models to study logics. For example, to show that a certain theorem is unprovable in a logic, one can find a model of that logic where the theorem is false. Since all models have all the properties that can be proven in the logic (by definition of a model), the logic cannot possibly prove something that’s false in a model. This is a much easier way of establishing a theorem’s unprovability than talking directly about symbol-manipulation. To prove it directly, you’d have to show that any possible sequence of applications of the rules wouldn’t work, and that’s hard if the rules interact in interesting ways.
What gives this approach its power is the ability to play different models off against each other. Often, one invents a logic for describing one particular model, and then investigates the differences between the original model and any other models for that logic. This can yield intuition for the logic.
This series of posts will lead up to a particular example of this phenomenon. We will focus on the logic above, where we pick a set of rules mirroring constructive, or intuitionistic logic—there is no axiom along the lines of ∀ x, x ∨ ¬ x
. If you think of propositions as having boolean truth values, the proposition
∀ x, (x → ⊥) ∨ (⊤ → x)
appears to be true, since ⊤ → true
and false → ⊥
.. But there’s no obvious way to prove it in our logic. To resolve this mystery, we’ll show that there is a model of our logic where this proposition is not true. So we’ll use model theory to establish that a certain proposition is not provable.
As a proof of concept for myself, the next three posts will be formalized in Coq, and I’ll take time to explain the Coq formalization. This is just a sanity check, since we won’t be doing particularly complex things in Coq, but model theory can get very complex for bigger logics, and having a proof assistant quickly becomes handy.