Functions and Relations in First Order Logic
In a recent conversation, James enlightened me about the connection between functions and relations in first order logic. In this short post I'd like to share the insight.
Functions, of course, are a certain kind of relation—a total, functional relation. In set theory we directly define functions that way, and in any first order logic we think of a function f(x)
and the relation f(x) = y
as representing the same thing. Yet there is a definite difference between the two: syntactically, they inhabit different slots, and semantically, functions are a lot easier to deal with in most cases (since you can compute an output without having to check side conditions). So what is the relationship between the two?
From functions to relations
Say you have some first-order logic formula, in prenex form (quantifiers out front), so that it looks like Q σ, φ[σ]
, where Q
is some set of nested quantifiers, σ
is the list of variables those quantifiers quantify over, and φ[σ]
is the quantifier-free part of the formula. Now let's suppose that φ
calls f
somewhere in its statement, so that φ[σ]
is actually φ[f(γ[σ]), σ]
. How can we rewrite the formula to have equivalent meaning, using the relation form f(x) = y
instead of f(x)
?
The key insight is that we introduce a fresh variable for the "output" of the function. We introduce it existentially; this is safe because such a value also exists, by the fact that the relation is total. So we can rewrite:
Q σ, φ[f(γ[σ]), σ] ⇝ Q σ ∃ x, f(γ[σ]) = x ∧ φ[x, σ]
You can do this rewriting once for every function invocation and end up with a logical formula, equivalent to the original, that only uses functions in their relation form. At this point, you can replace every function symbol with a true relation whose totality and functionality are added as axioms.
From relations to functions
This reduction from functions to relations makes functions seem uninteresting: just relations with some axioms; why not other axioms or something? Why are functions so interesting and useful? How do those axioms become so useful?
It turns out (this is the insightful part that James explained to me) that you can use skolemization1 [1 Note that normally in first order logic, skolemization refers to adding a new function symbol, not quantifying over a new function. Most of the same reasoning would apply as in this post, but you would lose the direct appearance of a meta-theoretic function, so I am using this higher-order form of skolemization instead.] to go back from relations to functions. The idea here is that any time you have a formula like this:
∀ x : A, ∃ y : B, φ[x, y]
You could rewrite it in such as way as to move the ∃
to the outside:
∃ f : A → B, ∀ x : A, φ[x, f(x)]
By repeatedly doing this process, you can turn the prenex form formula into a formula with a single quantifier alternation (∃ ∀) as long as you were willing to transition into a higher-order logic that had a notion of functions. Applying this trick to the relationized function from the previous sections, we get:
Q σ ∃ x, f(γ[σ]) = x ∧ φ[x, σ] ⇝ ∃ f' Q σ, f(γ[σ]) = f'(σ) ∧ φ[f'(σ), σ]
Now it is clear that if f
is a function, in a sense corresponding to the higher-order notion of function (which, if you have the usual function axioms and are working in a normal logic, is guaranteed), then such an f'
must exist—it is f'(σ) = f(γ[σ])
. And with this definition the first conjunct is trivially true, so we can remove both the ∃
quantifier from the front as well as the first conjunct, and expand f'
to rewrite
∃ f' Q σ, f(γ[σ]) = f'(σ) ∧ φ[f'(σ), σ] ⇝ Q σ, φ[f(γ[σ]), σ]
which is right where we started.
Conclusion
Functions are relations, but they gain additional power for a "spooky correspondence": a function term in the theory corresponds2 [2 Usually?] to the meta-theoretic notion of a function, and in particular the one that arises from using skolemization to turn quantifier alternation into higher order logic. This intuition helps clarify, I think. The intuition might also be useful in thinking about what is achievable with linear meta-theories or other meta-theories with restricted notions of function.
Edit: Thanks to James for reminding me that it is called “skolemization” and pointing out that I am using a slightly weird higher-order form.
Footnotes:
Note that normally in first order logic, skolemization refers to adding a new function symbol, not quantifying over a new function. Most of the same reasoning would apply as in this post, but you would lose the direct appearance of a meta-theoretic function, so I am using this higher-order form of skolemization instead.
Usually?