WIP: Refinement in Mathematics
This post is a little different from the usual one. I want to talk about a pattern that I see happen often in mathematics, that of refining propositions to objects. I'm sure that, like other patterns in mathematics, it can be formalized and studied separately, but I'm not enough a mathematician to do that. If you have ideas on a more abstract view on what I'm calling refinement, please write. I'll do the best I can, formalizing refinement in a category-theoretic framework
Refining divisibility
By “refinement” I mean going from a proposition we want to study, to an object that represents the possible “obstructions” to that proposition being true. Then by studying the possible obstructions, we learn more about when the proposition is true. This is a bit vague, so let's consider an example.
Consider the question of whether one number is divisible by another; in particular, let's look at divisibility by seven. What numbers are divisible and not divisible by seven? Well, this is of course a question we can answer—multiples of seven are divisible by it, others are not.
But so far, we are just thinking about whether individual numbers are divisible by seven. Next, we would like to expand our knowledge by understanding how relations between numbers are reflected in relations between their divisibility by seven.
For example, one relation between numbers is that of product—we say one number is the product of two numbers. And a relation exists between their divisibility that matches the relation of product: if we know whether \(a\) and \(b\) are divisible by seven, we also know whether \(a b\) is divisible by seven. We might say that the relation of product maps to the relation of “or” between truth values.
On the other hand, the relation of sum doesn't map cleanly into a relation between divisibilities. That is, if we know whether \(a\) and \(b\) are divisible by seven, we don't know whether \(a + b\) is, at least in the case where \(a\) and \(b\) are not divisible by seven.
To be able to answer questions about the divisibility of sums of numbers, we instead must discover a new proposition to study: modular residue. So instead of talking about a single proposition (whether or not a number is divisible by seven), we talk about seven propositions (whether it has residue 0, 1, 2, 3, 4, 5, or 6). If we know the mod-7 residues of two numbers, we also know the mod-7 residue of their sum (by the rules of modular arithmetic), and knowing the mod-7 residue of a number, we know whether that number is divisible by seven.
Thinking with categories
Skip this section if category theory is not your cup of tea.
In a category-theoretic framework, we would say that divisibility by seven is a function from the set of integers to the set of propositions, or truth values: \(\mathsf{divisibility}_{Set} : \mathbb{Z} \to_{Set} 2\). But the integers are not just a set. In fact, the numbers are a monoid, a ring, an inductive structure, and so much more! So the question is, does \(\mathsf{divisibility}_Set\) extend to some function from the monoid of integers to a monoid of propositions, or from the ring of integers to the ring of propositions?
For example, if we know whether two numbers are divisible by seven, we also know whether their product is. The integers with a product operation are a monoid, and since the operations of the monoid map to operations on their divisibilities, we know that the divisibility of numbers by seven also forms a monoid. This “divisibility monoid” has false as its identity (1, the multiplicative identity, is not divisible by seven) and “or” as its product. So we have a function \(\mathsf{divisibility}_{Monoid} : \mathbb{Z} \to_{Monoid} 2\), where \(2\) is now the monoid with identity \(0\) and product \(\lor\). And this function extends \(\mathsf{divisibility}_{Set}\), because the forgetful functor from monoids to sets, which maps the monoid of integers to the set of integers, and the monoid of truth values to the set of values also maps \(\mathsf{divisibility}_{Monoid}\) to \(\mathsf{divisibility}_{Set}\).
Furthermore, integers can be added, and form a ring under addition and multiplication. But now there is not a function between rings \(? : \mathbb{Z} \to_{Ring} 2\), because addition of two integers doesn't map cleanly to an operation on divisibilities. Instead, the best we can do is a function \(\mathsf{divisbility}_{Ring} : \mathbb{Z} \to_{Ring} \mathbb{Z}/(7)\). We still have a functor that will map the ring of integers to the set of integers and map \(\mathbb{Z}/(7)\) to the set of truth values, which maps \(\mathsf{divisibility}_{Ring}\) to \(\mathsf{divisibility}_{Set}\), so we have again extended divisibility by seven to handle the ring structure of the integers. But now we have had to refine the notion of divisibility, because we now get one of seven answers when we ask about divisibility. Which tells us that back in the land of \(Set\), the \(\mathsf{divisbility}_{Set}\) factors into two maps, one in \(\mathbb{Z} \to_{Set} 7\) and another in \(7 \to_{Set} 2\). The first map extends beyond just \(Set\), and has analogs in \(Ring\), while the second exists in \(Monoid\) and \(Set\) but not in \(Ring\).
In other words, I am asking: suppose we have a collection of concrete categories, each of which have an analog of some object \(X\) (above, it was \(\mathbb{R}\)). That is, each category \(C\) has an object \(X_C\) such that the forgetful functor \(F_C\) maps \(X_C\) to \(X\). And also suppose a map \(p : X \to_{Set} \Omega\), which is the mapping we'd like to refine. For each category, I would like to identify the universal object \(Y_C\) and maps \(f : X_C \to_C Y_C\) and \(g : F_C Y_C \to \Omega\) such that \(p = (F_C f) g\). Such a universal object ought to always exist; for example, \(Y_C = X_C\) has the necessary maps, and only lacks the universality condition.1 [1 I don't know enough category theory to know if this is actually sufficient. Help?]
If these universal objects can be identified, they should have the same structure as the collection of categories itself (which are structured by factorization of their forgetful functors). Could this allow studying the structures that can be built on a category, by studying the structures that exist in the category?
From this category-theoretic point of view, we can formulate all sorts of additional questions. For example, given all the possible structures we can define on the integers, what are the associated notions of divisibility? We can always refine the integers to themselves, and then answer arbitrary questions about the integers and accurately represent all operations on the integers, but what's interesting is that we can sometimes do better.
In the divisibility example, where we eventually refined divisibility to modular residue, we can look at the seven residue classes as “obstructions” to divisibility by seven. There’s the trivial obstruction—that is, the lack of an obstruction—along with six different ways a number can fail to be divisible. To see another example of this, where the obstructions are more physical, let's look at defining groups on graphs.
Groups on graphs
Given a connected, undirected graph \(T\) and a group \(G\), we say that an equation on \(T\) in \(G\) is a choice of group element \(g_{uv}\) for each edge \((u, v) \in T\), where \(g_{vu} = g_{uv}^{-1}\). A solution to the equation with \(v_0 \mapsto g_0\) is a choice of group element \(g_v\) for each vertex \(v \in T\), such that for all vertices \(u, v \in T\), \(g_u g_{uv} = g_v\), and such that \(g_{v_0} = g_0\). Note that if two nodes in \(T\) are connected by an undirected edge, then the requirement that \(g_{uv} = g_{vu}^{-1}\) is necessary to ensure the equation is solvable.
We are interested in the question of which equations are solvable. Luckily, in solving an equation there aren't too many choices to be made: label \(v_0\) with \(g_0\), and then just traverse the graph. Any time you follow an edge \((u, v)\) from a node labeled \(g_u\), label \(v\) with \(g_u g_{uv}\). If we never attempt to label a vertex \(v\) with two different labels, this produces a solution; and conversely, if the equation has a solution, this method will find it, because this method makes no choices and thus produces a unique solution. On the other hand, an equation might not be solvable. For example, the cycle on vertices \(a\), \(b\), and \(c\) can be annotated by integers under addition, by making \(g_{ab} = g_{bc} = g_{ca} = 1\). This equation is not solvable, no matter the choices of \(v_0\) and \(g_0\), because we would have \(g_c = g_b + 1 = g_a + 2 = g_c + 3\), which is impossible.
In general, if there is a cycle in the graph, then we can pick an orientation for it and label the edges along the cycle (oriented as chosen) with 1. If the cycle goes through vertices \(v_0, \dots, v_{n-1}, v_0\), then for the equation to be solvable we would need \(v_0 = v_{n-1} + 1 = v_{n-2} + 2 = \dots = v_0 + n\), which is impossible. In fact, for an equation to be solvable, we need the product of the group elements along any cycle to be the identity—that is, any cycle corresponds to a relation on the elements of the group, which prevents certain equations from being solvable.
This insight allows us to understand how certain relations between equations relate their solvability. For example, we can imagine changing the labeling of a single edge. Does that affect solvability? Only if that edge is along a cycle, in which case that cycle corresponds to a relation which would be violated by changing the label on the edge.
We can formalize the cycles on a graph using the fundamental group. Given any cycle in the graph \(v_0, \dots, v_{n-1}, v_0\), we have a relation \(g_{01} g_{12} \cdots g_{(n-1)0} = e\) in the group. Given two relations \(r_1 = e\) and \(r_2 = e\), the relation \(r_1 r_2 = e\) is also valid; this corresponds to the product \(c_1 c_2 \in \pi_0\) of two cycles \(c_1, c_2 \in \pi_0\). So the fundamental group of the graph describes the set of relations a group must satisfy for an equation to be solvable; and if the fundamental group is trivial, then any equation is solvable. We have refined the existence of unsolvable equations on a graph to its fundamental group.
And this refinement also extends to additional operations on graphs, like the product or wedge product of two graphs. The fact that this refinement exists gives as some justification in saying that the fundamental group of a graph describes obstructions to the solvability of equations on it.
Work in progress
If you have other examples of mathematical refinement, or a generalized or abstract way of looking at this phenomenon, please write. I'm interested in this process of refinement, including questions like determining when it is possible or figuring out ways to do it automatically; or, similar problems in computer science, physics, and other fields.
Footnotes:
I don't know enough category theory to know if this is actually sufficient. Help?