# Pairs in Hindley-Milner Languages

You don’t need sum or products as built-ins in the untyped lambda calculus. But when you try to encode them in a typed calculus, you have to take care.

## Producing products (in the untyped λ-calculus)

When I first studied the untyped λ-calculus, I learned that first-class functions let you encode pairs without any extra effort. It works by storing the two halves of the pair in the context, and to extract them by passing both to a function.

pair = (λ x y ↦ (λ f ↦ f x y))

To show that this is a pair, we need to be able to get out either the first or the second element; we can do this because we can pass an `f`

that ignores its first or second arguments.

fst = (λ p ↦ p (λ x y ↦ x)) snd = (λ p ↦ p (λ x y ↦ y))

We can check that these functions work:

fst (pair 1 2) = (λ p ↦ p (λ x y ↦ x)) ((λ x y ↦ (λ f ↦ f x y)) 1 2) = (λ p ↦ p (λ x y ↦ x)) (λ f ↦ f 1 2) = (λ f ↦ f 1 2) (λ x y ↦ x) = (λ x y ↦ x) 1 2 = 1

snd (pair 1 2) = (λ p ↦ p (λ x y ↦ y)) ((λ x y ↦ (λ f ↦ f x y)) 1 2) = (λ p ↦ p (λ x y ↦ y)) (λ f ↦ f 1 2) = (λ f ↦ f 1 2) (λ x y ↦ y) = (λ x y ↦ y) 1 2 = 2

We can also write an interesting identity function:

repair = (λ p ↦ pair (fst p) (snd p))

All of these functions work perfectly in the untyped calculus. In an implementation, pairs might be nice to have—they might print nicely or save memory—but they wouldn’t be necessary.

## Typical typings (in Hindley-Milner systems)

If we suddenly want to introduce types into our world, a seemingly-sensible approach is to use the same exact definition of `pair`

, `fst`

, and `snd`

.

pair = (λ x y ↦ (λ f ↦ f x y)) fst = (λ p ↦ p (λ x y ↦ x)) snd = (λ p ↦ p (λ x y ↦ y))

I’m using Haskell as my typed λ-calculus, so my type system is HM by default. I can type in these definitions, and they all work. If we ask for their types, we get a reasonable type for `pair...`

pair : τ → σ → (τ → σ → α) → α

… but strange types for `fst`

and `snd`

:

fst : ((τ → σ → τ) → α) → α snd : ((τ → σ → σ) → α) → α

A pair of `τ`

and `σ`

represented by a function of type `(τ → σ → α) → α`

, we can read that off the type of `pair`

. But then we expect `fst`

to have type `((τ → σ → α) → α) → τ`

, since it takes a pair of `τ`

and `σ`

, and returns the first element, which must be a `τ`

.

Why does this happen? Well, it’s easy to see how Haskell came up with our types for `fst`

and `snd`

. After all, `fst`

is just the function `λ p ↦ p (λ x y ↦ x)`

. So it takes an input `p`

and calls it on `λ x y ↦ x`

. This argument has type `τ → σ → τ`

, and the result of calling `p`

might be any type. So `p`

must have type `(τ → σ → τ) → α`

, and then the result of `fst`

must be `α`

.

Wait, so maybe we can just *tell* Haskell the correct type? No. If we explicitly give `fst`

the signature `((τ → σ → α) → α) → τ`

, we get an error message, reading “Couldn't match expected type ‘t’ with actual type ‘a’.”

## Unfortunate un’fications (during HM type inference)

The problem is hidden by the implicit quantifiers of HM type systems. When Haskell talks about the type `((τ → σ → α) → α) → τ`

, it puts the quantifiers for `τ`

, `σ`

, and `α`

on the outside; it really means the type

∀ τ σ α, ((τ → σ → α) → α) → τ

If `fst`

had this type, then the *caller* of `fst`

would choose appropriate values for `τ`

, `σ`

, and `α`

. But then what if the caller chooses the wrong value of `α`

? For example, suppose the caller chose `α = 0`

? Then our `fst`

function would be a double-negation-eliminator!

The real issue here is that our `fst`

function needs to choose a particular value for `α`

. Our `fst`

needs `α = τ`

; our `snd`

needs `α = σ`

. So we can’t have the choice of `α`

happen *outside* the call to `fst`

or `snd`

.

We can “internalize” the choice of `α`

by moving our quantifiers. For `fst`

this gives the type:

∀ τ σ, (∀ α, (τ → σ → α) → α) → τ

We call types “higher rank types” when they have ∀s anywhere except the very outside.

We might see a similar problem with the type of `pair`

. Haskell prints the type of `pair`

as

τ → σ → (τ → σ → α) → α

which we might elaborate to

∀ τ σ α, τ → σ → (τ → σ → α) → α

With this definition, `α`

would have to be chosen as soon as the pair is created. Unless `τ`

and `σ`

were equal, we wouldn’t be able to get both the first and second element of the pair, since we’d have to set `α`

to both `τ`

and `σ`

.

This is in fact what happens with `repair`

, which we defined as

repair = (λ p ↦ pair (fst p) (snd p))

If we use the old definitions of `pair`

, `fst`

, and `snd`

, its type is inferred as

((τ → τ → τ) → σ) → (σ → σ → α) → α

This type is unfortunate because it requires both halves of the pair to have the same type. But if you stare at it long enough, this type judgement makes sense. Since we call both `fst`

and `snd`

on our input pair, it must have type `(τ → σ → τ) → α`

and `(τ → σ → σ) → α`

. So `τ`

and `σ`

must be the same, and then of course both the `fst`

and `snd`

calls return values of the same type.

The main problem for `repair`

is that Haskell not only doesn’t infer that outputs can have quantifiers in strange places, but it also doesn’t assume that inputs might. So if we give `pair`

, `fst`

, and `snd`

the correct types, the definition of `repair`

returns an error because HM type inference cannot fathom that the input pair might have a universally-quantified type. It’ll all work If we manually give the correct type for `repair`

:

∀ τ σ, (∀ α, (τ → σ → α) → α) → (∀ β, (τ → σ → β) → β)

So this is why typed λ-calculi usually have pairs as a separate type. The pair implicitly is a higher-rank type, but if pairs are built explicitly into the language, we can infer types for functions that use pairs without having to explicitly infer higher-rank types. Inference for general higher-rank types is undecidable, so building in common use cases makes the type system much simpler to use.

## Politely polymorphising (in let expressions)

“But wait,” you say, “I used my incorrectly-typed `pair`

function on two values of different types and got the correct, polymorphic result!”

> let p = pair 1 ‘a’ p : (ℤ → Char → α) → α

This is due to a strange feature of HM. You see, it’s incorrect to say that HM requires all quantifiers to be on the outside. If that was the case, this `let`

statement would have to pick a specific `α`

for `p`

; instead, we seem to have retained the correct level of polymorphism.

In a `let`

expression, HM will allow the output type to retain some polymorphism. So `let v = f in b`

is not equivalent to `(λ v ↦ b) f`

; the first allows `v`

to be polymorphic and the second does not. In the second case, HM has to separately give a type to `λ v ↦ b`

, and this function would have to have a higher rank type if we wanted `v`

to be polymorphic. But if we use `let`

, HM “realizes” that we never cared explicitly about the function, and doesn’t bother computing its type.

So while you can write

let p = (pair 1 ‘a’) in (pair (fst p) (snd p))

you can’t write

repair p = pair (fst p) (snd p)

What a strange restriction! We can write the statement `pair (fst p) (snd p)`

for a concrete `p`

, but not for a general one. This is one reason why HM is really a rather weird type system. Sure, you can infer it efficiently and a lot of normal code seems to fit within its boundaries. But when you sit down to figure out what its actual limitations are, you find that HM brings with it all sorts of strange restrictions.

Mathematically, you don’t want HM; you want System F, which allows ∀s anywhere in a type. That, or dependent types, which allow so much more. But both those type systems require a lot more hand-holding, because general inference isn’t possible for either of them. Maybe we should all go back to simply-typed λ-calculus?