Pavel Panchekha


Share under CC-BY-SA.

Pairs in Hindley-Milner Languages

You don’t need sums or products as built-ins in the untyped λ-calculus. But when you try to encode them in a typed calculus, you have to take care.

Producing products (in the untyped λ-calculus)

When I first studied the untyped λ-calculus, I learned that first-class functions let you encode pairs without any extra effort. The encoding works by storing the two halves of the pair in a closure, and extracting them by passing both to a function.

pair = (λ x y ↦ (λ f ↦ f x y))

To show that this is a pair, we need to be able to get out either the first or the second element; we can do this because we can pass an f that ignores its first or its second argument.

fst = (λ p ↦ p (λ x y ↦ x))
snd = (λ p ↦ p (λ x y ↦ y))

We can check that these functions work:

fst (pair 1 2) = (λ p ↦ p (λ x y ↦ x)) ((λ x y ↦ (λ f ↦ f x y)) 1 2)
               = (λ p ↦ p (λ x y ↦ x)) (λ f ↦ f 1 2)
               = (λ f ↦ f 1 2) (λ x y ↦ x) = (λ x y ↦ x) 1 2
               = 1
snd (pair 1 2) = (λ p ↦ p (λ x y ↦ y)) ((λ x y ↦ (λ f ↦ f x y)) 1 2)
               = (λ p ↦ p (λ x y ↦ y)) (λ f ↦ f 1 2)
               = (λ f ↦ f 1 2) (λ x y ↦ y) = (λ x y ↦ y) 1 2
               = 2
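These reductions can be checked concretely by transcribing the definitions into Haskell (which will serve as our typed λ-calculus below). The primed names are mine, chosen to avoid clashing with the Prelude’s fst and snd:

```haskell
-- Church-style pairs, transcribed directly from the λ-calculus.
-- Primed names avoid clashing with the Prelude's fst and snd.
pair' x y = \f -> f x y
fst' p = p (\x _ -> x)
snd' p = p (\_ y -> y)

main :: IO ()
main = do
  print (fst' (pair' (1 :: Int) 2))  -- prints 1
  print (snd' (pair' (1 :: Int) 2))  -- prints 2
```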

We can also write an interesting identity function:

repair = (λ p ↦ pair (fst p) (snd p))

All of these functions work perfectly in the untyped calculus. In an implementation, pairs might be nice to have—they might print nicely or save memory—but they wouldn’t be necessary.

Typical typings (in Hindley-Milner systems)

If we suddenly want to introduce types into our world, a seemingly-sensible approach is to use the same exact definition of pair, fst, and snd.

pair = (λ x y ↦ (λ f ↦ f x y))
fst = (λ p ↦ p (λ x y ↦ x))
snd = (λ p ↦ p (λ x y ↦ y))

I’m using Haskell as my typed λ-calculus, so my type system is Hindley–Milner (HM) by default. I can type in these definitions, and they all work. If we ask for their types, we get a reasonable type for pair...

pair : τ → σ → (τ → σ → α) → α

… but strange types for fst and snd:

fst : ((τ → σ → τ) → α) → α
snd : ((τ → σ → σ) → α) → α

A pair of τ and σ is represented by a function of type (τ → σ → α) → α; we can read that off the type of pair. But then we would expect fst to have type ((τ → σ → α) → α) → τ, since it takes a pair of τ and σ, and returns the first element, which must be a τ.

Why does this happen? Well, it’s easy to see how Haskell came up with our types for fst and snd. After all, fst is just the function λ p ↦ p (λ x y ↦ x). So it takes an input p and calls it on λ x y ↦ x. This argument has type τ → σ → τ, and the result of calling p might be any type. So p must have type (τ → σ → τ) → α, and then the result of fst must be α.

Wait, so maybe we can just tell Haskell the correct type? No. If we explicitly give fst the signature ((τ → σ → α) → α) → τ, we get an error message, reading “Couldn't match expected type ‘t’ with actual type ‘a’.”

Unfortunate un’fications (during HM type inference)

The problem is hidden by the implicit quantifiers of HM type systems. When Haskell talks about the type ((τ → σ → α) → α) → τ, it puts the quantifiers for τ, σ, and α on the outside; it really means the type

∀ τ σ α, ((τ → σ → α) → α) → τ

If fst had this type, then the caller of fst would choose appropriate values for τ, σ, and α. But then what if the caller chooses the wrong value of α? For example, suppose the caller chose α = 0, the empty type? Then our fst function would be a double-negation eliminator!

The real issue here is that our fst function needs to choose a particular value for α. Our fst needs α = τ; our snd needs α = σ. So we can’t have the choice of α happen outside the call to fst or snd.

We can “internalize” the choice of α by moving our quantifiers. For fst this gives the type:

∀ τ σ, (∀ α, (τ → σ → α) → α) → τ

We call types “higher-rank types” when they have ∀s anywhere except at the very outside.
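In Haskell, this internalized quantifier can be written down with the RankNTypes extension. Here is a sketch; the names Pair, pair', fst', and snd' are mine:

```haskell
{-# LANGUAGE RankNTypes #-}

-- The quantifier over a lives *inside* the pair type, so each
-- projection is free to instantiate a however it needs.
type Pair t s = forall a. (t -> s -> a) -> a

pair' :: t -> s -> Pair t s
pair' x y = \f -> f x y

fst' :: Pair t s -> t
fst' p = p (\x _ -> x)   -- instantiates a := t

snd' :: Pair t s -> s
snd' p = p (\_ y -> y)   -- instantiates a := s
```

With these signatures, fst' (pair' 1 'a') and snd' (pair' 1 'a') both typecheck, even though the two halves of the pair have different types.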

We might see a similar problem with the type of pair. Haskell prints the type of pair as

τ → σ → (τ → σ → α) → α

which we might elaborate to

∀ τ σ α, τ → σ → (τ → σ → α) → α

With this definition, α would have to be chosen as soon as the pair is created. Unless τ and σ were equal, we wouldn’t be able to get both the first and second element of the pair, since we’d have to set α to both τ and σ.

This is in fact what happens with repair, which we defined as

repair = (λ p ↦ pair (fst p) (snd p))

If we use the old definitions of pair, fst, and snd, its type is inferred as

((τ → τ → τ) → σ) → (σ → σ → α) → α

This type is unfortunate because it requires both halves of the pair to have the same type. But if you stare at it long enough, this type judgement makes sense. Since we call both fst and snd on our input pair, it must have type (τ → σ → τ) → α and (τ → σ → σ) → α. So τ and σ must be the same, and then of course both the fst and snd calls return values of the same type.
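To see the restriction bite, here is a sketch in plain Haskell, with no annotations and primed names to avoid the Prelude: repairing a pair of two Ints works, but a mixed pair is rejected.

```haskell
pair' x y = \f -> f x y
fst' p = p (\x _ -> x)
snd' p = p (\_ y -> y)

-- HM infers the restrictive type above: both halves must match.
repair' p = pair' (fst' p) (snd' p)

sameType :: (Int, Int)
sameType = let q = repair' (pair' (1 :: Int) 2)
           in (fst' q, snd' q)

-- repair' (pair' (1 :: Int) 'a') is a type error:
-- Int and Char cannot be unified.
```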

The main problem for repair is that Haskell not only doesn’t infer outputs with quantifiers in strange places, it also doesn’t assume that inputs might have them. So if we give pair, fst, and snd the correct types, the definition of repair still produces an error, because HM type inference cannot fathom that the input pair might have a universally-quantified type. It all works if we manually give the correct type for repair:

∀ τ σ, (∀ α, (τ → σ → α) → α) → (∀ β, (τ → σ → β) → β)
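In Haskell, writing this annotation again needs RankNTypes. A sketch, with primed names for the untyped-style definitions:

```haskell
{-# LANGUAGE RankNTypes #-}

pair' x y = \f -> f x y
fst' p = p (\x _ -> x)
snd' p = p (\_ y -> y)

-- The signature does all the work: it tells the checker that the
-- argument is polymorphic in a, so fst' and snd' may each pick
-- their own instantiation.
repair :: (forall a. (t -> s -> a) -> a) -> (forall b. (t -> s -> b) -> b)
repair p = pair' (fst' p) (snd' p)
```

Delete the signature, and HM inference brings back the same-type restriction on the two halves.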

So this is why typed λ-calculi usually have pairs as a separate type. The pair encoding implicitly involves a higher-rank type, but if pairs are built explicitly into the language, we can infer types for functions that use pairs without having to handle higher-rank types at all. Inference for general higher-rank types is undecidable, so building in common use cases makes the type system much simpler to use.

Politely polymorphising (in let expressions)

“But wait,” you say, “I used my incorrectly-typed pair function on two values of different types and got the correct, polymorphic result!”

> let p = pair 1 'a'
p : (ℤ → Char → α) → α

This is due to a strange feature of HM. You see, it’s incorrect to say that HM requires all quantifiers to be on the outside. If that were the case, this let statement would have to pick a specific α for p; instead, we seem to have retained the correct level of polymorphism.

In a let expression, HM will allow the output type to retain some polymorphism. So let v = f in b is not equivalent to (λ v ↦ b) f; the first allows v to be polymorphic and the second does not. In the second case, HM has to separately give a type to λ v ↦ b, and this function would have to have a higher rank type if we wanted v to be polymorphic. But if we use let, HM “realizes” that we never cared explicitly about the function, and doesn’t bother computing its type.
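A sketch of this difference, using the HM-typed definitions from before (primed names mine): the let-bound pair stays polymorphic, so both projections work.

```haskell
pair' x y = \f -> f x y
fst' p = p (\x _ -> x)
snd' p = p (\_ y -> y)

-- let generalizes p's type to forall a. (Int -> Char -> a) -> a,
-- so fst' and snd' can each instantiate a differently.
both :: (Int, Char)
both = let p = pair' (1 :: Int) 'a'
       in (fst' p, snd' p)

-- The lambda version needs a higher-rank type for the function,
-- and is rejected without an explicit signature:
--   (\p -> (fst' p, snd' p)) (pair' (1 :: Int) 'a')
```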

So while you can write

let p = (pair 1 'a') in (pair (fst p) (snd p))

you can’t write

repair p = pair (fst p) (snd p)

What a strange restriction! We can write the statement pair (fst p) (snd p) for a concrete p, but not for a general one. This is one reason why HM is really a rather weird type system. Sure, you can infer it efficiently and a lot of normal code seems to fit within its boundaries. But when you sit down to figure out what its actual limitations are, you find that HM brings with it all sorts of strange restrictions.

Mathematically, you don’t want HM; you want System F, which allows ∀s anywhere in a type. That, or dependent types, which allow so much more. But both those type systems require a lot more hand-holding, because general inference isn’t possible for either of them. Maybe we should all go back to simply-typed λ-calculus?