A look under GHC's hood: desugaring linear types

18 January 2024 — by Arnaud Spiwack

I recently merged linear let- and where-bindings in GHC. Which means that we’ll have these in GHC 9.10, which is cause for celebration for me. Though they are much overdue, so maybe I should instead apologise to you.

Anyway, I thought I’d take the opportunity to discuss some of GHC’s inner workings and how they explain some of the features of linear types in Haskell. We’ll be discussing a frequently asked question: why can’t Ur be a newtype? And some new questions such as: why must linear let-bindings have a !? But first…

GHC’s frontend part 1: typechecking

Like a nature documentary, let’s follow the adventure of a single declaration as it goes through GHC’s frontend. Say

f True x y = [ toUpper c | c <- show x ++ show y ]
f False x y = show x ++ y

There’s a lot left implicit here: What are the types of x, y, and c? Where do the instances of Show required by show come from? (Magic from outer space?)

Figuring all this out is the job of the typechecker. The typechecker is carefully designed; it’s actually very easy to come up with a sound type system which is undecidable. But even then, decidability is not enough: we don’t want inversion of the True and the False equations of f to change the result of the typechecker. Nor should pulling part of a term in a let let s = show x ++ show y in [ toUpper c | c <- s ]. We want the algorithm to be stable and predictable.

GHC’s typechecker largely follows the algorithm described in the paper OutsideIn(x). It would walk the declaration as:

x is used as an argument of show, so it has some type tx and we need Show tx.
y is used as an argument of show, so it has some type ty and we need Show ty.
c is bound to the return of show, so it has some type tc such that [tc] is [Char] (from which we know that tc is Char).
c is used as an argument of toUpper so tc is Char (which we already knew from the previous point, so there’s no type error).
From all this we deduce that f :: Bool -> tx -> ty -> [Char].
Then the typechecker proceeds through the second equation and discovers that ty is [Char].
So the type of f is Bool -> tx -> [Char] -> [Char].
Only now, we look at tx and notice that we don’t know anything about it, except that it must have Show tx. So tx is generalised and we get the final type for f: f :: forall a. Show a => Bool -> a -> [Char] -> [Char].

Where we’re allowed to generalise is very important: if the typechecking algorithm could generalise everywhere it’d be undecidable. So the typechecker only generalises at bindings (it even only generalises at top-level bindings with -XMonoLocalBinds), and where it knows that it needs a forall (usually: because of a type signature).

Of course, there’s no way I could give a faithful account of GHC’s typechecker in a handful of paragraphs. I’m sweeping a ton under the carpet (such as bidirectional typechecking or the separation of constraint generation and constraint solving).

GHC’s frontend part 2: desugaring

Now that we know all of this type information, our declaration is going to get desugared. In GHC, this means that the expression is translated to a language called Core. Our expression looks like this after desugaring:

f = \(@a :: Type) -> \(d :: Show a) -> \(b :: Bool) -> \(x :: a) -> \(y :: [Char]) ->
  case b of
  { True -> map @Char (\(c::Char) -> toUpper c) ((++) @Char (show d x) (show (showList showChar) y))
  , False -> (++) @Char (show d x) y
  }

There are a few things to notice:

This language is typed.
Nothing is left implicit: all the types are tediously specified on each variable binding, on each forall, and on each application of a polymorphic function.
Core is a language with far fewer features than Haskell. Gone are the type classes (replaced with actual function arguments), gone are list comprehensions, gone are equations, in fact gone are deep pattern matching, “do” notation, and so many other things.

For the optimiser (Core is the language where most of the optimisations take place), this means dealing with fewer constructs. So there’s less interaction to consider. Types are used to drive some optimisations such as specialisation. But by the time we reach the optimiser, we’ve already left the frontend, and it’s a story for another blog post maybe.

For language authors, this means something very interesting: when defining a new language construction, we can describe its semantics completely by specifying its desugaring. This is a powerful technique, which is used, for instance, in many GHC proposals.

Ur can’t be a newtype

Let me use this idea that a feature’s semantics is given by its desugaring to answer a common question: why must Ur be boxed, couldn’t it be a newtype instead?

To this end, consider the following function, which illustrates the meaning of Ur

f :: forall a. Ur a %1 -> Bool
f (Ur _) = True

That is, Ur a is a wrapper over a that lets me use the inner a as many times as I wish (here, none at all).

The desugaring is very straightforward:

f = \(@a :: Type) -> \(x :: Ur a) ->
  case x of
    Ur y -> True

Because Ur a is just a wrapper, it looks as though it could be defined as a newtype. But newtypes are actually desugared quite differently and this will turn out to be a problem. That is, if Ur a was a newtype then it would have the same underlying representation as a, the desugaring reflects that we can always use one in place of the other. Specifically, desugaring a newtype is done via a coercion of the form ur_co :: Ur a ~R a, which lets Core convert back and forth between types Ur a and a (the R stands for representational). With a newtype the desugaring of f would be:

f_newtype = \(@a :: Type) -> \(x :: Ur a) ->
  let
    y :: a
    y = x `cast` ur_co
  in
  True

Because y isn’t used at all, this is optimised to:

f_newtype = \(@a :: Type) -> \(x :: Ur a) -> True

See the problem yet?

Consider f_newtype destroyTheOneRing. This evaluates to True; the one ring is never destroyed! Contrast with f destroyTheOneRing which reduces to:

case destroyTheOneRing of
  Ur y -> True

This forces me to destroy the one ring before returning True (in our story, the ring is destroyed when the Ur constructor is forced).

This is why Ur can’t be a newtype and why GHC’s typechecker rejects attempts to define an Ur newtype: such a newtype would semantically violate the linearity discipline.

Exercise: this objection wouldn’t apply to a strict language. Do you see why?

No linear lazy patterns

In Haskell, a pattern in a let-binding is lazy. That is:

let (x, y) = bakeTurkey in body

Desugars to:

let
  xy = bakeTurkey
  x = fst xy
  y = snd xy
in
body

Is this linear in bakeTurkey? You can make this argument: although it’s forced several times (once when forcing x and once when forcing y), laziness will ensure that you don’t actually bake your turkey twice. There are some systems, such as Linearity and laziness [pdf], which treat lazy patterns as linear.

But it’s awfully tricky to treat the desugared code as linear. A typechecker that does so would have to understand that x and y are disjoint (so as to make sure that we don’t bake parts of the turkey twice) and that they actually cover all of xy (so that no part of the turkey is left unbaked). And because Core is typed, we do have to make sure that the desugaring is linear.

At this point, this is a complication that we don’t want to tackle. If we wanted to, it would require some additional research; it’s not just a matter of engineering time. As a consequence, patterns in linear let-bindings must be annotated with a !, making them strict:

let !(x, y) = bakeTurkey in body

which desugars to:¹

case bakeTurkey of
  (x, y) -> body

Which is understood as linear in Core.

This is really only for actual patterns. Variables let x = rhs in body, which desugar to themselves, are fine.

No linear polymorphic patterns

Finally, let’s look at a trickier interaction. Consider the following:

let (x, y) = ((\z -> z), (\z -> z)) in
case b of
  True -> (x 'a', y False)
  False -> (x True, y 'b')

If MonoLocalBind is turned off, then x and y are inferred to have the type forall a. a -> a. What’s really interesting is the desugaring:

let
  xy = \@a -> ((\(z::a) -> z), (\(z::a) -> z))
  x = \@a -> fst (xy @a)
  y = \@a -> snd (xy @a)
in
case b of
  True -> (x @Char 'a', y @Bool False)
  False -> (x @Bool True, y @Char 'b')

An alternative desugaring might have been:

let
  xy = ((\@a (z::a) -> z), (\@a (z::a) -> z))
  x = fst xy
  y = snd xy
in
case b of
  True -> (x @Char 'a', y @Bool False)
  False -> (x @Bool True, y @Char 'b')

Which does indeed give the same type to x and y. But this alternative desugaring can’t be produced because of how the typechecker works: remember that the typechecker only considers generalisation (which corresponds to adding \@a -> in the desugaring) at the root of a let-binding, which isn’t the case of the alternative desugaring.

So, what does this all have to do with us anyway? Well, here’s the thing. If the alternative desugaring can be transformed into a case (in the strict case):

case ((\@a (z::a) -> z), (\@a (z::a) -> z)) of
  (x, y)  ->
    case b of
      True -> (x @Char 'a', y @Bool False)
      False -> (x @Bool True, y @Char 'b')

Then the actual desugaring cannot:

-- There's no type to use for `??`
case (\@a -> ((\(z::a) -> z), (\(z::a) -> z))) @?? of
  (x, y) ->
    -- Neither `x` nor `y` is polymorphic so the result doesn't
    -- type either.
    case b of
      True -> (x @Char 'a', y @Bool False)
      False -> (x @Bool True, y @Char 'b')

So we’re back to the situation of the lazy pattern matching: we might be able to make the desugaring linear without endangering soundness, but it’s too much effort.

So pattern let-bindings can’t be both linear and polymorphic (here again, variables are fine). This means in practice that:

In let %1 (x, y) = the type of x and y is never generalised.
Starting with GHC 9.10, LinearTypes will imply MonoLocalBinds.
If you turn MonoLocalBinds off, however, and the typechecker infers a polymorphic type for let (x, y) = rhs then rhs will count as being used Many times.

What’s really interesting, here, is that this limitation follows mainly from the typechecking algorithm, which can’t infer every permissible type otherwise it wouldn’t always terminate.

Wrapping up

This is the end of our little journey in the interactions of typechecking, desugaring, and linear types. Maybe you’ve learned a thing or two about what makes GHC tick. Hopefully you’ve learned something about why the design of linear types is how it is.

What I’d like to stress before I leave you is that the design of GHC is profound. We’re not speaking about accidents of history that makes some features harder to implement than they need to. It’s quite possible that a future exists where we can lift some of the limitations that I discussed (not Ur being a newtype, though, this one is simply unsound). But the reason why they are hard to lift is that they require actual research to understand what the relaxed condition implies. They are hard to lift because they are fundamentally hard problems! GHC forces you to think about it. It forces you to deeply understand the features you implement, and tell you to go do your research if you don’t.

This is not actually how strict lets used to desugar. Instead they desugared to:
```
let
  xy = bakeTurkey
  x = fst xy
  y = snd xy
in
xy `seq` body
```
Which will eventually optimise to the appropriate case expression. This desugaring generalises to (mutually) recursive strict bindings, but these aren’t relevant for us as recursive definition lets aren’t linear.

As part of the linear let patch, I did change the desugaring to target a case directly in the non-recursive case, in order to support linear lets.↩

About the authors

Arnaud SpiwackArnaud is Tweag's head of R&D. He described himself as a multi-classed Software Engineer/Constructive Mathematician. He can regularly be seen in the Paris office, but he doesn't live in Paris as he much prefers the calm and fresh air of his suburban town.

If you enjoyed this article, you might be interested in joining the Tweag team.

This article is licensed under a Creative Commons Attribution 4.0 International license.

← BazelCon Community Day - Munich Evaluating Retrieval in RAGs: A Gentle Introduction →