Type Inference

Albert Y. C. Lai, trebla [at] vex [dot] net

We learn how the computer solves type inference to gain understanding (so it is not “magic”) and help explain type errors and inferred types.

Preparation

A type inference algorithm uses tools that are not obvious at first sight. To help motivate them, I start with showing how an experienced or clever person does type inference heuristically, which calls for the same tools.

Creating unknowns

Infer the type of \x -> x

I don’t know what should be the type of x, but I can call it u, consider it unknown for now. Then I can deduce:

(\x -> x) :: u -> u

I will use names starting with u for unknowns, e.g., u1, u2. Note: They are not necessarily type variables for polymorphism—to be determined later. (In contrast, I will write polymorphic types like “∀a. a → a”.)

Type environment

Infer the type of \x y -> if False then (x, 5) else (True, y)

Like before, choose unknown u1 for x, u2 for y. Then I would really like to zoom into the function body

if False then (x, 5) else (True, y)

but with the knowledge x::u1, y::u2 at the back of my mind.

This suggests that a person or an algorithm keeps track of a “type environment” consisting of the variables in scope and their types for easy lookup.

It also helps with local variables from “let”. Infer the type of

let b=False in \x y -> if b then (x, 5) else (True, y)

After finishing b=False deducing b::Bool, I zoom into the body

\x y -> if b then (x, 5) else (True, y)

with b::Bool in the type environment.

Solving for unknowns

Assume 5::Integer, True::Bool

Infer the type of \x y -> if False then (x, 5) else (True, y)

After choosing unknowns and zooming in etc., eventually it comes down to:

(x,5) :: (u1,Integer)

(True,y) :: (Bool,u2)

if-then-else wants them to be the same type, so:

(u1,Integer) = (Bool,u2)

Solving that equation gives u1 = Bool, u2 = Integer.

Answer: Bool -> Integer -> (Bool, Integer)

Type mismatch error

\x -> if False then (x, 5) else (True, x)

Similar thought process but the equation is

(u1,Integer) = (Bool,u1)

This means Integer = Bool. This is a type error, the “type mismatch” kind.

“Infinite type” error

The other kind of type errors goes like this:

\x -> if False then x else [x, x]

I use unknown u1 for x. The equation to solve goes like

u1 = [u1]

Now, theoretically there is nothing wrong with that; the feature name is “infinite types”, and a few (very few) academic toy languages have it.

But people find that it is rarely useful, it is hard to follow even when it is useful, and if a piece of practical code seems to do it, more than 99.9% of the time it’s a typo rather than an intention, e.g., the author meant “then [x]”. So all practical languages and almost all academic languages ban it. (Statically typed languages, that is.)

In general, if an equation to solve goes like

u1 = … something that mentions u1 …

and is not simply “u1 = u1”, we say that it is a type error, and give the reason “infinite type”. You sometimes also see the wording “occurs check”; it means that the algorithm checks whether u1 occurs on the right side, and finds it.

In the rare case that you find an infinite type useful, the recommendation is to define an ordinary recursive wrapper type:

newtype T = MkT [T]
-- so MkT :: [T] -> T

unT :: T -> [T]
unT (MkT list) = list

You now have to keep wrapping and unwrapping, but the logic is clearer because the code is explicit about when to think of one single T, when to think of a list of T’s.

The tools needed

I need to create new unknowns all the time.

I will run into equations like (u1,Integer) = (Bool,u2), and I need to solve them and find out u1 = Bool, u2 = Integer. This solving is called “unification”; I was unifying (u1,Integer) with (Bool,u2).

Because of that, I need a mutable table of solved unknowns. Every time I solve an equation, it causes an update. Since each entry is telling me to replace an unknown by its solution, this table is a table of substitutions.

I need the type inference algorithm to take one more parameter: a type environment to tell me which variables are in scope and what their types are. It is also a good place to store the types of library functions.

Unification

Unification solves two problems in one go: Checking whether two types match up, and if they contain unsolved unknowns, solve them, i.e., find out whether and what can be substituted for the unsolved unknowns to make the two types match up.

It doesn’t return any value. It gives its answer by how it terminates: If the two types match up (perhaps after adding substitutions to the table), return normally; if not, throw an error.

Substitutions

A mutable table of substitutions is maintained, initially empty. Every call to the unification algorithm can update the table. Notation for substitutions: “u := type”.

For clarity, I will make sure that solved unknowns don’t appear in the RHS, e.g., I won’t let this table happen:

u1 := [u2] (I won’t write this, I’ll write u1 := [Integer])
u2 := Integer

Remark: Real implementations allow this for efficiency and play clever tricks elsewhere to compensate. But I am going for least confusion for beginners.

Notation for applying all substitutions in the current table to rewrite a type: “apply-subst type”. This returns the rewritten type. I omit coding it up. Example: If the current table is

u3 := Integer
u2 := u1 → u1

then apply-subst (u2 → u4 → Bool) returns ((u1→u1) → u4 → Bool).

Unification algorithm

The type inference algorithm is not always aware that an unknown is already solved and has an entry in the substitution table. It can call the unification algorithm with solved unknowns. For clarity, I first use apply-subst to get rid of them:

unify T1 T2:
unify-intern (apply-subst T1) (apply-subst T2)

Then the algorithm for unify-intern can assume that unknowns in inputs are unsolved.

Remark: Real implementations don’t do this, they play clever tricks elsewhere to compensate.

unify-intern is just structural recursion on the syntax of type expressions:

unify-intern u T
or
unify-intern T u
where u is an unknown:
    if T=u:
        return
    else if u occurs in T:
        infinite type error
    else:
        (update table)
        add u:=T to the table
        in the rest of the table, replace u by T

(If it is unify-intern u2 u4, should you add u2:=u4 or u4:=u2?
Answer: Doesn’t matter, up to you.)

unify-intern TC1 TC2
where TC1 and TC2 are type constants, e.g., Integer, Bool, Char:
    if TC1 ≠ TC2:
        type mismatch error
    else
        return

unify-intern (Maybe T1) (Maybe T2)
or
unify-intern [T1] [T2]
or similar cases:
    unify-intern T1 T2

unify-intern (T1 → S1) (T2 → S2)
or
unify-intern (T1, S1) (T2, S2)
or similar cases:
    unify-intern T1 T2
    (Unknowns in S1 and S2 may now become solved, so I need apply-subst:)
    unify-intern (apply-subst S1) (apply-subst S2)

unify-intern (T1, S1, R1) (T2, S2, R2):
    similar, you get the idea

All other cases:
(e.g., unify-intern Integer Bool, unify-intern [u1] (Maybe u2))
    type mismatch error

Example

In this example, I start with a non-empty substitution table: It already has an entry “u1 := u2 → u3” (imagine that it was caused by a previous call to unify). We are now requested to unify u1 with u3→u4.

	Current table: u1 := u2 → u3
unify u1 (u3→u4) unify-intern (u2→u3) (u3→u4) unify-intern u2 u3 add u3:=u2 replace u3 in the rest of the table
	Updated table: u1 := u2 → u2 u3 := u2
unify-intern (apply-subst u3) (apply-subst u4) unify-intern u2 u4 add u4:=u2 replace u4 in the rest of the table
	Updated table: u1 := u2 → u2 u3 := u2 u4 := u2

Later on, in example steps of the type inference algorithm, when it comes to unification, I will just write the call and the updated table (usually easy to see), like this:

	Current table: u1 := u2 → u3
unify u1 (u3→u4) unify-intern (u2→u3) (u3→u4)
	Updated table: u1 := u2 → u2 u3 := u2 u4 := u2

Type inference algorithm

Notation:

“infer env term”: Procedure call to infer the type of term; look up variables and types in env, the type environment. The answer is meant to be monomorphic, i.e., if there are unsolved unknowns, let them be, don’t generalize to a polymorphic type. Example: If term is \x -> x, return u1→u1, leaving u1 as a new unsolved unknown.

Again, for least confusion for beginners, we also use apply-subst on all answers before returning them.

Notation for environments: For example, an environment that says “the type of x is u3, the type of y is Bool” is written “{x::u3, y::Bool}”. The empty environment is written “{}”. When I clone an environment and add something, I write like “env plus z::u2” to add z of type u2.

The type inference algorithm is a structural recursion over the syntax of term expressions. Here are the cases:

Literal

Literals have basically hardcoded types, so just return them. For simplicity, I ignore type classes, so for example I just pretend that 4 is always Integer.

infer env True
    return Bool

infer env 4
    return Integer
    (For simplicity, I don’t cover type classes, I just stick to Integer)

Similarly for most other literals.

Some literals are data constructors of parametrized algebraic data types, e.g., Nothing, the empty list []. I postpone them to the section on algebraic data types later.

Variable

The type of a variable is simply given in the environment, or else it’s a “variable not in scope” error—the 3rd and final source of errors in type inference. (If you care about certain library functions, be sure to include them in the initial environment!)

If the type is polymorphic, there is some follow-up work to do. But first why it is possible: 1. many library functions have polymorphic types; 2. the let construct (shown later) can put polymorphic types in the environment.

The follow-up work is to instantiate the type variables with new unknowns. (Let the context worry about solving them.)

infer env v
where v is a variable:
    T := look up the type of v in env
    if not found, “variable not in scope” error
    if T is polymorphic:
        create new unknowns to instantiate with
        T’ := instantiate T
        (Example below.)
        return (apply-subst T’)
    else
        return (apply-subst T)

Example of finding a polymorphic type and instantiating it:

infer {x::u2, f::∀a b. a→b→a} f:
    T := look up the type of f in env
    T = ∀a b. a→b→a
    create unknowns u4 for a, u5 for b
    T’ := instantiate ∀a b. a→b→a
    T’ = u4→u5→u4
    return u4→u5→u4

Lambda

For a lambda like \v → expr, we can create an unknown, say u, for the type of v, and wait for it to be solved later. Then we can zoom into expr, but keep in mind the additional v::u; in other words, a recursive call but give it a larger environment that has the additional v::u.

If the recursive call figures out that expr has type T, that means the whole lambda has the function type u→T.

infer env (\v → expr):
    create new unknown, say u (for the type of v)
    T := infer (env plus v::u) expr
    return (apply-subst (u→T))

Remark: Although u starts as a new unknown, after the recursive call for expr, u may have been solved, so u→T may need a rewrite, even though T doesn’t.

Example: \x → x

infer {} (\x → x):
    create new unknown u1 for x
    T := infer {x::u1} x
        return u1
    T = u1
    return (apply-subst (u1→T)) = u1→u1

For lambdas of two or more parameters, you could treat as nested lambdas, e.g., \x → (\y → expr). If you find that tedious, we can shortcut it:

infer env (\v w → expr):
    create new unknowns, say uv (for v), uw (for w)
    T := infer (env plus v::uv, w::uw) expr
    return (apply-subst (uv→uw→T))

Example:

infer {} (\x y → x):
    create new unknowns ux, uy
    T := infer {x::ux, y::uy} x
        return ux
    T = ux
    return (apply-subst (ux→uy→ux)) = ux→uy→ux

Function application

For a function application like f e, we can first use recursion to find the types of f and e respectively.

f is being used as a function, so we also need to check whether its type is a function type. Its argument is e, so we also need to check whether the domain type matches the type of e. These checks are what unification is for! We don’t have any requirement on the codomain type at this point, so just create a new unknown for it.

The type of the whole f e is the codomain type, i.e., whatever the new unknown turns out to be.

infer env (f e):
    Tf := infer env f
    Te := infer env e
    (We now check Tf = Te→???)
    create new unknown u
    unify Tf (Te → u)
    return (apply-subst u)

A small example: The library has id :: ∀a. a→a. Let’s use it like “id True”.

infer {id :: ∀a. a→a} (id True) Tf := infer {id :: ∀a. a→a} id T := look up id, it’s ∀a. a→a it’s polymorphic create new unknown u1 for a T’ := instantiate ∀a. a→a T’ = u1→u1 return (apply-subst (u1→u1)) = u1→u1 Tf = u1→u1 Te := infer {id :: ∀a. a→a} True return Bool Te = Bool create new unknown u2 unify Tf (Te → u2) unify-intern (u1 → u1) (Bool → u2)
	Updated table: u1 := Bool u2 := Bool
return (apply-subst u2) = Bool

A larger example: \f x → f (f x)

infer {} (\f x → f (f x)): create new unknowns u1 (for f), u2 (for x) T1 := infer {f::u1, x::u2} (f (f x)): Tf := infer {f::u1, x::u2} f return u1 Tf = u1 T2 := infer {f::u1, x::u2} (f x) Tf2 := infer {f::u1, x::u2} f return u1 Tf2 = u1 Tx := infer {f::u1, x::u2} x return u2 Tx = u2 create new unknown u3 unify Tf2 (Tx → u3) unify-intern u1 (u2 → u3)
	Updated table: u1 := u2 → u3
return (apply-subst u3) = u3 T2 = u3 create new unknown u4 unify Tf (T2 → u4) unify-intern (u2 → u3) (u3 → u4)
	Updated table: u1 := u2 → u2 u3 := u2 u4 := u2
return (apply-subst u4) = u2 T1 = u2 return (apply-subst (u1→u2→T1)) = (u2→u2)→u2→u2

For multiple parameters, e.g., f e1 e2, again you can either treat as two function applications, (f e1) e2, or do the obvious shortcut.

if-then-else

if-then-else requires:

The test condition must have type Bool.
Both branches must have the same type.
The type of the whole if-then-else is that same type.

infer env (if cond then e1 else e2):
    (check cond :: Bool)
    Tcond := infer env cond
    unify Tcond Bool

    (find types of e1, e2)
    Te1 := infer env e1
    Te2 := infer env e2

    (but they have to be the same type!)
    unify Te1 Te2
    return (apply-subst Te1)

Algebraic data types (unparameterized)

An algebraic data type definition creates two aspects:

You can make values by data constructors.
You can use values by pattern matching.

Accordingly, it means adding two groups of inference rules. I will illustrate by an example type. Then you extrapolate to other types.

data MI = Non | Has Integer

Making:

infer env Non
    return MI

infer env (Has expr):
    (check expr :: Integer first)
    T := infer env expr
    unify T Integer
    return MI

Using: For simplicity I just cover case-expressions with simple patterns:

case eMI of { Non -> e0 ; Has v -> e1 }

This is like if-then-else on steroid. We require:

eMI must have type MI
In the “Has v” branch, we assume v::Integer when working on e1.
All branches must have the same type.
The type of the whole case-expression is that same type.

So here we go:

infer env (case eMI of { Non → e0 ; Has v → e1 }):
    (check eMI :: MI)
    T := infer env eMI
    unify T MI

    (infer e0)
    T0 := infer env e0

    (infer e1, note that local var v::Integer is in scope)
    T1 := infer (env plus v::Integer) e1

    (but they have to be the same type!)
    unify T0 T1
    return (apply-subst T0)

Algebraic data types (parameterized)

Parameterized algebraic data types are similar, but there are additional unknowns to create for instantiating the type variables. Example:

data Maybe a = Nothing | Just a

The data constructors are polymorphic—you can think of them as:

Nothing :: ∀a. Maybe a
Just :: ∀a. a -> Maybe a

Making:

infer env Nothing
    create new unknown u
    return (Maybe u)

infer env (Just expr)
    T := infer env expr
    return (Maybe T)

Using:

infer env (case eM of { Nothing → e0; Just v → e1 }):
    (check eM :: Maybe ???)
    T := infer env eM
    create new unknown u
    unify T (Maybe u)

    (infer e0)
    T0 := infer env e0

    (infer e1, note local var v::u since eM = Just v :: Maybe u)
    T1 := infer (env plus v::u) e1

    (but they have to be the same type!)
    unify T0 T1
    return (apply-subst T0)

Extrapolate this to standard parameterized types such as lists and tuples.

The 2-tuple type can be understood as

data Tuple2 a b = MkTuple2 a b

except for the syntactic difference between (e1, e2) :: (T1, T2) and MkTuple2 e1 e2 :: Tuple2 T1 T2.

Likewise, with a syntactic change, the list type can be understood as

data List a = Nil | Cons a (List a)

let (single definition, non-recursive)

People expect these to be legal:

let f = \x -> x       -- expect f :: ∀a. a -> a
in (f True, f 4)

let e = []            -- expect e :: ∀a. [a]
in (True : e, 4 : e)

This means when “let” is used for local definitions (and similarly for “where”), we need to generalize an inferred type to a polymorphic type, e.g., \x -> x is normally just u→u with an unknown u, but here it is generalized to ∀a. a → a so f is polymorphic.

infer env (let v = rhs in body)
    Tv := infer env rhs
    Tvg := generalize env Tv
    infer (env plus v::Tvg) body

generalize env T:
    env-updated := like env but all types are rewritten by apply-subst
    (Unknowns in env may have become solved recently,
    so we need env-updated for the following to make sense.)

    for each unknown ui in T:
        if ui doesn’t also appear in env-updated:
            create new type variable, say ai
    return (∀(the ai’s) . (T but replace each ui by ai))

Example of having f = \x -> x and using it two ways:

infer {} (let f = \x → x in (f True, f 4)) T := infer {} (\x → x) … T = u → u Tg := generalize {} (u → u) env-updated := {} create new type variable a for u in (u → u) return (∀a. (u → u but replace u by a)) = ∀a. a → a Tg = ∀a. a → a infer {f :: ∀a. a → a} (f True, f 4)
T1 := infer {f :: ∀a. a → a} (f True) Tf := infer {f :: ∀a. a → a} f (∀a. a → a needs instantiation) create new unknown u2 return (u2 → u2) Tf = (u2 → u2) Te := infer {f :: ∀a. a → a} True return Bool Te = Bool create new unknown u3 unify Tf (Te → u3) unify-intern (u2 → u2) (Bool → u3)
	Updated table: u2 := Bool u3 := Bool
return (apply-subst u3) = Bool T1 = Bool T2 := infer {f :: ∀a. a → a} (f 4) Tf := infer {f :: ∀a. a → a} f create new unknown u4 return (u4 → u4) Tf = (u4 → u4) Te := infer {f :: ∀a. a → a} 4 return Integer Te = Integer create new unknown u5 unify Tf (Te → u5) unify-intern (u4 → u4) (Integer → u5)
	Updated table: u2 := Bool u3 := Bool u4 := Integer u5 := Integer
return (apply-subst u5) = Integer T2 = Integer return (apply-subst (T1, T2)) = (Bool, Integer)

Example of why we don’t generalize unknowns that appear in env-updated:

infer {} ((\y → let v = y in 3 + v) False)

We expect a type error because 3 + False should be wrong. Go through some important events in type inference to see how it should unfold:

Create unknown uy for y.
Under environment {y::uy}, work on v = y, conclude v :: uy. uy is still unsolved (haven’t looked at the False yet).

If we generalized to v :: ∀a. a, then there would be no error:

With v polymorphic, 3 + v :: Integer. But this doesn’t affect uy.
The lambda has type uy → Integer.
Function application unifies uy with Bool (from False), discovers uy = False, but no error.

But we don’t want that! There should be an error somewhere. That can only happen by keeping x :: uy as is:

With v :: uy, 3 + v forces uy := Integer.
The lambda has type Integer → Integer.
Function application unifies Integer (was uy) with Bool (from False), discovers type mismatch, as wanted.

The difference from the f = \x → x example:

In f :: u → u, u is really independent (not involved in the environment).
In v :: uy, uy is involved in y :: uy in the environment, not independent.

It is safe to generalize independent unknowns, but unsafe to generalize unknowns that are still used in the environment.

let (single definition, recursive)

If v = rhs is self-recursive, we need v to be in the environment when inferring rhs. We don’t yet know its type, so just create a new unknown! But later be sure to unify the new unknown with the inferred type of rhs.

infer env (let v = rhs in body)
where the definition is recursive, i.e., rhs contains v
    create new unknown uv for v
    Tv := infer (env plus v::uv) rhs
    unify uv Tv
    Tvg := generalize env Tv (Note: env does not have v::uv.)
    infer (env plus v::Tvg) body

Example: f = \x → x : f x, which we expect to be ∀a. a → [a].

infer {} (let f = \x → x : f x in …) create new unknown uf for f Tf := infer {f :: uf} (\x → x : f x) create new unknown ux for x Tc := infer {x :: ux, f :: uf} (x : f x) Tx := infer {x :: ux, f :: uf} x return ux Tfx := infer {x :: ux, f :: uf} (f x) Tf := infer {x :: ux, f :: uf} f return uf Tx2 := infer {x :: ux, f :: uf} x return ux Tx2 = ux create new unknown ufx unify Tf (Tx2 → ufx) unify-intern uf (ux → ufx)
	Updated table: uf := ux → ufx
return (apply-subst ufx) = ufx Tfx = ufx unify [Tx] Tfx unify-intern [ux] ufx
	Updated table: uf := ux → [ux] ufx = [ux]
return (apply-subset [Tx]) = [ux] Tc = [ux] return (apply-subst (ux → Tc)) = ux → [ux] Tf = ux → [ux] unify uf Tf unify-intern (ux → [ux]) (ux → [ux]) Tfg := generalize {} (ux → [ux]) (See note below for why exclude u::uf in env here.) env-updated := {} create new type variable a for ux return (∀a. a → [a]) Tfg = ∀a. a → [a] infer {f :: ∀a. a → [a]} …

let (multiple definitions, mutually recursive)

By way of example, suppose we have

let f = use g, x
    g = use h, y
    h = use g, z
    x = use y
    y = use z
    z = use x
in body

equivalently, the transposed call graph (e.g., f calls g, draw an edge from g to f), and identify strongly connected components (dotted circles):

It stands to reason that we should do:

First infer x, y, z together.

create unknowns ux, uy, uz for x, y, z
Tx := infer {x :: ux, y :: uy, z :: uz} (use y)
unify ux Tx
Ty := infer {x :: ux, y :: uy, z :: uz} (use z)
unify uy Ty
Tz := infer {x :: ux, y :: uy, z :: uz} (use x)
unify uz Tz
Txg := generalize {} Tx
Tyg := generalize {} Ty
Tzg := generalize {} Tz
Then use that to infer g, h together.

create unknowns ug, uh for g, h
Tg := infer {g :: ug, h :: uh, x :: Txg, y :: Tyg, z :: Tzg} (use h, y)
unify ug Tg
Th := infer {g :: ug, h :: uh, x :: Txg, y :: Tyg, z :: Tzg} (use g, z)
unify uh Th
Tgg := generalize {x :: Txg, y :: Tyg, z :: Tzg} Tg
Thg := generalize {x :: Txg, y :: Tyg, z :: Tzg} Th
Then use the above to infer f.

Tf := infer {g :: Tgg, h :: Thg, x :: Txg, y :: Tyg, z :: Tzg} (use g, x)
Tfg := generalize {g :: Tgg, h :: Thg, x :: Txg, y :: Tyg, z :: Tzg} Tf
Finally can infer body.

infer {f :: Tfg, g :: Tgg, h :: Thg, x :: Txg, y :: Tyg, z :: Tzg} body

In general, identify strongly connected components in the transposed call graph, put them in topological sort order, and infer each component.

Not covered this time

Programmer-provided type signatures. Type classes.

I have more Haskell Notes and Examples