~ / track F / deep theory

Hindley-Milner type inference

Advancedtheory

Hindley-Milner (HM) is the type system at the heart of ML, OCaml, Haskell, Elm, and Rust's let-polymorphism. Its superpower: you write zero type annotations, and the compiler infers the most general type that fits. Every expression has a principal type — a type that subsumes every other valid typing. The algorithm that finds it is Algorithm W.

The four ingredients

Piece	What it does
Type variables (`a`, `b`, …)	Unknowns to be solved for
Type constants (`Int`, `Bool`, `String`)	Concrete types
Type constructors (-‌>, `List`, `Maybe`)	Build types from types
Type schemes (`∀a. a -> a`)	Universally-quantified, "let-bound" polymorphic types

A type scheme is a type with universally-quantified variables on the outside. ∀a. a -> a is the type of id. It's more general than Int -> Int because Int is just one of the things a can be.

The algorithm in one paragraph

For each subexpression, generate a fresh type variable. As you walk the expression, emit equations that the typing must satisfy: a function applied to an argument means "the function's input type equals the argument's type." Then unify all the equations — solve for the variables — using Robinson's algorithm. If unification succeeds, you have your principal type. If it fails (you tried to unify Int with Bool), it's a type error.

λx. x       ⇒ fresh α for x;  body has type α;  function: α → α
λx. x + 1   ⇒ fresh α; body forces α = Int;     function: Int → Int
λf. λx. f x ⇒ α → β  for f,  α for x;            function: (α → β) → α → β

That last one is principally polymorphic — no constraints force α or β, so they stay universally quantified.

Let-generalization (the trick that makes it useful)

The real win of HM is l‌e‌t. When you bind let id = λx. x, the type α → α is generalized to the scheme ∀α. α → α. Each use site of id gets a fresh instantiation:

let id = λx. x in
  (id 1,           ;; α := Int
   id "hello")     ;; α := String — fresh, no conflict with the line above

Without generalization, the first use would fix α := Int and the second would fail. With it, id is genuinely polymorphic at each use.

What HM gives up — and how languages bring it back

Pure HM doesn't have:

Higher-rank polymorphism — (∀a. a -> a) -> ... (Haskell adds it via extensions; you need annotations).
Type classes / ad-hoc polymorphism — Haskell extends HM with classes; the resolution is separate from unification.
Subtyping — HM is equational, not inequational; adding subtyping breaks principal types.
Dependent types — types can't depend on values in pure HM.

The languages you actually use are all "HM + something." OCaml adds objects and modules; Haskell adds classes and GADTs; Rust adds lifetimes and trait resolution. The HM core stays as the spine.

Clojure and the inference question

Clojure is dynamically typed — there is no inferred type. But the idea of "principal type of an expression" is still useful as a documentation discipline. core.typed (now sunsetted but instructive) added an HM-style annotated layer; Typed Clojure inferred locally and required annotations at function boundaries.

You can think of any function you write as having a "would-be HM type" sitting just above it:

;; (defn map [f xs] ...)
;; (a → b) → [a] → [b]

That mental model — "if this were ML, what would its principal type be?" — catches many shape bugs before you write the code.

Try it in Clojure: write the type, then the code

A surprisingly effective discipline: write the principal HM type as a comment, then implement to match. Most "I forgot a case" bugs evaporate.

loading sci

press ⌘/Ctrl-↵ or click ▶ run to evaluate

If the function ever forces a type variable into two incompatible shapes, you'll see the bug at runtime in a way that the HM type would have caught statically. Once you start thinking in principal types, you write a lot fewer of those bugs.

Real-world: HM in production

In the wild	Where it shows up
OCaml at Jane Street, Bloomberg, Tezos, Facebook (Hack)	Trading systems, compilers, smart-contract VMs — all leaning on HM to keep large refactors safe
Haskell at GitHub Semantic, Mercury, Tweag, Standard Chartered	Static analyzers, financial systems where a class of bugs is unacceptable
Elm & PureScript	Frontend codebases where "no runtime errors" is a product feature
F# at Walmart, Tesla, Jet	.NET workloads using ML-style inference inside a Microsoft stack
Rust's let-polymorphism	Lifetimes and traits add layers, but the `let x = ...` inference engine is HM at the core
TypeScript local inference	A weaker, subtyping-tolerant cousin — but the "infer the most general type" idea came from HM

The export from HM into mainstream programming is "you can have a real static type system without writing type annotations everywhere" — that single ergonomic win seeded an entire industry of typed languages.

Check yourself

? quiz

Why does let-generalization matter to HM's expressiveness?

Exercise

By hand, infer the principal type of:

λf. λg. λx. f (g x)

(Hint: this is function composition. The answer should match Haskell's (.) :: (b -> c) -> (a -> b) -> a -> c.) Write down each equation as you go.

● status: new