~ / track H / applied patterns

The expression problem

Advanced

Phil Wadler coined this in 1998: can you extend a program along two axes — new data variants AND new operations — without modifying existing code, and without losing static type safety? It's a deep tension between OO and FP worldviews, and the cure (or workaround) is a story about polymorphism.

The two natural answers, each broken

FP / pattern matching. Sum types make adding a new operation trivial — write a new function that pattern matches. But adding a new variant means editing every existing function:

data Expr = Lit Int | Add Expr Expr

eval (Lit n)     = n
eval (Add a b)   = eval a + eval b
showE (Lit n)    = show n
showE (Add a b)  = showE a ++ " + " ++ showE b

;; New variant? Must edit eval AND showE AND every other function.

OO / classes. Subtyping makes adding a new variant trivial — write a new subclass. But adding a new operation means editing every existing class:

class Expr { abstract int eval(); abstract String show(); }
class Lit extends Expr { ... }
class Add extends Expr { ... }

// New operation? Must edit Expr AND every subclass.

Each side is open along one axis and closed along the other. The expression problem asks: can we be open along both?

Visualized as a table

              eval    show    optimize    typecheck
Lit          [✓]      [✓]      [ ]         [ ]
Add          [✓]      [✓]      [ ]         [ ]
Mul          [ ]      [ ]      [ ]         [ ]
Var          [ ]      [ ]      [ ]         [ ]
  • FP fills the table column by column: adding a column (operation) is easy; adding a row (variant) means touching every column.
  • OO fills the table row by row: adding a row is easy; adding a column means touching every row.

The expression problem: can we fill the table cell by cell, in any order, with no modification of existing cells?

Solutions

ApproachWhere
Type classes (open-world)Haskell — add an instance for a new type without touching the type; add a class for a new operation without touching existing instances
MultimethodsCLOS, Clojure — dispatch is fully external; both axes open
ProtocolsClojure — like type classes; you can extend-protocol to existing types
Object algebras (Oliveira & Cook 2012)OO with abstract factories — clever but heavy
Tagless finalHaskell, Scala — see [[monad-transformers]]
Visitor patternOO workaround — but requires modifying the visitor interface to add a new variant; only half a solution

Clojure's protocols: a clean example

loading sci
press ⌘/Ctrl-↵ or click ▶ run to evaluate

Both axes are open. Adding a row OR a column requires only new code, never modification of old code. That's the goal.

Multimethods: even more open

If protocol dispatch (single dispatch on the first argument's type) isn't enough, multimethods dispatch on an arbitrary function of all arguments:

(defmulti collide (fn [a b] [(:tag a) (:tag b)]))
(defmethod collide [:ship :ship] [_ _] ...)
(defmethod collide [:ship :asteroid] [_ _] ...)

Now you can add a new dispatch case for [:ship :wormhole] without touching any existing method. The double-dispatch expression problem (binary operations) goes away.

Why this matters in practice

  • Big enterprise codebases evolve along both axes. You add a new entity type AND new behavior on existing entities, constantly. Languages that only support one axis force ugly workarounds (Visitor pattern, type switches, conditional logic).
  • Libraries you don't own. With type classes / protocols, you can extend somebody else's type with your own behavior — without forking. With closed mechanisms (subtyping), you can't.
  • The "Christmas tree of if isinstance(x, Foo)" anti-pattern in mainstream OO is the symptom of the expression problem unsolved.

Connection to [[polymorphism-taxonomy]]

  • Subtyping is the OO axis-open mechanism — open to new variants only.
  • Ad-hoc polymorphism (type classes, protocols, multimethods) is the FP axis-open mechanism — open to new operations and (with extend-protocol) new variants.

The lesson: ad-hoc polymorphism is more powerful than subtype polymorphism for evolving software, despite OO orthodoxy's emphasis on inheritance.

Check yourself

? quiz

Why does pattern-matching on a sum type FAIL the expression problem?

Exercise

Take a tiny expression language Expr = Lit Int | Add Expr Expr with one operation eval. Add (1) a new operation pretty-print and (2) a new variant Mul Expr Expr. Do it in two languages:

  • A language with subtype polymorphism (Java/C# — class per variant).
  • A language with ad-hoc polymorphism (Clojure protocols, Haskell type classes).

Note which language forces you to edit existing code and which only requires adding new code.

 status: new