~ / track F / deep theory

Recursion schemes

Advancedtheory

Recursion schemes factor the shape of a recursion out of its operation. Instead of writing yet another l‌o‌o‌p/r‌e‌c‌u‌r that walks a tree, you describe the walk once (the scheme) and pass it a small per-node function. The most famous ones — catamorphism (fold), anamorphism (unfold), hylomorphism, paramorphism — are the family of "ways to eat or grow a structure."

The idea: separate shape from operation

You already use this every day with r‌e‌d‌u‌c‌e:

loading sci

press ⌘/Ctrl-↵ or click ▶ run to evaluate

r‌e‌d‌u‌c‌e knows how to walk a list. You give it the per-step function (+) and the seed (0). The walk is reusable; the operation is specific. That's a catamorphism over a list.

Catamorphism: collapse a structure to a value

A catamorphism (Greek for "downward shape") replaces every constructor in a data structure with a function call. For lists: replace every c‌o‌n‌s with your step function and the empty list with your seed. The result is your r‌e‌d‌u‌c‌e.

For a binary tree of integers:

loading sci

press ⌘/Ctrl-↵ or click ▶ run to evaluate

tree-cata is the scheme: how to walk a tree. The two function arguments specify the operation. Swap them and you get tree-max, tree-depth, tree-to-list, prettyprint — all reusing the same walk.

Anamorphism: grow a structure from a seed

An anamorphism (Greek for "upward shape") is the dual: given a seed and a "step" that says either "stop" or "split into more seeds," produce a structure.

i‌t‌e‌r‌a‌t‌e is half of one over lists — keep applying a function to a seed:

loading sci

press ⌘/Ctrl-↵ or click ▶ run to evaluate

More interesting: build a binary tree of consecutive integers, splitting at each level:

loading sci

press ⌘/Ctrl-↵ or click ▶ run to evaluate

This is an unfold, expressed by hand. A generic ana would take a function deciding "leaf vs branch" and run it for you.

Hylomorphism: unfold then fold, in one pass

A hylomorphism is the composition of an anamorphism followed by a catamorphism — but fused so the intermediate structure is never built. Classic example: factorial as "unfold 1..n into a list, then fold with *":

fact n = cata * 1 (ana ... n)
       = hylo

In Clojure, this is what transduce is doing (see the transducers topic): fusing the production of intermediate values with their consumption, so nothing is ever materialized.

loading sci

press ⌘/Ctrl-↵ or click ▶ run to evaluate

Paramorphism: fold with access to subtrees

A paramorphism is like a catamorphism but the per-step function also sees the original substructure (not just the recursively-computed value). It's the right tool when the step needs both. Example: "tails" of a list —

tails [1 2 3] = [[1 2 3] [2 3] [3] []]

You can't compute tails with a plain fold; each step needs the rest of the list itself. A para has it on hand.

Why this matters

Reusable walks. Once you have tree-cata (or its Functor-based generalization in Haskell), every tree algorithm becomes "supply a small function" instead of "rewrite the recursion."
Termination by construction. A scheme like cata over a finite structure terminates because of its shape. You don't write the base case by hand; the scheme has it.
Optimization potential. Hylomorphisms enable deforestation / stream fusion: producer and consumer combine into one tight loop.
A vocabulary. Recognizing "this is a fold" or "this is an unfold" makes algorithms easier to read, write, and compare.

Real-world: schemes in production

In the wild	The scheme involved
JSON / EDN pretty-printers, syntax highlighters	Catamorphism over the parse tree
AST walkers in compilers (Babel, SWC, ts-morph, tools.analyzer for Clojure)	Catamorphism (collect / transform) + paramorphism (when you need the original node, not just the rewritten one)
HTML rendering pipelines, React reconciliation	Cata over the VDOM tree
Spark's Catalyst optimizer, Apache Calcite	Tree rewrites over relational-algebra trees — explicit `transform` functions ARE the scheme
Database query planners (Postgres, DuckDB)	Para over plan trees: rewrites need to see the original subplan AND its statistics
Stream fusion in GHC, Spark, Flink, Clojure transducers	Hylomorphisms: producer + consumer fused, no intermediate collection materialized
Game-tree search (chess engines, MCTS)	Anamorphism builds the search tree lazily, catamorphism scores it
Generative data tools (QuickCheck shrinking trees)	Ana grows the shrink tree on demand; cata picks the smallest failing case

Whenever you find yourself writing recursive tree-walk functions, ask: "would cata / ana / hylo name this?" Most large compilers and data systems organize their internals around exactly these schemes, often without the academic name.

Check yourself

? quiz

What's the conceptual difference between a catamorphism and a paramorphism?

Exercise

Using your tree-cata above (or rewriting it), implement these two functions without writing explicit recursion in the operation — just supply the leaf and branch functions:

tree-max: return the largest leaf value.
tree-count: count the number of leaves.

loading sci

press ⌘/Ctrl-↵ or click ▶ run to evaluate

● status: new