~ / track E / deep theory

Recursion schemes

Advanced

Recursion schemes factor the shape of a recursion out of its operation. Instead of writing yet another loop/recur that walks a tree, you describe the walk once (the scheme) and pass it a small per-node function. The most famous ones — catamorphism (fold), anamorphism (unfold), hylomorphism, paramorphism — are the family of "ways to eat or grow a structure."

The idea: separate shape from operation

You already use this every day with reduce:

loading sci
press ⌘/Ctrl-↵ or click ▶ run to evaluate

reduce knows how to walk a list. You give it the per-step function (+) and the seed (0). The walk is reusable; the operation is specific. That's a catamorphism over a list.

Catamorphism: collapse a structure to a value

A catamorphism (Greek for "downward shape") replaces every constructor in a data structure with a function call. For lists: replace every cons with your step function and the empty list with your seed. The result is your reduce.

For a binary tree of integers:

loading sci
press ⌘/Ctrl-↵ or click ▶ run to evaluate

tree-cata is the scheme: how to walk a tree. The two function arguments specify the operation. Swap them and you get tree-max, tree-depth, tree-to-list, prettyprint — all reusing the same walk.

Anamorphism: grow a structure from a seed

An anamorphism (Greek for "upward shape") is the dual: given a seed and a "step" that says either "stop" or "split into more seeds," produce a structure.

iterate is half of one over lists — keep applying a function to a seed:

loading sci
press ⌘/Ctrl-↵ or click ▶ run to evaluate

More interesting: build a binary tree of consecutive integers, splitting at each level:

loading sci
press ⌘/Ctrl-↵ or click ▶ run to evaluate

This is an unfold, expressed by hand. A generic ana would take a function deciding "leaf vs branch" and run it for you.

Hylomorphism: unfold then fold, in one pass

A hylomorphism is the composition of an anamorphism followed by a catamorphism — but fused so the intermediate structure is never built. Classic example: factorial as "unfold 1..n into a list, then fold with *":

fact n = cata * 1 (ana ... n)
       = hylo

In Clojure, this is what transduce is doing (see the transducers topic): fusing the production of intermediate values with their consumption, so nothing is ever materialized.

loading sci
press ⌘/Ctrl-↵ or click ▶ run to evaluate

Paramorphism: fold with access to subtrees

A paramorphism is like a catamorphism but the per-step function also sees the original substructure (not just the recursively-computed value). It's the right tool when the step needs both. Example: "tails" of a list —

tails [1 2 3] = [[1 2 3] [2 3] [3] []]

You can't compute tails with a plain fold; each step needs the rest of the list itself. A para has it on hand.

Why this matters

  • Reusable walks. Once you have tree-cata (or its Functor-based generalization in Haskell), every tree algorithm becomes "supply a small function" instead of "rewrite the recursion."
  • Termination by construction. A scheme like cata over a finite structure terminates because of its shape. You don't write the base case by hand; the scheme has it.
  • Optimization potential. Hylomorphisms enable deforestation / stream fusion: producer and consumer combine into one tight loop.
  • A vocabulary. Recognizing "this is a fold" or "this is an unfold" makes algorithms easier to read, write, and compare.

Real-world: schemes in production

In the wildThe scheme involved
JSON / EDN pretty-printers, syntax highlightersCatamorphism over the parse tree
AST walkers in compilers (Babel, SWC, ts-morph, tools.analyzer for Clojure)Catamorphism (collect / transform) + paramorphism (when you need the original node, not just the rewritten one)
HTML rendering pipelines, React reconciliationCata over the VDOM tree
Spark's Catalyst optimizer, Apache CalciteTree rewrites over relational-algebra trees — explicit transform functions ARE the scheme
Database query planners (Postgres, DuckDB)Para over plan trees: rewrites need to see the original subplan AND its statistics
Stream fusion in GHC, Spark, Flink, Clojure transducersHylomorphisms: producer + consumer fused, no intermediate collection materialized
Game-tree search (chess engines, MCTS)Anamorphism builds the search tree lazily, catamorphism scores it
Generative data tools (QuickCheck shrinking trees)Ana grows the shrink tree on demand; cata picks the smallest failing case

Whenever you find yourself writing recursive tree-walk functions, ask: "would cata / ana / hylo name this?" Most large compilers and data systems organize their internals around exactly these schemes, often without the academic name.

Check yourself

? quiz

What's the conceptual difference between a catamorphism and a paramorphism?

Exercise

Using your tree-cata above (or rewriting it), implement these two functions without writing explicit recursion in the operation — just supply the leaf and branch functions:

  1. tree-max: return the largest leaf value.
  2. tree-count: count the number of leaves.
loading sci
press ⌘/Ctrl-↵ or click ▶ run to evaluate
 status: new