Making Recursion More Systematic

Ad-hoc recursion

We can easily define a recursive function to add all items in a list.

let rec addAll = function
  | [] -> 0
  | x :: xs -> x + addAll xs

val addAll : int list -> int

> addAll [1;2;3];;
val it : int = 6

We can write similar functions to perform other operations that involve some or all items in a list. For example, we can try to compute the list of primes using the Sieve of Eratosthenes. Lets start with some explorations toward this end. The predefined List.filter function comes in handy:

> List.filter ;;
val it : (('a -> bool) -> 'a list -> 'a list) = <fun:clo@45-2>
> List.filter (fun _ -> true) [1;2;3];;
val it : int list = [1; 2; 3]
> List.filter (fun _ -> false) [1;2;3];;
val it : int list = []
> List.filter (fun x -> x > 2) [1;2;3];;
val it : int list = [3]
> List.filter (fun x -> x % 2 <> 0) [1;2;3];;
val it : int list = [1; 3]

Let's successively filter the known primes in ascending order:

> [2 .. 100] |> (List.filter (fun x -> x % 2 <> 0));;
val it : int list =
  [3; 5; 7; 9; 11; 13; 15; 17; 19; 21; 23; 25; 27; 29; 31; 33; 35; 37; 39; 41;
   43; 45; 47; 49; 51; 53; 55; 57; 59; 61; 63; 65; 67; 69; 71; 73; 75; 77; 79;
   81; 83; 85; 87; 89; 91; 93; 95; 97; 99]

> it |> (List.filter (fun x -> x % 3 <> 0));;        
val it : int list =
  [5; 7; 11; 13; 17; 19; 23; 25; 29; 31; 35; 37; 41; 43; 47; 49; 53; 55; 59;
   61; 65; 67; 71; 73; 77; 79; 83; 85; 89; 91; 95; 97]

> it |> (List.filter (fun x -> x % (List.head it) <> 0));;
val it : int list =
  [7; 11; 13; 17; 19; 23; 29; 31; 37; 41; 43; 47; 49; 53; 59; 61; 67; 71; 73;
   77; 79; 83; 89; 91; 97]

Note how an iteration pattern emerges: in each step, we eliminate multiples of the first element of the current sequence. The successive first elements themselves are the primes. So: 

let rec primes = function 
  | [] -> []
  | x::xs -> x :: primes (xs |> (List.filter (fun y -> y % x <> 0)))

> primes [2..100];;
val it : int list =
  [2; 3; 5; 7; 11; 13; 17; 19; 23; 29; 31; 37; 41; 43; 47; 53; 59; 61; 67; 71;
   73; 79; 83; 89; 97]

To see the actual steps, just add a printfn in the right place: 

let rec primes = function 
  | [] -> []
  | x::xs -> x :: primes (xs |> ( printfn "%A" xs; (List.filter (fun y -> y % - x <> 0)))) 

It will now print the remaining candidates after each recursive step. 

Correctness proof: structural induction on the list. 

List-Processing Functions

In the context of the org chart example, we already saw uses of the List.map and List.fold functions. 
  • Map applies a function to each item in a list: map f [x; y; ...; z] -> [f x; f y; ...; f z].
  • Fold (aka foldl or foldLeft) successively applies a function to the partial result accumulated so far and the next list item: fold f a [x; y; ...; z] = (f (... (f (f a x) y) ...) z).
  • There is also foldBack (aka foldr or foldRight), which does the same thing but starting from the back of the list: foldBack f [x; y; ...; z] a = f x (f y (... (f z a))...). This makes a difference unless f is commutative (which implies that both of its arguments are of the same type). When used with lists, foldBack usually does things in the right order but is not tail-recursive, so its recursive call stack might grow very large.
  • Filter, which we already used above, keeps only those items that satisfy the given predicate (boolean function).
Up for a challenge? Try implementing these functions yourself and see if they behave like the predefined ones!

Examples:

> List.map;;
val it : (('a -> 'b) -> 'a list -> 'b list) = <fun:clo@38-1>

> List.map (fun x -> x * x) [1;2;3];;
val it : int list = [1; 4; 9]

> List.map (fun x -> [x; x]) [1;2;3];;
val it : int list list = [[1; 1]; [2; 2]; [3; 3]]

> List.fold;;
val it : (('a -> 'b -> 'a) -> 'a -> 'b list -> 'a) = <fun:clo@45-5>

> List.foldBack;;
val it : (('a -> 'b -> 'b) -> 'a list -> 'b -> 'b) = <fun:clo@39-4>

> List.fold (+) 0 [1;2;3] ;;          
val it : int = 6

> List.foldBack (+) [1;2;3] 0;;
val it : int = 6

> List.fold (-) 0 [1;2;3] ;;    
val it : int = -6

> List.foldBack (-) [1;2;3] 0;;
val it : int = 2

> List.fold (-) 1 [1;2;3] ;;
val it : int = -5

> List.foldBack (-) [1;2;3] 1;; 
val it : int = 1

Systematic Recursion

A good question is whether we can express the sieve from above without explicit recursion. Here is a first attempt:

let primes xs = List.foldBack 
  (fun y -> fun ys -> y :: (ys |> (List.filter (fun z -> z % y <> 0)))) xs [];;

So this works, but unlike the ad-hoc recursive version above, it requires one step for each original list item instead of one step for each resulting prime! That is not so good. (With fold(Left), you would have a harder time building the result list.) 

It turns out that fold is not powerful enough to express the required pattern of recursion. Fold is a factory function for catamorphisms, which break down the data structure step-by-step to produce a result. (I hope we have some classics minors who can help with the Greek here. The last time I studied Greek myself was at this public school founded in 1604 by Jesuits.) 

We will instead define an anamorphism, which builds up a data structure from a seed and a function applied successively. The factory function for anamorphisms is 

let rec unfoldBack g b = match g b with                      
                         | Some(a, b1) -> a :: unfoldBack g b1
                         | _ -> []

val unfoldBack : ('a -> ('b * 'a) option) -> 'a -> 'b list

(Unfortunately, there is a bug in Mono's Option type implementation that affects this example. It will work in Visual Studio and in OCaml after eliminating |> and changing % to mod in primes below.)

Here, b is the seed and g is the step-by-step generator, which uses the predefined Option type to produce a pair consisting of the next item and the next seed. While fold requires you to process each item in the underlying list, unfold recurses over the result of applying function g to the seed, b1. This gives you a lot more power, as we will see below. (Btw, Option is a much cleaner alternative to null references in languages like Scala that have both.)
 
As another challenge, try defining as anamorphisms: a simple range generator, list identity, and map. (The latter two can easily be defined as catamorphisms, too.)

And finally: 

let primes = unfoldBack (function [] -> None | x :: xs -> Some(x, xs |> List.filter (fun y -> y % x <> 0)))

val primes : (int list -> int list)

> primes [2..100];;
val it : int list =
  [2; 3; 5; 7; 11; 13; 17; 19; 23; 29; 31; 37; 41; 43; 47; 53; 59; 61; 67; 71;
   73; 79; 83; 89; 97]

Yay! The beauty is that you get to filter the candidate list before each new iteration, so this behaves exactly like the original ad-hoc recursive 
version. 

Some final thoughts

  • Why not simply use ad-hoc recursion? Because that makes it considerably harder to reason about your code. For general operations such as map, fold, unfold, etc., you can prove general laws, which you can then apply to special cases. Such laws help you with correctness proofs, compiler optimizations, etc. For ad-hoc recursion, you need to come up with your own, specific proof. 
  • Why bother with this formal stuff at all? Because it provides a solid foundation for the higher-level structures we build on top, and because it trains our minds to think in a certain way and develop an intuition for flaws in those structures. In other words, it gives you an edge over those who have not studied this material.
  • Is this approach limited to lists? Not at all. As we will see in a forthcoming handout, this approach can be used on any algebraic data type. Conceptually, for each variant of the type, the fold function needs a corresponding argument; if there are more than two variants, the fold arguments are usually combined into a single structure corresponding to an algebra. (Look here for more theoretical background.)
Comments