### Tue, 11 Jul 2006

#### Ocaml : Fold.

In a previous post, I blogged about Ocaml's
**iter** and **map**
functions and how they can be applied to arrays and lists.
In some circumstances, these functions can be used as a replacement for a
**for** loop.
However, there are some other situations where **iter** and **map** are
can only provide a non-optimal solution.
For example, here's a small program which uses Ocaml's imperative features to
calculate the sum of the elements of an integer array:

let _ = let a = [| 1 ; 2 ; 5 ; 7 ; 11 |] in let sum = ref 0 in for i = 0 to Array.length a - 1 do sum := !sum + a.(i) done ; Printf.printf "Sum : %d\n" !sum

The value **sum** is a reference to an integer which is initialized to
zero and the referenced sum is then updated inside the **for** loop.
Operating on references is a little different to operating on values; it
requires the use of the de-reference operator **"!"** to access the
referenced value and requires the use of the de-reference assignment
operator **":="** to update the referenced value.

Like the previous **iter** and **map** examples, there are a number
of places this can go wrong, even though its only a very small chunk of
code.
Here's how to use **iter** to acheive the same result:

let _ = let a = [| 1 ; 2 ; 5 ; 7 ; 11 |] in let sum = ref 0 in Array.iter (fun x -> sum := !sum + x) a ; Printf.printf "Sum : %d\n" !sum

bit thats only a small win.

Fortunately, there is a significantly better
Higher Order Function
solution to this problem, a concept called fold and implemented as
functions **fold_left** and **fold_right**.
The following example program uses both and reduces the for loop in
the first example to a single line, including the initialization of the
accumulator used to calculate the sum:

let _ = let a = [| 1 ; 2 ; 5 ; 7 ; 11 |] in let fold_left_sum = Array.fold_left (fun x y -> x + y) 0 a in Printf.printf "Fold_left sum : %d\n" fold_left_sum ; let fold_right_sum = Array.fold_right (fun x y -> x + y) a 0 in Printf.printf "Fold_right sum : %d\n" fold_right_sum

So lets have a look at a single fold_left:

Array.fold_left (fun x y -> x + y) 0 a

Obviously, the first parameter passed to **fold_left** is an anonymous
function which takes two parameters **x** and **y** and returns their
sum and the last parameter is simply the array the fold is being applied.
The second parameter, **0** in this case, is where all the magic happens.
The value **0** is the value that will be passed to the anonymous
function as the **x** parameter, the first time it is called.
For subsequent calls, the value of the **x** parameter will be the return
value of the previous call of the anonymous function.

Obviously the easiest way to visualize this is with an example that prints out the values. Here it is:

let _ = let a = [| 1 ; 2 ; 5 ; 7 ; 11 |] in let fold_left_sum = Array.fold_left ( fun x y -> Printf.printf "%4d %4d\n" x y ; x + y ) 0 a in Printf.printf "\nFold_left sum : %d\n" fold_left_sum

For those of you too lazy to try this yourselves , here is the output:

0 1 1 2 3 5 8 7 15 11 Fold_left sum : 26

There, just as I explained it.
So what about **fold_right**?
Well there are two differences and they are a little subtle so here's
the program:

let _ = let a = [| 1 ; 2 ; 5 ; 7 ; 11 |] in let fold_right_sum = Array.fold_right ( fun x y -> Printf.printf "%4d %4d\n" x y ; x + y ) a 0 in Printf.printf "\nFold_right sum : %d\n" fold_right_sum

and here's the output:

11 0 7 11 5 18 2 23 1 25 Fold_right sum : 26

The two differences are:

- With
**fold_left**the function is applied from first entry to last entry while with**fold_right**the function is applied in reverse order. - With
**fold_left**the initializer/previous result is passed as the first parameter to the applied function while with**fold_right**the initializer/previous result is passed as the second parameter.

Like **iter** and **map**, Ocaml's fold functions can reduce the number
of points of possible error in a program.
More importantly, for the code reader who understands and is comfortable
with these techniques, reading and understanding code using these functions
is quicker than reading and understanding the equvalient for loop.

Ocaml rocks!

Posted at: 20:43 | Category: CodeHacking/Ocaml | Permalink