m3ga blog

Tue, 11 Jul 2006

Ocaml : Fold.

In a previous post, I blogged about Ocaml's iter and map functions and how they can be applied to arrays and lists. In some circumstances, these functions can be used as a replacement for a for loop. However, there are some other situations where iter and map are can only provide a non-optimal solution. For example, here's a small program which uses Ocaml's imperative features to calculate the sum of the elements of an integer array:

let _ =
    let a = [| 1 ; 2 ; 5 ; 7 ; 11 |] in
    let sum = ref 0 in
    for i = 0 to Array.length a - 1 do
        sum := !sum + a.(i)
    done ;
    Printf.printf "Sum : %d\n" !sum

The value sum is a reference to an integer which is initialized to zero and the referenced sum is then updated inside the for loop. Operating on references is a little different to operating on values; it requires the use of the de-reference operator "!" to access the referenced value and requires the use of the de-reference assignment operator ":=" to update the referenced value.

Like the previous iter and map examples, there are a number of places this can go wrong, even though its only a very small chunk of code. Here's how to use iter to acheive the same result:

let _ =
    let a = [| 1 ; 2 ; 5 ; 7 ; 11 |] in
    let sum = ref 0 in
    Array.iter (fun x -> sum := !sum + x) a ;
    Printf.printf "Sum : %d\n" !sum

bit thats only a small win.

Fortunately, there is a significantly better Higher Order Function solution to this problem, a concept called fold and implemented as functions fold_left and fold_right. The following example program uses both and reduces the for loop in the first example to a single line, including the initialization of the accumulator used to calculate the sum:

let _ =
    let a = [| 1 ; 2 ; 5 ; 7 ; 11 |] in
    let fold_left_sum = Array.fold_left (fun x y -> x + y) 0 a in
    Printf.printf "Fold_left sum  : %d\n" fold_left_sum ;
    let fold_right_sum = Array.fold_right (fun x y -> x + y) a 0 in
    Printf.printf "Fold_right sum : %d\n" fold_right_sum

So lets have a look at a single fold_left:

Array.fold_left (fun x y -> x + y) 0 a

Obviously, the first parameter passed to fold_left is an anonymous function which takes two parameters x and y and returns their sum and the last parameter is simply the array the fold is being applied. The second parameter, 0 in this case, is where all the magic happens. The value 0 is the value that will be passed to the anonymous function as the x parameter, the first time it is called. For subsequent calls, the value of the x parameter will be the return value of the previous call of the anonymous function.

Obviously the easiest way to visualize this is with an example that prints out the values. Here it is:

let _ =
    let a = [| 1 ; 2 ; 5 ; 7 ; 11 |] in
    let fold_left_sum = Array.fold_left
    (   fun x y ->
            Printf.printf "%4d %4d\n" x y ;
            x + y
            )
        0 a
    in
    Printf.printf "\nFold_left sum  : %d\n" fold_left_sum

For those of you too lazy to try this yourselves :-) , here is the output:

   0    1
   1    2
   3    5
   8    7
  15   11

Fold_left sum  : 26

There, just as I explained it. So what about fold_right? Well there are two differences and they are a little subtle so here's the program:

let _ =
    let a = [| 1 ; 2 ; 5 ; 7 ; 11 |] in
    let fold_right_sum = Array.fold_right
    (   fun x y ->
            Printf.printf "%4d %4d\n" x y ;
            x + y
            )
        a 0
    in
    Printf.printf "\nFold_right sum : %d\n" fold_right_sum

and here's the output:

  11    0
   7   11
   5   18
   2   23
   1   25

Fold_right sum : 26

The two differences are:

With fold_left the function is applied from first entry to last entry while with fold_right the function is applied in reverse order.
With fold_left the initializer/previous result is passed as the first parameter to the applied function while with fold_right the initializer/previous result is passed as the second parameter.

Like iter and map, Ocaml's fold functions can reduce the number of points of possible error in a program. More importantly, for the code reader who understands and is comfortable with these techniques, reading and understanding code using these functions is quicker than reading and understanding the equvalient for loop.

Ocaml rocks!

Posted at: 20:43 | Category: CodeHacking/Ocaml | Permalink

Tue, 11 Jul 2006

Ocaml : Fold.

Categories

Syndication

Planets

Links

File Validators