- rtshkmr's digital garden/
- Readings/
- Books/
- Real World OCaml: Functional Programming for the Masses/
- Chapter 3: Lists and Patterns/
Chapter 3: Lists and Patterns
Table of Contents
Basic notes #
Just some points to revisit the tour that we already had
cons operator
::is right-associative. The attachment of things is consistent with the knowledge that OCaml lists are singly-linked lists.this is why the extension doesn’t modify the existing list, it’s more of that the new element is “prepended” and has a refernce to the rest of the list.
the empty list
[]is polymorphic even though the overall list is NOT.conventional naming:
- follow the naming of
hdandtlfor head and tail of list.
- follow the naming of
Patterns to extract data from list #
match expressions can let-bind new names #
Match expressions can be used to bind new variables. This may lead to problems where we think that we’re reusing a variable within the arm of a match case but it’s actually going to shadow that variable instead.
LIMITATION: In the elixir world, this is why we have the use of the pin operator but in OCaml, there’s no similar construct
(* example of how match expressions are used to bind new variables *)
let rec sum l =
match l with
| [] -> 0
| hd :: tl -> hd + sum tl;;
(* GOTCHA: here, it's not a reference but a new declaration, hence the older variable gets shadowed: *)
let rec drop_value l to_drop =
match l with
| [] -> []
| to_drop :: tl -> drop_value tl to_drop (* this is not a referene, it's a re-declaration of to_drop *)
| hd :: tl -> hd :: drop_value tl to_drop;; (* this case is the same as the 2nd case above.*)
the fix to this is to use guards or if statements, which would give the same function signature:
let rec drop_value l to_drop =
match l with
| [] -> []
| hd :: tl when hd = to_drop -> drop_value tl to_drop
| hd :: tl -> hd :: drop_value tl to_drop
;;
let rec drop_value l to_drop =
match l with
| [] -> []
| hd :: tl ->
let new_tl = drop_value tl to_drop in
if hd = to_drop then new_tl else hd :: new_tl;;
Limitations and Blessings of Pattern Matching #
Pattern matching is structural \(\implies\) we shouldn’t try to encode arbitrary conditions within the pattern. (That’s what the when conditions are for.)
The conditions we use pattern matching for should only be structurally relevant.
blessing: Performance #
matchcases are entries in a jump table, so they are efficient.a good comparison is between the performance for match cases and an equivalent if else statement:
WITNESS: here’s how we can do a quick benchmark using
core_benchmodule(* this gives a jump table. *) let plus_one_match x = match x with | 0 -> 1 | 1 -> 2 | 2 -> 3 | 3 -> 4 | 4 -> 5 | 5 -> 6 | _ -> x + 1;; (* this is sequential, gives a conditional branch tree to navigate *) let plus_one_if x = if x = 0 then 1 else if x = 1 then 2 else if x = 2 then 3 else if x = 3 then 4 else if x = 4 then 5 else if x = 5 then 6 else x + 1;; #require "core_bench";; open Core_bench;; [ Bench.Test.create ~name:"plus_one_match" (fun () -> plus_one_match 10) ; Bench.Test.create ~name:"plus_one_if" (fun () -> plus_one_if 10) ] |> Bench.bench;;Estimated testing time 20s (2 benchmarks x 10s). Change using '-quota'. ┌────────────────┬──────────┬────────────┐ │ Name │ Time/Run │ Percentage │ ├────────────────┼──────────┼────────────┤ │ plus_one_match │ 35.45ns │ 74.59% │ │ plus_one_if │ 47.53ns │ 100.00% │ └────────────────┴──────────┴────────────┘
blessing: Error Detection #
Some examples of what we’ve already seen
- the exhaustiveness of
matchexpression-cases is checked by the compiler - redundancies of cases are also checked
These work by patterned-context, not by some algo that checks the predicates (doesn’t exist).
What we will see soon:
- the exhaustiveness checks in the context of variants and user defined types
- the usefulness of using match statement warnings when we want to chase a chain of refactor-effects – the ball of yarn will just show itself!!
List module #
Similar to elixir, the default modules offer good APIs for us to use.
Something to note in this section is that we will see a bunch of composed funtions such as List.concat_map which composes map and concat. Like in our understanding of functions (which is right-associative) any composed function op x_y means apply y then x
Here’s an example set of code where we:
use
List.mapandList.map2_exn- the
_exnis the OCaml way of doing throwable function calls, like in Elixir where we have strict operations appended with!to the function name.
- the
use
StringfunctionsString.concatwhich are better for large strings. Using the^operator will generate multiple sub-strings first – becomes a problem for largers strings.String.make
use
List.foldwhich is just a more general version of reduce function in OCaml syntaxList.fold walks over the list from left to right, updating the accumulator at each step and returning the final value of the accumulator when it’s done.
(* example usage for summing a list *) List.fold ~init:0 ~f:(+) [1;2;3;4];;Note:
- accumulator need NOT be the same time as the elements of the list
List.fold ~init:[] ~f:(fun acc hd -> hd :: acc) [1;2;3;4];;
- accumulator need NOT be the same time as the elements of the list
The example is one where we try to render a table of values (i.e. a 2D array). So our table rendering function’s code would look something like:
| |
| language | architect | first release |
|----------+----------------+---------------|
| Lisp | John McCarthy | 1958 |
| C | Dennis Ritchie | 1969 |
| ML | Robin Milner | 1973 |
| OCaml | Xavier Leroy | 1996 |
Combining Elements with List.reduce #
It’s a more specific case, of fold, without the init param supplied.
Result is wrapped in an optional.
Filtering with List.filter, List.filter_map #
- the case for filtering then mapping is common enough for
List.filter_map(* list of file extensions from a list of files, returning deduped and in sorted order. *) let extensions filenames = List.filter_map filenames ~f:(fun fname -> match String.rsplit2 ~on:'.' fname with | None | Some ("",_) -> None | Some (_,ext) -> Some ext) |> List.dedup_and_sort ~compare:String.compare;; extensions ["foo.c"; "foo.ml"; "bar.ml"; "bar.mli"];
Partitioning with List.partition_tf #
The _tf is a helpful mnemonic that reminds us that the output is two lists: first list is with the elements where the predicate returns true and the second list is the one for falses.
Multiple ways to combine lists #
We can append (which can be done via List.append or operator @) or we can concat via List.concat (something more like flattening a list of lists).
Since concat and map functions are often used together (same idea as flatmaps), there’s a dedicated List.concat_map for us to use
#require "core_unix.sys_unix";;
module Sys = Core.Sys
module Filename = Core.Filename;;
(* example without concat_map *)
let rec ls_rec s =
if Sys_unix.is_file_exn ~follow_symlinks:true s
then [s]
else
Sys_unix.ls_dir s
|> List.map ~f:(fun sub -> ls_rec (Filename.concat s sub))
|> List.concat;; (*flattens*)
(* example with concat_map *)
let rec ls_rec s =
if Sys_unix.is_file_exn ~follow_symlinks:true s
then [s]
else
Sys_unix.ls_dir s
|> List.concat_map ~f:(fun sub -> ls_rec (Filename.concat s sub));;
val ls_rec : string -> string list = <fun>
Tail-call Optimisation: Writing tail recursive functions #
Classic stack machine used so there’s a limit to the stack frames, but it’s easy to write out tail recursive functions.
(* Not tail-recursive: recursive call wrapped inside addition *)
let rec length = function
| [] -> 0
| _ :: tl -> 1 + length tl (* not tail position *)
;;
(* Tail-recursive helper with accumulator `n` *)
let rec length_plus_n l n =
match l with
| [] -> n
| _ :: tl -> length_plus_n tl (n + 1) (* tail position *)
(* Wrapper function to hide accumulator *)
let length l = length_plus_n l 0
Tail-call optimisation is not new to me but here’s another short rundown:
in
length_plus_n, the recursive call for that function is a tail call \(\implies\) no need to allocate new stack frame- tail-call when the caller of the function doesn’t have to do anything withh the returned value (doesn’t use it) that is returned by the callee, other than to return it. This is why there’s no need to preserve that stack frame (of the caller) \(\implies\) compiler can make this tail call optimisation.
idioms (to transform non-TCO to TCO style):
tail calls = recursive call is the “final action”, no other important logic happens after
accumulator and helper usage:
it’s just a clean way of doing things so that we can isolate out non-tail calls from tail-calls within naive recursive functions \(\implies\) naturally we end up with tail-called helpers
if all calls are tail calls, then we’ve done tail-call optimisation correctly.
Terser / Faster Patterns #
using the as pattern to alias the matched pattern #
Typically if we add in name, a new object for the matching type is created. We can save space by using as to get handle to the matched value.
The as pattern lets you bind a name to the whole matched value in addition to deconstructing it.
let rec remove_sequential_duplicates list =
match list with
| [] | [_] as l -> l (* or-pattern to collapse the handling *)
| first :: (second :: _ as tl) -> (* see the matched object _ is what we have a direct handle to (instead of creating a copy.) *)
if first = second then
remove_sequential_duplicates tl
else
first :: remove_sequential_duplicates tl;;
(* SLOW, BAD EXAMPLE *)
let rec remove_sequential_duplicates list =
match list with
| [] -> []
| [x] -> [x]
| first :: second :: tl ->
if first = second then
remove_sequential_duplicates (second :: tl)
else
first :: remove_sequential_duplicates (second :: tl);;
using when for precondition checks #
continuing with the example, the similarity check was being done via an if-else. We could shift it to a when guard.
let rec remove_sequential_duplicates list =
match list with
| [] | [_] as l -> l
| first :: (second :: _ as tl) when first = second ->
remove_sequential_duplicates tl
| first :: tl -> first :: remove_sequential_duplicates tl;;
polymorphic comparison #
This is going to be so magical.
The idea here is that the Base’s default equality op is integers-only. If we use a polymorphic version, then we will be able to use that function we had for any type.
That’s what Base.Poly helps us do.
open Base.Poly
let rec remove_sequential_duplicates list =
match list with
| [] | [_] as l -> l
| first :: (second :: _ as tl) when first = second ->
remove_sequential_duplicates tl
| first :: tl -> first :: remove_sequential_duplicates tl;;
remove_sequential_duplicates ["one";"two";"two";"two";"three"];;
LIMITATION: you can’t build these functions (polymorphic comparators) on your own. OCaml’s polymorphic comparison functions are built into the runtime to a low level. These comparisons are polymorphic on the basis of ignoring almost everything about the types of the values that are being compared, paying attention only to the structure of the values as they’re laid out in memory.
WITNESS: typically OCaml devs don’t like polymorphic comparison because it can introduce silent bugs that will be hard to catch. Restrictions are a good thing because they give predictability to our code.
NOTE: when guards may harm the compiler’s exhaustiveness check because the pattern matching and logical predicate (for when clause) is being directly used.