Chapter 2: Variables and Functions

Variables #

OCaml treats variables differently from most mainstream languages: a variable is an immutable binding of a name to a value, fixed for the whole of its scope.

A binding can be shadowed in a nested scope, and the outer binding becomes visible again once the nested scope ends. A good mental model is a stack of bindings per name: each redefinition pushes a new binding onto the stack, and leaving the scope pops it, restoring the previous value for the shadowed name.

This is the same behaviour we’re familiar with from the Elixir world.

(* Nested let-bindings are useful *)
let languages = "OCaml,Perl,C++,C";;
let dashed_languages =
  let languages = String.split languages ~on:',' in
  String.concat ~sep:"-" languages;;

(* especially useful for complex calculations, where intermediate values get names we can refer to: *)
let area_of_ring inner_radius outer_radius =
  let pi = Float.pi in
  let area_of_circle r = pi *. r *. r in
  area_of_circle outer_radius -. area_of_circle inner_radius;;
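
To make the stack model concrete, here is a minimal sketch of shadowing. It uses only the standard library, and the names x and y are purely illustrative.

```ocaml
(* Outer binding of x. *)
let x = 1

(* The inner let pushes a new binding for x that is visible only in the
   expression after the keyword in; the outer x is never modified. *)
let y =
  let x = x + 10 in
  x * 2

(* Once the inner scope ends, x refers to the outer binding again. *)
let () = Printf.printf "x = %d, y = %d\n" x y
```

The inner x is 11 only while y is being computed; afterwards x is still 1.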

Pattern Matching and Let #

Nothing fancy here: a let-binding can have a pattern on the left-hand side, which is matched structurally against the value on the right-hand side and binds the names inside the pattern.

Just be careful that the pattern covers every possible case; otherwise use a match expression instead. For example:

let (ints,strings) = List.unzip [(1,"one"); (2,"two"); (3,"three")];;

(* better to use match expression here: *)
let upcase_first_entry line =
  match String.split ~on:',' line with
  | [] -> assert false (* String.split returns at least one element, so we can assert this case *)
  | first :: rest -> String.concat ~sep:"," (String.uppercase first :: rest);;
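
As a rough illustration of the difference between the two cases, the sketch below contrasts a pattern that always matches with one that might not. It assumes the standard library (String.split_on_char, String.uppercase_ascii) instead of the Base functions used above, so it runs without extra libraries.

```ocaml
(* A tuple pattern matches every value of its type, so binding it with a
   plain let is safe. *)
let (n, name) = (1, "one")

(* A list pattern can fail (the list could be empty), so a match
   expression that spells out every case is the safer choice. *)
let upcase_first_entry line =
  match String.split_on_char ',' line with
  | [] -> assert false (* split_on_char never returns an empty list *)
  | first :: rest -> String.concat "," (String.uppercase_ascii first :: rest)
```

Calling upcase_first_entry "one,two,three" should give "ONE,two,three".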

Functions #

Functions are “first-class citizens”: they are ordinary values that can be bound to names, passed as arguments, and returned from other functions.

let AND fun: let-binding function parameters #

Think of the parameter of a function as a variable being bound to the value passed by the caller. This notion is useful for our final understanding of monadic styles.

The following are equivalent statements (almost) in how this name-binding works:

(fun x -> x + 1) 7;;
let x = 7 in x + 1;;
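
A small sketch of both ideas, using only the standard library: the two binding forms produce the same value, and functions can be stored in a list like any other value. The names via_fun, via_let, transforms and results are illustrative only.

```ocaml
(* Applying an anonymous function binds x to 7 for the function body... *)
let via_fun = (fun x -> x + 1) 7

(* ...which is what a let-binding does for the expression after in. *)
let via_let = let x = 7 in x + 1

(* Functions are ordinary values, so they can be stored in a list and
   passed to other functions. *)
let transforms = [ String.uppercase_ascii; String.lowercase_ascii ]
let results = List.map (fun g -> g "Hello World") transforms
```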

Multiargument Functions #

Some ways to write functions with multiple args:

let abs_diff x y = abs (x - y);; (* usual, curried by default *)
let abs_diff = (fun x -> (fun y -> abs(x - y)));; (* explicitly written as a curried function *)
let abs_diff (x, y) = abs (x - y);; (* alternatively written as a single tuple argument, can't be partially applied *)

SPECIALITY: multi-argument functions in OCaml are curried by default, so we may partially apply them.
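
A minimal sketch of what partial application looks like in practice. The names dist_from_3 and abs_diff_pair are introduced here for illustration.

```ocaml
let abs_diff x y = abs (x - y)

(* Supplying only the first argument yields a new function that is
   still waiting for y. *)
let dist_from_3 = abs_diff 3

(* The tupled version takes a single argument (a pair), so both
   components must be given at once. *)
let abs_diff_pair (x, y) = abs (x - y)
```

Here dist_from_3 8 evaluates to 5, the same result as abs_diff_pair (3, 8).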

Recursive Functions #

OCaml’s looping constructs (for and while) are best reserved for imperative-style code. In functional code, recursion is the idiomatic way to express iteration.

We have to explicitly use let rec to define a recursive function in OCaml. The value of having the programmer explicitly mark recursive functions:

  • it flags (mutually) recursive definitions explicitly, which matters because those are harder for humans to reason about

    here’s a mutually recursive definition, for didactic reasons:

      let rec is_even x =
        if x = 0 then true else is_odd (x - 1)
      and is_odd x =
        if x = 0 then false else is_even (x - 1);;
    
  • having non-recursive forms helps us create a new definition that extends and supersedes a recursive form by shadowing it.
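
The second point can be sketched as follows, using a hypothetical fact function: because a plain let is not recursive, the name on its right-hand side still refers to the earlier definition, so the new definition wraps the old one.

```ocaml
(* let rec: the body may refer to fact itself. *)
let rec fact n = if n <= 1 then 1 else n * fact (n - 1)

(* Plain let: fact on the right-hand side is the PREVIOUS definition,
   so this wraps the recursive version instead of calling itself. *)
let fact n = if n < 0 then invalid_arg "fact: negative input" else fact n
```

After the second definition, fact 5 is still 120, while a negative input now raises Invalid_argument.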

Prefix and Infix Operators (and operator overloading) #

  • special set of identifiers: ~ ! $ % & * + - . / : < = > ? @ ^ |

    Here’s an extended set of grammar rules for it, search it properly when you need to override / implement such infix functions. There’s a bunch of precedence and associativity rules to account for as well.

    The first character of the operator symbol determines a bunch of things (operator precedence, left or right associative…). We can group up multiple characters into a single infix operator.

    Some interesting things about user-defined operators:

    • we can easily implement the pipe operator (see below)
    • our choice of operator matters because it would have implications such as the operator being left or right associative that may pose problems to us.
          let (|>) x f = f x;;  (* NOTE: left-associative operator *)
      
          (* example of using pipe operator in OCaml *)
          open Stdio;;
          let path = "/usr/bin:/usr/local/bin:/bin:/sbin:/usr/bin";;
          String.split ~on:':' path
          |> List.dedup_and_sort ~compare:String.compare
          |> List.iter ~f:print_endline;;
      
          (* NOTE: This works well because |> is left-associative, which is why the operator we use matters*)
      
          (* this won't work well because the operator is right-associative *)
          let (^>) x f = f x;;
      
    • operator precedence matters.
  • GOTCHA: operators that contain * need care, because (* and *) delimit comments, so (***) is read as the start of a comment

    The fix is to put spaces inside the parentheses:

      let ( *** ) x y = (x **. y) **. y;; (* this is correct *)
      let (***) x y = (x **. y) **. y;; (* this is NOT correct *)
    
  • GOTCHA: there are some special operators:

    • - : can be written as infix (e.g. Int.max (3-4) 1) and prefix (e.g. Int.max 3 (-4)) versions, with different meanings.

      Also, - has lower precedence than function application, which is why we need the parentheses in the examples above.
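
Both gotchas can be checked with a short sketch. It assumes the standard library, where float exponentiation is written ** (rather than the **. used above) and max is the polymorphic maximum.

```ocaml
(* Spaces inside the parentheses stop the operator name from being read
   as a comment delimiter. *)
let ( *** ) x y = (x ** y) ** y

let sixteen = 2. *** 2.

(* Function application binds tighter than -, so a negative argument
   needs parentheses. *)
let a = max 3 (-4)       (* prefix minus: the second argument is -4 *)
let b = max (3 - 4) 1    (* infix minus: the first argument is 3 - 4 *)

(* Writing max 3 -4 would parse as (max 3) - 4 and fail to type-check. *)
```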

Reverse Application Operator (|>) #

This is mind blowing 🤯.

Since defining our own operators is easy and their scope can be kept small, we can implement the reverse application operator (Elixir’s pipe operator) in one line.

let (|>) x f = f x;;  (* NOTE: left-associative operator *)

(* example of using pipe operator in OCaml *)
open Stdio;;
let path = "/usr/bin:/usr/local/bin:/bin:/sbin:/usr/bin";;
String.split ~on:':' path
|> List.dedup_and_sort ~compare:String.compare
|> List.iter ~f:print_endline;;

(* NOTE: This works well because |> is left-associative, which is why the operator we use matters*)

(* this won't work well because the operator is right-associative *)
let (^>) x f = f x;;
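
To see the associativity difference in action, here is a sketch that runs the same chain through both operators. It assumes the standard library (String.split_on_char, List.sort_uniq) in place of the Base functions above; with_pipe and with_caret are illustrative names.

```ocaml
(* Starts with |, so it is left-associative: x |> f |> g is (x |> f) |> g. *)
let ( |> ) x f = f x

(* Starts with ^, so it is right-associative: x ^> f ^> g is x ^> (f ^> g). *)
let ( ^> ) x f = f x

let path = "/usr/bin:/usr/local/bin:/bin:/sbin:/usr/bin"

let with_pipe =
  String.split_on_char ':' path
  |> List.sort_uniq String.compare
  |> String.concat ","

(* The same chain with ^> only type-checks with explicit parentheses that
   force left-to-right grouping. Without them, the compiler would try to
   pass the sorting function itself to String.concat. *)
let with_caret =
  (String.split_on_char ':' path ^> List.sort_uniq String.compare)
  ^> String.concat ","
```

Both values come out as "/bin,/sbin,/usr/bin,/usr/local/bin"; only the amount of parenthesising differs.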

Application Operator (@@) #

Syntactic sugar for applying several functions in a row: f @@ g @@ h x replaces f (g (h x)). This only works because @@ is right-associative.
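
A minimal sketch comparing the two spellings, using standard library list functions; nested and chained are illustrative names.

```ocaml
(* Nested parentheses... *)
let nested = List.length (List.filter (fun x -> x > 1) (List.map succ [0; 1; 2]))

(* ...versus @@, which groups to the right, so the rightmost application
   is evaluated first and its result is fed leftwards. *)
let chained = List.length @@ List.filter (fun x -> x > 1) @@ List.map succ [0; 1; 2]
```

Both expressions count the elements of [1; 2; 3] that are greater than 1, giving 2.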

more on syntax #

Naturally, there’s some good syntax support for writing functions.

function: syntactic support #

The function keyword lets us pattern match directly on an argument, and it can be combined with the usual parameter syntax. In a multi-argument function, function introduces one more, final parameter and pattern matches on it.

let some_or_default default = function
  | Some x -> x
  | None -> default;;

List.map ~f:(some_or_default 100) [Some 3; None; Some 4];;
(* another example:  *)
(** [contains t x] returns true iff [x] is contained in the
    interval [t] *)
let contains t x =
  match t with
  | Empty -> false
  | Interval (l,h) ->
     Endpoint.compare x l >= 0 && Endpoint.compare x h <= 0
(* converting this to use the syntactic sugar; the matched argument must
   now come last, so the argument order flips to [contains x t]: *)
(** [contains x t] returns true iff [x] is contained in the interval [t] *)
let contains x = function
  | Empty -> false
  | Interval (l, h) ->
    Endpoint.compare x l >= 0 && Endpoint.compare x h <= 0
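
The interval example above depends on Endpoint, Empty and Interval being defined elsewhere. As a self-contained sketch, the version below assumes a simplified, hypothetical interval type with int endpoints, and shows how the argument order changes once function is used.

```ocaml
(* Hypothetical simplified interval type with int endpoints. *)
type interval = Empty | Interval of int * int

(* Explicit match on the first argument t. *)
let contains t x =
  match t with
  | Empty -> false
  | Interval (l, h) -> x >= l && x <= h

(* With function, the matched argument is the implicit LAST one, so the
   interval now comes second: contains_fn x t. *)
let contains_fn x = function
  | Empty -> false
  | Interval (l, h) -> x >= l && x <= h
```

So contains (Interval (1, 5)) 3 and contains_fn 3 (Interval (1, 5)) ask the same question with the arguments swapped.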

labelled args #

Labelled arguments are just named parameters: they are identified by their label rather than by their position.

Labelled arguments support label punning (in line with the field punning that we saw in records).

let ratio ~num ~denom = Float.of_int num /. Float.of_int denom;;

(* punned names: *)
let num = 3 in
let denom = 4 in
ratio ~num ~denom;;
(* equivalent application of the ratio function: *)
ratio ~num:3 ~denom:4;;
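
A short sketch of the same function written against the standard library (float_of_int instead of Base’s Float.of_int), showing that a full application may list the labels in any order. The names r1, r2 and r3 are illustrative.

```ocaml
let ratio ~num ~denom = float_of_int num /. float_of_int denom

(* Labels identify the arguments, so a full application may list them
   in any order. *)
let r1 = ratio ~num:3 ~denom:4
let r2 = ratio ~denom:4 ~num:3

(* Punning: when a variable has the same name as the label, ~num is
   short for ~num:num. *)
let r3 =
  let num = 3 and denom = 4 in
  ratio ~num ~denom
```

All three applications evaluate to 0.75.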
  • Use cases

    To highlight the usefulness of labelling:

    1. explicating long argument lists

      when too many args, easier to remember by label rather than position

    2. when positional arg types are uninformative (and to disambiguate similar arguments)

      sometimes type signature is not sufficient to get the meaning of the arguments. Using labelled signature gives this clarity.

      e.g. val create_hashtable : int -> bool -> ('a,'b) Hashtable.t vs val create_hashtable : init_size:int -> allow_shrinking:bool -> ('a,'b) Hashtable.t

    3. it’s useful for partial application and chaining together operations

      Labels let us partially apply a function by supplying arguments by name, whichever position they occupy. The example below works because List.iter accepts its function as the labelled argument f.

         String.split ~on:':' path
         |> List.dedup_and_sort ~compare:String.compare
         |> List.iter ~f:Stdio.print_endline;;
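
As a further sketch of label-based partial application, the hypothetical pad helper below has two labelled parameters, and we bind only the second one. Everything here is standard library; pad, zero_pad and padded are names invented for the example.

```ocaml
(* Hypothetical helper: left-pad s with the character fill up to width. *)
let pad ~width ~fill s =
  let n = String.length s in
  if n >= width then s else String.make (width - n) fill ^ s

(* Bind only ~fill, which is not the first parameter; the result still
   expects ~width and the string. *)
let zero_pad = pad ~fill:'0'

let padded = List.map (zero_pad ~width:5) [ "42"; "1234567" ]
```

With positional arguments, binding the second parameter first would need a wrapper function; with labels it is a plain partial application.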
      
  • GOTCHA: Higher-order functions must be consistent in the ordering of labelled arguments

    When a labelled function is passed as an argument, the order of its labels must match the order the higher-order function (HOF) expects:

    • suppose HOF: let apply_to_tuple f (first,second) = f ~first ~second;;

      then if we use a labelled function like let divide ~first ~second = first / second;;, everything works well because the ordering of the arguments and the labels are consistent.

    • if the HOF was to swap the ordering: let apply_to_tuple_2 f (first,second) = f ~second ~first;; then the same divide function wouldn’t work.

      this is because the compiler infers the type of the function parameter f from the order in which the labels are used inside the HOF. In the 2nd HOF, f gets a type whose first argument is labelled second and whose second argument is labelled first. divide declares its labelled arguments in the opposite order, so its type does not match.

    • we can inspect the function signatures for our two functions and the problem is clearer:

      • apply_to_tuple has the signature - : (first:'a -> second:'b -> 'c) -> 'a * 'b -> 'c = <fun>

      • apply_to_tuple_2 has the signature - : (second:'a -> first:'b -> 'c) -> 'b * 'a -> 'c = <fun>
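
The mismatch, and one possible workaround, can be sketched as follows. The wrapper lambda is an assumption of this example, not something the notes above prescribe: it simply re-declares the labels in the order apply_to_tuple_2 expects.

```ocaml
let apply_to_tuple f (first, second) = f ~first ~second
let apply_to_tuple_2 f (first, second) = f ~second ~first

let divide ~first ~second = first / second

(* The label order of divide matches what apply_to_tuple expects. *)
let ok = apply_to_tuple divide (12, 4)

(* apply_to_tuple_2 divide (12, 4) is rejected by the type checker.
   A small wrapper with the labels in the expected order adapts it. *)
let adapted =
  apply_to_tuple_2 (fun ~second ~first -> divide ~first ~second) (12, 4)
```

Both ok and adapted compute 12 / 4.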

Optional Arguments #

Good use cases for optional arguments:

  1. Providing sensible defaults for parameters that most callers should not have to think about.
  2. Wrapping an existing API while supplying or forwarding some of its arguments
       let concat ?(sep="") x y = x ^ sep ^ y;;
    
       (* here's how the wrapper can be built: *)
       let uppercase_concat ?sep a b = concat ?sep (String.uppercase a) b;; (* this is a pass-through*)
    
       (* negative example of how to write the wrapper: *)
       let uppercase_concat ?(sep="") a b = concat ~sep (String.uppercase a) b;;
       (* Here the default separator is decided a second time, in the outer function.
          If the default in the inner function (concat) changes, the outer function
          has to be changed to match. *)
    
  3. RULE OF THUMB: avoid optional arguments for functions internal to a module (i.e. functions not defined in the interface)

Optional arguments really only make sense when the extra concision of omitting the argument outweighs the corresponding loss of explicitness.

Problems:

  1. user not knowing that there’s an optional argument
  2. rarely used functions should NOT have optional arguments

When defining a function with optional args, we have some syntactic sugar (where we just provide a default param to the optional) to help us out:

(* Longer way to write the function*)
let concat ?sep x y =
  let sep = match sep with None -> "" | Some s -> s in
  x ^ sep ^ y ;;

(* Terse way: no need to match on the option, we can just provide a default value *)
let concat ?(sep="") x y = x ^ sep ^ y;;

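When passing in an optional argument, we name it explicitly: ~sep:":" passes a plain value (as in concat ~sep:":" "foo" "bar"), while ?sep:(Some ":") passes the underlying option directly.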
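
A minimal sketch of the different call forms, together with the pass-through wrapper from earlier. It assumes the standard library’s String.uppercase_ascii in place of Base’s String.uppercase; the value names are illustrative.

```ocaml
let concat ?(sep = "") x y = x ^ sep ^ y

let plain = concat "foo" "bar"                      (* default separator *)
let labelled = concat ~sep:":" "foo" "bar"          (* pass a plain value *)
let as_option = concat ?sep:(Some ":") "foo" "bar"  (* pass an option *)
let as_none = concat ?sep:None "foo" "bar"          (* None selects the default *)

(* Pass-through wrapper: ?sep forwards the option of the caller unchanged,
   so the default stays defined in one place (concat). *)
let uppercase_concat ?sep a b = concat ?sep (String.uppercase_ascii a) b
```

The ?sep form is what makes the wrapper possible: it hands over the option as-is, whether or not the caller supplied a separator.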

how labelled and optional args are inferred #

  • explicit typing

    in this example:

    let numeric_deriv ~delta ~x ~y ~f =
      let x' = x +. delta in
      let y' = y +. delta in
      let base = f ~x ~y in
      let dx = (f ~x:x' ~y -. base) /. delta in
      let dy = (f ~x ~y:y' -. base) /. delta in
      (dx,dy);;
    (*
    the signature comes out to rely on the order of arguments that's in the code:
    
    val numeric_deriv :
      delta:float ->
      x:float -> y:float -> f:(x:float -> y:float -> float) -> float * float =
      <fun>
    
    *)
    

    It is not obvious how the order of the labelled arguments of f should be chosen, so the compiler needs a heuristic to resolve the ambiguity.

    The heuristic the compiler uses is to prefer labels to options and to choose the order of arguments that shows up in the source code.

    If f is applied with different label orders in different places, the compiler reports an error, but we can still vary the order if we explicitly provide the type of f:

    (* this works, types are explicitly provided *)
    let numeric_deriv ~delta ~x ~y ~(f: x:float -> y:float -> float) =
      let x' = x +. delta in
      let y' = y +. delta in
      let base = f ~x ~y in
      let dx = (f ~y ~x:x' -. base) /. delta in
      let dy = (f ~x ~y:y' -. base) /. delta in
      (dx,dy);;
    
    (* this won't work and will throw an error *)
    let numeric_deriv ~delta ~x ~y ~f =
      let x' = x +. delta in
      let y' = y +. delta in
      let base = f ~x ~y in
      let dx = (f ~y ~x:x' -. base) /. delta in
      let dy = (f ~x ~y:y' -. base) /. delta in
      (dx,dy);;
    (*
    Line 5, characters 15-16:
    Error: This function is applied to arguments
           in an order different from other calls.
           This is only allowed when the real type is known.
    *)
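
To check that the annotated version behaves as expected, here is a sketch that applies it to a hypothetical linear function plane, whose partial derivatives are 3 and 2. The choice of delta and of the test function are assumptions made for this example.

```ocaml
(* The annotation fixes the type of f, so its labelled arguments may be
   passed in either order inside the body. *)
let numeric_deriv ~delta ~x ~y ~(f : x:float -> y:float -> float) =
  let x' = x +. delta in
  let y' = y +. delta in
  let base = f ~x ~y in
  let dx = (f ~y ~x:x' -. base) /. delta in
  let dy = (f ~x ~y:y' -. base) /. delta in
  (dx, dy)

(* Hypothetical test function: a plane with slope 3 in x and 2 in y. *)
let plane ~x ~y = (3. *. x) +. (2. *. y)

let dx, dy = numeric_deriv ~delta:1e-3 ~x:1. ~y:2. ~f:plane
```

Up to floating-point rounding, dx should come out close to 3 and dy close to 2.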
    
  • GOTCHA: partial application with optional arguments

    This part can sometimes be tricky.

    Somewhat consistent with our currying observations, we have to be aware that an optional argument can be ERASED in some cases when we partially apply a function.

    The rule is:

    • an optional argument is erased as soon as the first positional (i.e., neither labelled nor optional) argument defined after the optional argument is passed in.

      say we do let prepend_pound = concat "# "; then signatures A and B below behave differently:

      with signature A the optional argument is erased, while with signature B it is still available.

      This point only applies to partial application of functions that have optional arguments! If we provide all the arguments at once, erasure only happens after every argument has been passed, so the optional argument can appear anywhere in the argument list.

        (* signature A: positionals are all after the optional *)
        let concat ?(sep="") x y = x ^ sep ^ y;;
        (* val concat : ?sep:string -> string -> string -> string = <fun> *)
        let prepend_pound = concat "# ";;
        (* val prepend_pound : string -> string = <fun> *)
      
        (* Signature B: positional before optional *)
        let concat x ?(sep="") y = x ^ sep ^ y;;
        (* val concat : string -> ?sep:string -> string -> string = <fun> *)
        let prepend_pound = concat "# ";;
        (* val prepend_pound : ?sep:string -> string -> string = <fun> *)
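
The two signatures can be compared side by side in one runnable sketch. The suffixed names (concat_a, concat_b, prepend_a, prepend_b) are introduced here only so that both versions can coexist in a single file.

```ocaml
(* Signature A: both positional arguments come after ?sep. *)
let concat_a ?(sep = "") x y = x ^ sep ^ y
let prepend_a = concat_a "# "    (* ?sep is erased here *)

(* Signature B: ?sep sits between the positional arguments. *)
let concat_b x ?(sep = "") y = x ^ sep ^ y
let prepend_b = concat_b "# "    (* ?sep is still available *)

let a = prepend_a "a BASH comment"
let b = prepend_b ~sep:"--- " "a BASH comment"

(* In a full application the optional argument may appear anywhere. *)
let full = concat_a "a" "b" ~sep:"="
```

Trying prepend_a ~sep:"--- " "a BASH comment" would be a type error, because prepend_a no longer accepts ~sep.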