Topic 4: Binary Search

Canonical Questions #

Key Intuition: we are able to exploit the property of monotonicity in the order of a sequence to our benefit. This monotonicty may be seen in various factors / domains: the input, the output, the intermediate search space, some sort of virtual domain.

Classic Search over Sorted Array #

Objective
- Efficiently find the target element or boundary positions in a sorted array.
In some instances we might be posed with a multi-dimensional input and we should attempt to flatten it in a way that works for us.

Problems #

First/Last Position of Element in Sorted Array (34)

This question asks us to implement a custom bisect_left and bisect_right. The key learning here is that the accuracy matters. Just go slow and be clear in two aspects: a) what the while condition’s predicate function should be and b) how the recursion / boundary adjustment happens.

It’s for this reason that the open-open range boundary via left and right pointers is usually the easier way to write this.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
   class Solution:
       def searchRange(self, nums: List[int], target: int) -> List[int]:
           """
           Objective is to implement bisect_left and bisect_right manually.

           In this solution, we shall use inclusive ranges for the helper functions, which affects how the rest of the recursive logic is implemented.
           """
           def search_left(left, right, target):
               """
               The bounds matter. We shall use left to right inclusive, which changes the while conditional.
               """
               res = -1
               while left <= right:
                   mid = left + (right - left) // 2
                   if nums[mid] >= target: # try searching to the left:
                       if nums[mid] == target:
                           res = mid
                       right = mid - 1
                   else:
                       left = mid + 1

               return res

           def search_right(left, right, target):
               """
               Left and right inclusive ranges, hence the while condition is not strict.
               """
               res = -1
               while left <= right:
                   mid = left + (right - left) // 2
                   if nums[mid] <= target: #  attempt to search right
                       if nums[mid] == target:
                           res = mid
                       left = mid + 1
                   else:
                       right = mid - 1
               return res

           n = len(nums)
           left_res = search_left(0, n - 1, target)
           if left_res == -1:
               return [-1,-1]

           right_res = search_right(0, n - 1, target)
           return [left_res, right_res]

Searching a 2D Matrix (74)

Key Idea is to search by the starting numbers to find the row and to do it again within the row. So dimension-by-dimension. This works in \(O(log m + log n)\) speed since it’s dimension by dimension. This should be the first-reach implementation.

For an improvement, better still, we can treat it as a flattened array. This works in \(O(log(m * n))\) speed

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
    class Solution:
        def searchMatrix(self, matrix: List[List[int]], target: int) -> bool:
            if not matrix or not matrix[0]:
                return False

            m, n = len(matrix), len(matrix[0])
            left, right = 0, ( m * n ) - 1

            while left <= right:
                mid = left + (right - left) // 2
                # TRICK: ⭐️ this is pretty!
                row, col = divmod(mid, n)
                val = matrix[row][col]
                if val == target:
                    return True
                elif val < target:
                    left = mid + 1
                else:
                    right = mid - 1
            return False

Search on Answer / Decision Space #

Objective
- Perform binary search over a numeric answer or parameter space with a monotonic feasibility predicate to find optimal solutions.

Problems #

Koko Eating Bananas (875)

We know that the eating speed is the search space and it’s monotonically increasing so that’s what we have to manually search through because we can’t use bisect function for this.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
    class Solution:
        def minEatingSpeed(self, piles: List[int], h: int) -> int:
            max_pile = max(piles)
            if h == len(piles):
                return max_pile

            eating_time = lambda rate: sum((-(-pile//rate)) for pile in piles)

            left, right = 1, max_pile # NOTE: inclusive ranges

            ans = right # will definitely reach this
            while left <= right:
                mid = left + ((right - left) // 2)
                time = eating_time(mid)
                if (time <= h): # fast enough but we wanna find slower rate
                    ans = mid # save this at least
                    right = mid - 1
                else:
                    left = mid + 1

            return ans

Capacity To Ship Packages Within D Days (1011)

Approach: leverages the binary search over the answer space (capacity) combined with a greedy check for feasibility

NOTE: need to adapt the binary search over the answer space (the ship capacity), but bisect works on sorted arrays, not on a computed range, hence the manually written:

Also to take note of the logical gotcha: there will be at least one trip needed.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
    class Solution:
        def shipWithinDays(self, weights: List[int], days: int) -> int:
            def get_days(capacity):
                num_trips = 1 # GOTCHA (logical): at least one trip needed
                curr_load = 0
                for w in weights:
                    if curr_load + w > capacity:
                        num_trips += 1
                        curr_load = w
                    else:
                        curr_load += w

                return num_trips

            low = min_capacity = max(weights) # else we can't even bring this package over
            high = sum(weights) # then we can just take it all in one go

            ans = high
            # inclusive ranges:
            while low <= high:
                mid = low + (high - low) // 2
                if get_days(mid) <= days:
                    # restrict the search on a smaller space:
                    high = mid - 1
                    ans = mid
                else:
                    low = mid + 1

            return ans

Minimum Time to Activate String (3639)

We search on the answer space and for the predicate we do a complement based, arithmetic calculation that we shall use a complement approach to count. The premise automatically makes us think of this complementary counting approach.

To get the number of valid substrings, we can take the total number of substrings (arithmetic) and then subtract away the total number of invalid substrings (for which we sliding window it and use arithmetic).

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
    from typing import List

    class Solution:
        def minTime(self, s: str, order: List[int], k: int) -> int:
            n = len(s)

            def is_active(t):
                """
                To get the number of valid substrings, we can take the total number of substrings (arithmetic) and then subtract away the total number of invalid substrings (forwhich we sliding window it and use arithmetic).
                """
                stars = set(order[:t + 1])
                total_substrings = n * (n + 1) // 2 # nCr

                count_no_star = 0
                length = 0
                for i in range(n):
                    if i not in stars:
                        length += 1
                    else:
                        count_no_star += length * (length + 1) // 2
                        length = 0 # we start again, hence length == 0
                count_no_star += length * (length + 1) // 2

                return (total_substrings - count_no_star) >= k

            left, right = 0, n
            min_t = -1
            while left <= right:
                mid = left + (right - left) // 2
                if is_active(mid):
                    min_t = mid
                    right = mid - 1
                else:
                    left = mid + 1

            return min_t

Finding Min/Max Satisfying Conditions \(\implies\) usually an Optimisation Problem #

Objective
- Find the minimum or maximum value that satisfies a given monotonic condition, often optimization problems.

Problems #

⭐️⭐️ Searching in Implicit / Virtual Domains #

Objective
- Perform binary search over conceptual or implicit domains not explicitly stored but probed by queries or functions.

Problems #

Median of Two Sorted Arrays (4) (conceptual virtual search)

Median is less about the value, more about the relative ordering of the elements (position-based). This question explores our ability to handle such virtual spaces.

The first observation that helps us that the middle of the two arrays is where we will eventually find our global median. This means we can use that as boundaries and keep searching for the optimal “partition”.

We do this test considering the immediate left and immediate right values of both pointers (in the respective arrays).

This is a lot more intuitive after looking a the solution but is likely something that is hard to see unless we break it down to its primitives.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
    class Solution:
        def findMedianSortedArrays(self, nums1, nums2):
            # Ensure nums1 is the smaller array
            if len(nums1) > len(nums2):
                nums1, nums2 = nums2, nums1

            m, n = len(nums1), len(nums2)
            total = m + n
            half = total // 2

            # we know that the region between the median of nums 1 and median of nums 2 is where it will lie, this is the virtual search space.
            left, right = 0, m # pointers within nums1
            while True:
                i = (left + right) // 2  # Partition within nums1 -- always
                j = half - i             # Partition within nums2 (because len(nums2) >= len(nums1))
                nums_i_left, nums_i_right = nums1[i - 1] if i > 0 else float('-inf'), nums1[i] if i < m else float('inf')
                nums_j_left, nums_j_right = nums2[j - 1] if j > 0 else float('-inf'), nums2[j] if j < n else float('inf')

                # case 1: found the right partition:
                if nums_i_left <= nums_j_right and nums_j_left <= nums_i_right:
                    # case 1A: if even number, then calculate the median using two middle numbers:
                    if total % 2 == 0:
                        return (max(nums_i_left, nums_j_left) + min(nums_i_right, nums_j_right)) / 2
                    else:
                        return min(nums_i_right, nums_j_right)

                elif nums_i_left > nums_j_right: # recurse "left"
                    right = i - 1
                else:
                    left = i + 1

Lower Bound / UpperBound Variants #

Objective
- Find first or last positions in sorted structures (or insertion points) meeting criteria, handling duplicates or boundaries precisely.

Problems #

Searching in Rotated / Modified Arrays / Circular Structures #

Objective
- Adapt binary search to handle arrays that are rotated, partially sorted, or circularly structured.

Problems #

Search in Rotated Sorted Array (33)

Idea here is to realise that we should be finding out which segment is sorted, then making decisions the right on where to recurse.

It’s not that different from the usual, manual binary search implementation.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
    class Solution:
        def search(self, nums: List[int], target: int) -> int:
            """
            We search almost similarly, we just need to identify which segment is the sorted segment and which is the pivoted segment.
            """
            low, high = 0, len(nums) - 1
            while low <= high:
                mid = low + (high - low) // 2
                found_target = nums[mid] == target
                if found_target:
                    return mid

                # determine sorted vs pivoted side:
                if nums[low] <= nums[mid]:
                    # within left sement
                    if nums[low] <= target < nums[mid]:
                        high = mid - 1 # recurse left
                    else:
                        low = mid + 1 # recurse right
                else: # is right segment sorted
                    if nums[mid] < target <= nums[high]:  # is target in the right side:
                        low = mid + 1 # recurse right
                    else:
                        high = mid - 1 # recurse left

            return -1

Find Minimum in Rotated Sorted Array (153)

A rotated array is 2 sorted subarrays, with the minimum being a left-boundary search / pivot point search. Our choice of recursing into which subarray is based on the current window we are looking at so if nums[mid] > nums[right] then the min is to the right of mid and if nums[mid] < nums[right] then the min is at mid or to the left of mid.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
    class Solution:
        def findMin(self, nums: List[int]) -> int:
            left, right = 0, len(nums) - 1
            while left < right:
                mid = left + (right - left) // 2
                # then the min must be in the right segment
                if nums[mid] > nums[right]:
                    left = mid + 1
                # min must be in the left segment
                else:
                    right = mid
            return nums[left]

⭐️ Find minimum in Rotated Sorted Array II (154)

The key idea is that same as usual binary search: which segment do we confidently recurse into? Turns out that we can compare boundaries with the mid: if right is smaller than mid then the min value must be @ right segment, if right is larger than mid, then the min value must come from the left segment and if it’s equal, then we have to pop out one of the duplicates and carry on.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
   class Solution:
       def findMin(self, nums: List[int]) -> int:
           left, right = 0, len(nums) - 1
           ans = float('inf')
           while left < right:
               mid = left + (right - left) // 2
               # can recurse left because left segment is smaller, right is sorted
               if nums[mid] < nums[right]:
                   right = mid
                   # can recurse right because right segment is smaller, min is in the right half, excluding idx = mid
               elif nums[right] < nums[mid]:
                   left = mid + 1
               else: # they are equal
                   # left += 1 # NOTE: can't just remove both, it's going to remove info
                   right -= 1

           return nums[left]

NOTE: the left, right define inclusive ranges above. The while condition having strict inequality is so that we avoid infinite loops (by ensuring that the interval size decreases in each iteration). This is necessary because we may have duplicates which lead to nums[left] = nums[right]=. This means that the while loop terminates when left = right=

Time Based Key Value Store (981)

We just have to keep sorted chained dictionaries (to act as a multiset), so we find the value from within the chain.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
   from collections import defaultdict
   from bisect import bisect_right

   class TimeMap:
       def __init__(self):
           self.key_to_vals = defaultdict(list)

       def set(self, key: str, value: str, timestamp: int) -> None:
           self.key_to_vals[key].append((timestamp, value))

       def get(self, key: str, timestamp: int) -> str:
           vals = self.key_to_vals[key]
           if not vals:
               return ""

           idx = bisect_right(vals, timestamp, key=lambda x: x[0])
           if idx == 0:
               return ""

           return vals[idx-1][1]

⭐️ Bitwise / Digit-level Binary Search #

Objective
- Binary search over bits, digits, or sub-ranges within elements to find values or bounds at more granular levels.

Problems #

Single Element in a Sorted Array (540)

The requirement is to do it in \(O(log n)\) time, so the bit (XOR) hack won’t work, but might as well just see it.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52

   # M1: my approach ( doing a scaled index access )
   class Solution:
       def singleNonDuplicate(self, nums: List[int]) -> int:
           n = len(nums)
           uniq = (n + 1) // 2
           left, right = 0, uniq - 1

           while left <= right:
               middle = left + (right - left) // 2
               middle_idx = middle * 2

               # Edge case: if middle_idx is the last index, it must be the answer
               if middle_idx == n - 1:
                   return nums[middle_idx]

               # Safely check neighbors to identify singular element
               left_neighbor_differs = (middle_idx == 0) or (nums[middle_idx] != nums[middle_idx - 1])
               right_neighbor_differs = (middle_idx == n - 1) or (nums[middle_idx] != nums[middle_idx + 1])

               # If both neighbors differ, found the singular element
               if left_neighbor_differs and right_neighbor_differs:
                   return nums[middle_idx]

               # If current pair is broken (does not match next), search left
               if nums[middle_idx] != nums[middle_idx + 1]:
                   right = middle
               else:
                   left = middle + 1

   # M2: parity based binary search:
   class Solution:
       def singleNonDuplicate(self, nums: List[int]) -> int:
           lo, hi = 0, len(nums) - 1
           while lo < hi:
               mid = (lo + hi) // 2
               # Ensure mid is even for easier pairing
               if mid % 2 == 1:
                   mid -= 1
               if nums[mid] == nums[mid + 1]:
                   lo = mid + 2
               else:
                   hi = mid
           return nums[lo]

   # M3: xor trick ( but not acceptable )
   class Solution:
       def singleNonDuplicate(self, nums):
           result = 0
           for num in nums:
               result = result ^ num  # XOR accumulates all numbers
           return result

Find the Duplicate Number (287)

Technically there’s the binary search method that exploits the Pigeonhole Principle, but this is inferior because it’s slow:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
   class Solution:
       def findDuplicate(self, nums: List[int]) -> int:
           left, right = 1, len(nums) - 1
           while left < right:
               mid = (left + right) // 2
               count = sum(num <= mid for num in nums)
               if count > mid:
                   right = mid
               else:
                   left = mid + 1
           return left
   # _Why does the binary search work?_
   # The "pigeonhole principle": more numbers ≤ mid than mid itself means the duplicate is in that range.

The optimal solution:

Treat the array as a linked list (by going to \(nums[i]\) as the next pointer because of the question’s premise), use Floyd’s Tortoise and Hare for Cycle Detection then Duplicate Detection: move the fast pointer until they meet. Then to find entrance, move the slow pointer and the other pointer from the start until they meet again.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
   class Solution:
       def findDuplicate(self, nums: List[int]) -> int:
           # Phase 1: Find intersection
           slow = fast = nums[0]
           while True:
               slow = nums[slow]
               fast = nums[nums[fast]] # double jump
               if slow == fast:
                   break

           # Phase 2: Find entrance to the cycle
           fast = nums[0] # reset to beginning
           while slow != fast:
               slow = nums[slow]
               fast = nums[fast]
           return slow

Binary Search with Custom Predicates / Conditions #

Objective
- Use customized predicate functions defining feasibility or constraints to guide the search for answers in monotonic settings.

TODO Problems #

Aggressive Cows writeup in leeetcode community (placing cows with minimum distance) seems like it’s actually Magnetic Force Between Two Balls (1552)
Painter Partition Problem

SOE #

when we write out the bisect functions manually, we need to realise that the bounds matter (is it open-close or open-open)?
this affects:
1. the while condition’s predicate function
2. how we do the ‘recursion’ / adjustment of left and right boundaries.
Infinite loop (mid calculation)
- this happens when we don’t maintain the invariant that the search space constricts on every iteration of the while loop.
Wrong boundary updates (low/high)
Returning the wrong type when returning on failure cases (e.g. returning -1 when asked to return "")

Decision Flow #

Sorted input? → Classic binary search
Monotonic predicate? → Binary search on answer
Is problem about first/last occurrence? → Use lower/upper bound variants.
Is problem about optimization or feasibility? → Consider Binary search on answer or parameter space.

Formalised Algos #

Styles, Recipes, Boilerplate #

Basic Boilerplates #

precursor handwritten framework and the bisect versions

NOTE: this boiler plate is pure python that doesn’t use the stdlib values. This is just for cases where we are asked to implement it by hand

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
  def binary_search(nums: List[int], target: int) -> int:
      # set left and right indexes (inclusive indices shown here)
      left, right = 0, len(nums) - 1
      while left <= right: # NOTE: the condition here is for an overrun because they are inclusive indices
          mid = left + (right - left) // 2 # NOTE: this avoids overflows

          if nums[mid] < target:
              left = mid + 1
          elif nums[mid] > target:
              right = mid - 1
          elif nums[mid] == target:
              # found the target value
              return mid
      # target value not found
      return -1

  def left_bound(nums: List[int], target: int) -> int:
      """
      This searches for a left boundary and it also handles the case where we don't have the target in the array.
      """
      # set left and right indexes, both are inclusive indices
      left, right = 0, len(nums) - 1
      while left <= right:
          mid = left + (right - left) // 2
          if nums[mid] < target:
              left = mid + 1
          elif nums[mid] > target:
              right = mid - 1 # because inclusive indices
          elif nums[mid] == target:
              # target exists, narrow the right boundary (since we are searching for the left boundary)
              right = mid - 1
      # determine if the target exists, these are non existing cases:
      if left < 0 or left >= len(nums):
          return -1
      # determine if the left boundary found is the target value
      return left if nums[left] == target else -1

  def right_bound(nums: List[int], target: int) -> int:
      # set left and right indexes
      left, right = 0, len(nums) - 1
      while left <= right:
          mid = left + (right - left) // 2
          if nums[mid] < target:
              left = mid + 1
          elif nums[mid] > target:
              right = mid - 1
          elif nums[mid] == target:
              # target exists, narrow the left boundary because we're still searching for the right boundary
              left = mid + 1
      # determine if the target exists -- these are the out-of-bounds cases to handle.
      if right < 0 or right >= len(nums):
          return -1
      # determine if the right boundary found is the target value
      return right if nums[right] == target else -1

Alternatively, the same 3 things can be done using python’s bisect module:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
  import bisect

  def binary_search(nums: List[int], target: int) -> int:
      # bisect_left to find leftmost occurrence of target
      pos = bisect.bisect_left(nums, target)
      # Verify pos is within range and equals target
      if pos != len(nums) and nums[pos] == target:
          return pos
      return -1

  def left_bound(nums: List[int], target: int) -> int:
      # bisect_left will give leftmost insertion point
      pos = bisect.bisect_left(nums, target)
      if pos != len(nums) and nums[pos] == target:
          return pos
      return -1

  def right_bound(nums: List[int], target: int) -> int:
      # bisect_right will give rightmost insertion point, so right boundary is pos - 1
      pos = bisect.bisect_right(nums, target)
      if pos > 0 and nums[pos - 1] == target:
          return pos - 1
      return -1

Recipes #

Math Recipes on Python: Doing division:
- integer division / floor division: a // b returns an int
- ceiling division: -(-a//b)
  we double negate it, first to carry out a floor division on the negated form, then to negate it back
- normal division: a / b returns a float, preserving the accuracy

TODO KIV #

[ ] Tough canonical types that haven’t been touched on: #

TODO Pending canonical questions: #

finding min/max satisfying conditions this entire category hasn’t been touched before.
1. TODO Split Array Largest Sum (410)
2. TODO Minimum Speed to Arrive on Time (1870)
Lower bound / upperbound variants This category of lower bound / upper bound variants has NOT been addressed yet.
1. TODO Count of Smaller Numbers After Self (315)
Binary Search with Custom Predicates / Conditions:: Aggressive Cows writeup in leeetcode community (placing cows with minimum distance) seems like it’s actually Magnetic Force Between Two Balls (1552)
This canonical type doesn’t have anything that was done prior to this.
Difficult canonicals:
1. More questions under the searching in implicit / virtual domains
2. More questions under the Bitwise Binary Search

More canonicals to explore #

Search in Infinite or Unknown Length Arrays
- Requires boundary doubling or exponential probing before classic binary search.
- Example: Search in a Sorted Array of Unknown Size (Leetcode 702).
Peak Finding (1D and 2D)
- Find a local maximum in an array/matrix; can use binary search due to monotonic property in neighbors.
- Example: Find Peak Element (Leetcode 162), Find a Peak Element II (Leetcode 1901).
Search in Sparse or Special-Format Arrays
- Adaptation for arrays with lots of nulls/blanks, or unknown spacing between entries.
- Example: Sparse Search (Cracking the Coding Interview, not a standard LeetCode problem).
Square Root, Cube Root Approximations
- Binary search for finding floor/ceiling of root values (for large inputs).
- Example: Sqrt(x) (Leetcode 69).
Binary Search on Continuous Real Intervals
- For precision or geometry problems, apply classic search to floating-point intervals.
- Example: Kth Smallest Prime Fraction (Leetcode 786).
Minimum/Maximum Capacity Without Sorting
- Push the bounds further for ship, chamber, core, etc., capacities required under constraints.
- Example: Minimum Number of Days to Make m Bouquets (Leetcode 1482), Minimum Limit of Balls in a Bag (Leetcode 1760).
Binary Search on Trees/Graphs
- Typical when binary search is used in conjunction with tree structures.
- Example: Closest Binary Search Tree Value II (Leetcode 272).
“K-th Element” by Count
- Binary search the answer, counting by predicate or via other methods.
- Example: Kth Smallest Element in a Sorted Matrix (Leetcode 378).
Median of Stream/Sliding Window
- Median finding with binary search combined with sliding windows and data structures.
- Example: Sliding Window Median (Leetcode 480).