Topic 4: Binary Search
Table of Contents
Canonical Questions #
- Key Intuition: we are able to exploit the property of monotonicity in the order of a sequence to our benefit. This monotonicty may be seen in various factors / domains: the input, the output, the intermediate search space, some sort of virtual domain.
Classic Search over Sorted Array #
- Objective
- Efficiently find the target element or boundary positions in a sorted array.
- In some instances we might be posed with a multi-dimensional input and we should attempt to flatten it in a way that works for us.
Problems #
First/Last Position of Element in Sorted Array (34)
This question asks us to implement a custom
bisect_leftandbisect_right. The key learning here is that the accuracy matters. Just go slow and be clear in two aspects: a) what the while condition’s predicate function should be and b) how the recursion / boundary adjustment happens.It’s for this reason that the open-open range boundary via left and right pointers is usually the easier way to write this.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45class Solution: def searchRange(self, nums: List[int], target: int) -> List[int]: """ Objective is to implement bisect_left and bisect_right manually. In this solution, we shall use inclusive ranges for the helper functions, which affects how the rest of the recursive logic is implemented. """ def search_left(left, right, target): """ The bounds matter. We shall use left to right inclusive, which changes the while conditional. """ res = -1 while left <= right: mid = left + (right - left) // 2 if nums[mid] >= target: # try searching to the left: if nums[mid] == target: res = mid right = mid - 1 else: left = mid + 1 return res def search_right(left, right, target): """ Left and right inclusive ranges, hence the while condition is not strict. """ res = -1 while left <= right: mid = left + (right - left) // 2 if nums[mid] <= target: # attempt to search right if nums[mid] == target: res = mid left = mid + 1 else: right = mid - 1 return res n = len(nums) left_res = search_left(0, n - 1, target) if left_res == -1: return [-1,-1] right_res = search_right(0, n - 1, target) return [left_res, right_res]Key Idea is to search by the starting numbers to find the row and to do it again within the row. So dimension-by-dimension. This works in \(O(log m + log n)\) speed since it’s dimension by dimension. This should be the first-reach implementation.
For an improvement, better still, we can treat it as a flattened array. This works in \(O(log(m * n))\) speed
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20class Solution: def searchMatrix(self, matrix: List[List[int]], target: int) -> bool: if not matrix or not matrix[0]: return False m, n = len(matrix), len(matrix[0]) left, right = 0, ( m * n ) - 1 while left <= right: mid = left + (right - left) // 2 # TRICK: ⭐️ this is pretty! row, col = divmod(mid, n) val = matrix[row][col] if val == target: return True elif val < target: left = mid + 1 else: right = mid - 1 return False
Search on Answer / Decision Space #
- Objective
- Perform binary search over a numeric answer or parameter space with a monotonic feasibility predicate to find optimal solutions.
Problems #
We know that the eating speed is the search space and it’s monotonically increasing so that’s what we have to manually search through because we can’t use bisect function for this.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21class Solution: def minEatingSpeed(self, piles: List[int], h: int) -> int: max_pile = max(piles) if h == len(piles): return max_pile eating_time = lambda rate: sum((-(-pile//rate)) for pile in piles) left, right = 1, max_pile # NOTE: inclusive ranges ans = right # will definitely reach this while left <= right: mid = left + ((right - left) // 2) time = eating_time(mid) if (time <= h): # fast enough but we wanna find slower rate ans = mid # save this at least right = mid - 1 else: left = mid + 1 return ansCapacity To Ship Packages Within D Days (1011)
Approach: leverages the binary search over the answer space (capacity) combined with a greedy check for feasibility
NOTE: need to adapt the binary search over the answer space (the ship capacity), but bisect works on sorted arrays, not on a computed range, hence the manually written:
Also to take note of the logical gotcha: there will be at least one trip needed.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29class Solution: def shipWithinDays(self, weights: List[int], days: int) -> int: def get_days(capacity): num_trips = 1 # GOTCHA (logical): at least one trip needed curr_load = 0 for w in weights: if curr_load + w > capacity: num_trips += 1 curr_load = w else: curr_load += w return num_trips low = min_capacity = max(weights) # else we can't even bring this package over high = sum(weights) # then we can just take it all in one go ans = high # inclusive ranges: while low <= high: mid = low + (high - low) // 2 if get_days(mid) <= days: # restrict the search on a smaller space: high = mid - 1 ans = mid else: low = mid + 1 return ansMinimum Time to Activate String (3639)
We search on the answer space and for the predicate we do a complement based, arithmetic calculation that we shall use a complement approach to count. The premise automatically makes us think of this complementary counting approach.
To get the number of valid substrings, we can take the total number of substrings (arithmetic) and then subtract away the total number of invalid substrings (for which we sliding window it and use arithmetic).
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36from typing import List class Solution: def minTime(self, s: str, order: List[int], k: int) -> int: n = len(s) def is_active(t): """ To get the number of valid substrings, we can take the total number of substrings (arithmetic) and then subtract away the total number of invalid substrings (forwhich we sliding window it and use arithmetic). """ stars = set(order[:t + 1]) total_substrings = n * (n + 1) // 2 # nCr count_no_star = 0 length = 0 for i in range(n): if i not in stars: length += 1 else: count_no_star += length * (length + 1) // 2 length = 0 # we start again, hence length == 0 count_no_star += length * (length + 1) // 2 return (total_substrings - count_no_star) >= k left, right = 0, n min_t = -1 while left <= right: mid = left + (right - left) // 2 if is_active(mid): min_t = mid right = mid - 1 else: left = mid + 1 return min_t
Finding Min/Max Satisfying Conditions \(\implies\) usually an Optimisation Problem #
- Objective
- Find the minimum or maximum value that satisfies a given monotonic condition, often optimization problems.
Problems #
⭐️⭐️ Searching in Implicit / Virtual Domains #
- Objective
- Perform binary search over conceptual or implicit domains not explicitly stored but probed by queries or functions.
Problems #
Median of Two Sorted Arrays (4) (conceptual virtual search)
Median is less about the value, more about the relative ordering of the elements (position-based). This question explores our ability to handle such virtual spaces.
The first observation that helps us that the middle of the two arrays is where we will eventually find our global median. This means we can use that as boundaries and keep searching for the optimal “partition”.
We do this test considering the immediate left and immediate right values of both pointers (in the respective arrays).
This is a lot more intuitive after looking a the solution but is likely something that is hard to see unless we break it down to its primitives.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30class Solution: def findMedianSortedArrays(self, nums1, nums2): # Ensure nums1 is the smaller array if len(nums1) > len(nums2): nums1, nums2 = nums2, nums1 m, n = len(nums1), len(nums2) total = m + n half = total // 2 # we know that the region between the median of nums 1 and median of nums 2 is where it will lie, this is the virtual search space. left, right = 0, m # pointers within nums1 while True: i = (left + right) // 2 # Partition within nums1 -- always j = half - i # Partition within nums2 (because len(nums2) >= len(nums1)) nums_i_left, nums_i_right = nums1[i - 1] if i > 0 else float('-inf'), nums1[i] if i < m else float('inf') nums_j_left, nums_j_right = nums2[j - 1] if j > 0 else float('-inf'), nums2[j] if j < n else float('inf') # case 1: found the right partition: if nums_i_left <= nums_j_right and nums_j_left <= nums_i_right: # case 1A: if even number, then calculate the median using two middle numbers: if total % 2 == 0: return (max(nums_i_left, nums_j_left) + min(nums_i_right, nums_j_right)) / 2 else: return min(nums_i_right, nums_j_right) elif nums_i_left > nums_j_right: # recurse "left" right = i - 1 else: left = i + 1
Lower Bound / UpperBound Variants #
- Objective
- Find first or last positions in sorted structures (or insertion points) meeting criteria, handling duplicates or boundaries precisely.
Problems #
Searching in Rotated / Modified Arrays / Circular Structures #
- Objective
- Adapt binary search to handle arrays that are rotated, partially sorted, or circularly structured.
Problems #
Search in Rotated Sorted Array (33)
Idea here is to realise that we should be finding out which segment is sorted, then making decisions the right on where to recurse.
It’s not that different from the usual, manual binary search implementation.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26class Solution: def search(self, nums: List[int], target: int) -> int: """ We search almost similarly, we just need to identify which segment is the sorted segment and which is the pivoted segment. """ low, high = 0, len(nums) - 1 while low <= high: mid = low + (high - low) // 2 found_target = nums[mid] == target if found_target: return mid # determine sorted vs pivoted side: if nums[low] <= nums[mid]: # within left sement if nums[low] <= target < nums[mid]: high = mid - 1 # recurse left else: low = mid + 1 # recurse right else: # is right segment sorted if nums[mid] < target <= nums[high]: # is target in the right side: low = mid + 1 # recurse right else: high = mid - 1 # recurse left return -1Find Minimum in Rotated Sorted Array (153)
A rotated array is 2 sorted subarrays, with the minimum being a left-boundary search / pivot point search. Our choice of recursing into which subarray is based on the current window we are looking at so if
nums[mid] > nums[right]then the min is to the right of mid and ifnums[mid] < nums[right]then the min is atmidor to the left ofmid.1 2 3 4 5 6 7 8 9 10 11 12class Solution: def findMin(self, nums: List[int]) -> int: left, right = 0, len(nums) - 1 while left < right: mid = left + (right - left) // 2 # then the min must be in the right segment if nums[mid] > nums[right]: left = mid + 1 # min must be in the left segment else: right = mid return nums[left]⭐️ Find minimum in Rotated Sorted Array II (154)
The key idea is that same as usual binary search: which segment do we confidently recurse into? Turns out that we can compare boundaries with the mid: if right is smaller than mid then the min value must be @ right segment, if right is larger than mid, then the min value must come from the left segment and if it’s equal, then we have to pop out one of the duplicates and carry on.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17class Solution: def findMin(self, nums: List[int]) -> int: left, right = 0, len(nums) - 1 ans = float('inf') while left < right: mid = left + (right - left) // 2 # can recurse left because left segment is smaller, right is sorted if nums[mid] < nums[right]: right = mid # can recurse right because right segment is smaller, min is in the right half, excluding idx = mid elif nums[right] < nums[mid]: left = mid + 1 else: # they are equal # left += 1 # NOTE: can't just remove both, it's going to remove info right -= 1 return nums[left]NOTE: the
left,rightdefine inclusive ranges above. The while condition having strict inequality is so that we avoid infinite loops (by ensuring that the interval size decreases in each iteration). This is necessary because we may have duplicates which lead tonums[left] =nums[right]=. This means that the while loop terminates whenleft =right=Time Based Key Value Store (981)
We just have to keep sorted chained dictionaries (to act as a multiset), so we find the value from within the chain.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20from collections import defaultdict from bisect import bisect_right class TimeMap: def __init__(self): self.key_to_vals = defaultdict(list) def set(self, key: str, value: str, timestamp: int) -> None: self.key_to_vals[key].append((timestamp, value)) def get(self, key: str, timestamp: int) -> str: vals = self.key_to_vals[key] if not vals: return "" idx = bisect_right(vals, timestamp, key=lambda x: x[0]) if idx == 0: return "" return vals[idx-1][1]
⭐️ Bitwise / Digit-level Binary Search #
- Objective
- Binary search over bits, digits, or sub-ranges within elements to find values or bounds at more granular levels.
Problems #
Single Element in a Sorted Array (540)
The requirement is to do it in \(O(log n)\) time, so the bit (XOR) hack won’t work, but might as well just see it.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52# M1: my approach ( doing a scaled index access ) class Solution: def singleNonDuplicate(self, nums: List[int]) -> int: n = len(nums) uniq = (n + 1) // 2 left, right = 0, uniq - 1 while left <= right: middle = left + (right - left) // 2 middle_idx = middle * 2 # Edge case: if middle_idx is the last index, it must be the answer if middle_idx == n - 1: return nums[middle_idx] # Safely check neighbors to identify singular element left_neighbor_differs = (middle_idx == 0) or (nums[middle_idx] != nums[middle_idx - 1]) right_neighbor_differs = (middle_idx == n - 1) or (nums[middle_idx] != nums[middle_idx + 1]) # If both neighbors differ, found the singular element if left_neighbor_differs and right_neighbor_differs: return nums[middle_idx] # If current pair is broken (does not match next), search left if nums[middle_idx] != nums[middle_idx + 1]: right = middle else: left = middle + 1 # M2: parity based binary search: class Solution: def singleNonDuplicate(self, nums: List[int]) -> int: lo, hi = 0, len(nums) - 1 while lo < hi: mid = (lo + hi) // 2 # Ensure mid is even for easier pairing if mid % 2 == 1: mid -= 1 if nums[mid] == nums[mid + 1]: lo = mid + 2 else: hi = mid return nums[lo] # M3: xor trick ( but not acceptable ) class Solution: def singleNonDuplicate(self, nums): result = 0 for num in nums: result = result ^ num # XOR accumulates all numbers return resultFind the Duplicate Number (287)
Technically there’s the binary search method that exploits the Pigeonhole Principle, but this is inferior because it’s slow:
1 2 3 4 5 6 7 8 9 10 11 12 13class Solution: def findDuplicate(self, nums: List[int]) -> int: left, right = 1, len(nums) - 1 while left < right: mid = (left + right) // 2 count = sum(num <= mid for num in nums) if count > mid: right = mid else: left = mid + 1 return left # _Why does the binary search work?_ # The "pigeonhole principle": more numbers ≤ mid than mid itself means the duplicate is in that range.The optimal solution:
Treat the array as a linked list (by going to \(nums[i]\) as the next pointer because of the question’s premise), use Floyd’s Tortoise and Hare for Cycle Detection then Duplicate Detection: move the fast pointer until they meet. Then to find entrance, move the slow pointer and the other pointer from the start until they meet again.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16class Solution: def findDuplicate(self, nums: List[int]) -> int: # Phase 1: Find intersection slow = fast = nums[0] while True: slow = nums[slow] fast = nums[nums[fast]] # double jump if slow == fast: break # Phase 2: Find entrance to the cycle fast = nums[0] # reset to beginning while slow != fast: slow = nums[slow] fast = nums[fast] return slow
Binary Search with Custom Predicates / Conditions #
- Objective
- Use customized predicate functions defining feasibility or constraints to guide the search for answers in monotonic settings.
TODO Problems #
Aggressive Cows writeup in leeetcode community (placing cows with minimum distance) seems like it’s actually Magnetic Force Between Two Balls (1552)
Painter Partition Problem
SOE #
when we write out the bisect functions manually, we need to realise that the bounds matter (is it open-close or open-open)?
this affects:
- the while condition’s predicate function
- how we do the ‘recursion’ / adjustment of left and right boundaries.
Infinite loop (mid calculation)
- this happens when we don’t maintain the invariant that the search space constricts on every iteration of the while loop.
Wrong boundary updates (low/high)
Returning the wrong type when returning on failure cases (e.g. returning
-1when asked to return"")
Decision Flow #
- Sorted input? → Classic binary search
- Monotonic predicate? → Binary search on answer
- Is problem about first/last occurrence? → Use lower/upper bound variants.
- Is problem about optimization or feasibility? → Consider Binary search on answer or parameter space.
Formalised Algos #
Styles, Recipes, Boilerplate #
Basic Boilerplates #
precursor handwritten framework and the
bisectversionsNOTE: this boiler plate is pure python that doesn’t use the stdlib values. This is just for cases where we are asked to implement it by hand
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54def binary_search(nums: List[int], target: int) -> int: # set left and right indexes (inclusive indices shown here) left, right = 0, len(nums) - 1 while left <= right: # NOTE: the condition here is for an overrun because they are inclusive indices mid = left + (right - left) // 2 # NOTE: this avoids overflows if nums[mid] < target: left = mid + 1 elif nums[mid] > target: right = mid - 1 elif nums[mid] == target: # found the target value return mid # target value not found return -1 def left_bound(nums: List[int], target: int) -> int: """ This searches for a left boundary and it also handles the case where we don't have the target in the array. """ # set left and right indexes, both are inclusive indices left, right = 0, len(nums) - 1 while left <= right: mid = left + (right - left) // 2 if nums[mid] < target: left = mid + 1 elif nums[mid] > target: right = mid - 1 # because inclusive indices elif nums[mid] == target: # target exists, narrow the right boundary (since we are searching for the left boundary) right = mid - 1 # determine if the target exists, these are non existing cases: if left < 0 or left >= len(nums): return -1 # determine if the left boundary found is the target value return left if nums[left] == target else -1 def right_bound(nums: List[int], target: int) -> int: # set left and right indexes left, right = 0, len(nums) - 1 while left <= right: mid = left + (right - left) // 2 if nums[mid] < target: left = mid + 1 elif nums[mid] > target: right = mid - 1 elif nums[mid] == target: # target exists, narrow the left boundary because we're still searching for the right boundary left = mid + 1 # determine if the target exists -- these are the out-of-bounds cases to handle. if right < 0 or right >= len(nums): return -1 # determine if the right boundary found is the target value return right if nums[right] == target else -1Alternatively, the same 3 things can be done using python’s bisect module:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23import bisect def binary_search(nums: List[int], target: int) -> int: # bisect_left to find leftmost occurrence of target pos = bisect.bisect_left(nums, target) # Verify pos is within range and equals target if pos != len(nums) and nums[pos] == target: return pos return -1 def left_bound(nums: List[int], target: int) -> int: # bisect_left will give leftmost insertion point pos = bisect.bisect_left(nums, target) if pos != len(nums) and nums[pos] == target: return pos return -1 def right_bound(nums: List[int], target: int) -> int: # bisect_right will give rightmost insertion point, so right boundary is pos - 1 pos = bisect.bisect_right(nums, target) if pos > 0 and nums[pos - 1] == target: return pos - 1 return -1
Recipes #
- Math Recipes on Python: Doing division:
integer division / floor division:
a // breturns an intceiling division:
-(-a//b)we double negate it, first to carry out a floor division on the negated form, then to negate it back
normal division:
a / breturns a float, preserving the accuracy
TODO KIV #
[ ] Tough canonical types that haven’t been touched on: #
TODO Pending canonical questions: #
finding min/max satisfying conditions this entire category hasn’t been touched before.
Lower bound / upperbound variants This category of lower bound / upper bound variants has NOT been addressed yet.
Binary Search with Custom Predicates / Conditions:: Aggressive Cows writeup in leeetcode community (placing cows with minimum distance) seems like it’s actually Magnetic Force Between Two Balls (1552)
This canonical type doesn’t have anything that was done prior to this.
Difficult canonicals:
- More questions under the searching in implicit / virtual domains
- More questions under the Bitwise Binary Search
More canonicals to explore #
Search in Infinite or Unknown Length Arrays
Requires boundary doubling or exponential probing before classic binary search.
Example: Search in a Sorted Array of Unknown Size (Leetcode 702).
Peak Finding (1D and 2D)
Find a local maximum in an array/matrix; can use binary search due to monotonic property in neighbors.
Example: Find Peak Element (Leetcode 162), Find a Peak Element II (Leetcode 1901).
Search in Sparse or Special-Format Arrays
Adaptation for arrays with lots of nulls/blanks, or unknown spacing between entries.
Example: Sparse Search (Cracking the Coding Interview, not a standard LeetCode problem).
Square Root, Cube Root Approximations
Binary search for finding floor/ceiling of root values (for large inputs).
Example: Sqrt(x) (Leetcode 69).
Binary Search on Continuous Real Intervals
For precision or geometry problems, apply classic search to floating-point intervals.
Example: Kth Smallest Prime Fraction (Leetcode 786).
Minimum/Maximum Capacity Without Sorting
Push the bounds further for ship, chamber, core, etc., capacities required under constraints.
Example: Minimum Number of Days to Make m Bouquets (Leetcode 1482), Minimum Limit of Balls in a Bag (Leetcode 1760).
Binary Search on Trees/Graphs
Typical when binary search is used in conjunction with tree structures.
Example: Closest Binary Search Tree Value II (Leetcode 272).
“K-th Element” by Count
Binary search the answer, counting by predicate or via other methods.
Example: Kth Smallest Element in a Sorted Matrix (Leetcode 378).
Median of Stream/Sliding Window
Median finding with binary search combined with sliding windows and data structures.
Example: Sliding Window Median (Leetcode 480).