Skip to main content
  1. References/
  2. Algo Reference/
  3. Canonicals/

Topic 14: Greedy Algorithms

··3777 words·18 mins

Key-Intuition: Greedy algorithms rely on the optimal choice property and local decisions leading to global optima.

Canonical Question Types #

Reachability and Jumping Games #

  • Jump from indices or points under certain conditions

Problems #

  1. Jump Game (55) :completed:

    This is somewhat easy to reason with, we have a goal and we work backwards from the goal as much as possible and finally check if we have reached the start. The greedy choice property is that if I take a step as soon as I find it, that step will be at least one option / subset from the set of options I can, I can eventually build towards the goal. Best optimal local action is just to take a viable candidate as soon as we see it.

    this avoids the exploration of ALL possibilities and gives us AN optimal solution to our problem.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    
                class Solution:
                def canJump(self, nums: List[int]) -> bool:
                        goal = len(nums) - 1
    
                        # start from left of the goal
                        for i in range(len(nums) - 2, -1, -1):
                        # if I can find a jmp source, I take it:
                        if i + nums[i] >= goal:
                                goal = i
    
                        return goal == 0
    
  2. Jump Game II (45) :completed:

    This differs a little, the greedy choice property is to take the best jump based on the frontier that we can currently reach (which is what current_end represents in the code below)

    the clever part about this is that we need to think about the main goal and think about proxy ways that the goal can be rephrased. In this case, the main goal is to be able to make the least jumps and reach the end. A rephrasing means that we need each jump to make the furthest future progress as possible.

    Applying the greedy framework

    • Problem Restatement:

      • Find minimum number of jumps to reach the last index.
    • Greedy Choice Property:

      • At every jump, pick next index from reachable positions that allows furthest future progress.
    • Optimal Substructure:

      • If minimum jumps to index i is known, then the jumps to reachable next indices from i can be minimized greedily.
    • Decision Making:

      • Instead of exploring all paths, keep track of max reachable index within current jump; when you reach current jump boundary, increment jump count, update boundary to furthest reachable.
    • Global Optimal:

      • Greedy steps collectively ensure minimal jumps since you never miss an opportunity to jump to a position enabling maximal progress.
     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    
        class Solution:
            def jump(self, nums: List[int]) -> int:
                """
                Each local choice i need to choose something that gives me the rightmost advancement. I can sweep onto the current "frontier"/"current_end" and pick that best choice after exploring all choices.
    
    
                So this is like a sweeping search pointer then decision-maker pointer.
                """
                if len(nums) == 1:
                    return 0
    
                jumps = 0
                current_end = 0     # End of current jump coverage
                furthest = 0        # Furthest reachable index so far
    
                for i in range(len(nums) - 1):  # no need to jump from last element
                    furthest = max(furthest, i + nums[i])
    
                    # When reach the boundary of current jump range
                    if i == current_end:
                        jumps += 1
                        current_end = furthest
    
                        if current_end >= len(nums) - 1:
                            break
    
                return jumps
    

Resource Allocation & Circular Trips #

  • Distribute resources or find valid circular trips

Problems #

  1. Gas Station (134) :completed:

    The question actually helps to make the solutioning simpler for us (e.g. by saying that there’s a single unique solution). We have to do two things: (i) check global feasibility and (ii) check what should be the starting idx

    for (i) we just keep accumulating and the final sum has to be positive

    for (ii) we know that if at any point in time, the running sum is negative, then the previous starting index won’t work (between i (the exploration pointer) and 0), so just keep starting fresh (i + 1) if the tank goes negative.

    This approach is similar to the kadane algorithm in its essence.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    
        class Solution:
            def canCompleteCircuit(self, gas: List[int], cost: List[int]) -> int:
                # total_balance: we're just accumulating for all the gas stations, if the total_balance at the end of the accumulation is negative then the journey is impossible.
                # curr_balance is just a curr accumulator
                total_balance, curr_balance = 0, 0
                # best candidate yet (may not be useful if the journey is impossible)
                best_start = 0
    
                for i in range(len(gas)):
                    diff = gas[i] - cost[i]
                    total_balance += diff
                    curr_balance += diff
    
                    # if impossible then all from best_start to i are impossible, so we restart the joruney from the next one (though we don't actually make the journey)
                    if curr_balance < 0:
                        best_start = i + 1
                        curr_balance = 0
    
                return best_start if total_balance >= 0 else -1
    
        # alternative language:
        class Solution:
            def canCompleteCircuit(self, gas: List[int], cost: List[int]) -> int:
                total_tank = 0         # net gas over the whole journey
                current_tank = 0       # net gas from the current starting candidate
                start_index = 0        # candidate starting point
    
                for i in range(len(gas)):
                    gain = gas[i] - cost[i]
                    total_tank += gain
                    current_tank += gain
    
                    # If we can't reach the next station, start from i + 1
                    if current_tank < 0:
                        start_index = i + 1
                        current_tank = 0
    
                return start_index if total_tank >= 0 else -1
    

    General Greedy Problem-solving Steps:

    1. Formulate the subproblem:

      • Can I make a locally optimal decision (e.g., when to reset my starting point) that leads to the global optimum (successful circuit)?
    2. Greedy choice property:

      • If I fail at curr_balance < 0, any previous candidate in this segment can’t be the start—reset is safe.
    3. Optimal substructure:

      • Once you fail, the next possible successful start is strictly after your failure point.
    4. Iterate with this rule:

      • Reset candidate each time running tank falls negative, while maintaining total balance.
    5. Check feasibility:

      • Only possible if total gas ≥ total cost.

Sorting and Matching #

  • Greedy sorting based on problem-specific criteria and matching elements

Problems #

  1. ⭐️ Partition Labels (763) :completed: This problem fundamentally relies on greedy expansion and two-pointer/caterpillar technique. It can also be viewed as interval partitioning, where intervals are letter occurrences. The solution merges overlapping intervals greedily.

    This is really a 2 pointer caterpillar type but we can make life easy by knowing when is the last idx when a particular character appears. This can help us determine when to segment (and how many within the segment).

    simpler than it looks actually, the main thing is the “greedy” part which is we need to expand then consume like a caterpiller, how far to expand? until all within this window don’t have their last occurrence somewhere outside of the window. It’s like some sort of linear dependency culling thing going on.

    Greedy framework:

    Greedy Choice: At any point, expand the partition to include the last occurrence of the current character.

    Feasibility Check: You can end a partition only when all letters inside it don’t appear beyond the partition end.

    Optimal Substructure: The global optimum (max partitions) arises from correctly identifying all partition boundaries greedily without backtracking.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    
        from typing import List
    
        class Solution:
            def partitionLabels(self, s: str) -> List[int]:
                # we can save time by doing the pre-calculations, this dictcomp will just keep getting updated and give us the last idx
                last_occurrence = {c: i for i, c in enumerate(s)}
                partitions = []
    
                # use two pointes, move like a caterpillar
                start = 0
                end = 0
    
                for i, c in enumerate(s):
                    end = max(end, last_occurrence[c])
                    if i == end: # no more to consider, we can segment this off
                        size = i - start + 1
                        partitions.append(size)
                        start = i + 1
    
                return partitions
    
  2. TODO Candy (135)

  3. TODO Assign Cookies (455)

Maximum/Minimum Subarray and Kadane’s Framework #

  • Problems involving maximum or minimum contiguous sums using local greedy decisions

Problems #

  1. Maximum Subarray (53) :completed:

    This is literally max subarray sum (kadane’s algo)

    Since we are looking for subarrays ( contiguous, non-empty ), we immediately try to see if we can make some local decisions that lead up to global optima. Greedy choice property here is that we accumulate ONLY if the diff is not 0, so if the total sum becomes negative, then we can start fresh. This is Kadane’s Algo, classic stuff.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    
        class Solution:
            def maxSubArray(self, nums: List[int]) -> int:
                best = -float('inf')
                curr_accum = 0
                for num in nums:
                    # greedy choice: I make the best choice possible
                    new_accum = curr_accum + num
                    best = max(best, new_accum)
                    # choose A: start from scratch or B: carry on
                    curr_accum = 0 if new_accum < 0 else new_accum
    
                return  best
    

    For discussion sake, the question extension asks us to consider a divide-and-conquer approach. We have to realise that we can do left, right or max subarray cases: A: max subarray is entirely in the left half B: max subarray is entirely in the right half C: max subarray spans the left and right half, across the middle point.

    We just do partial sums for them and find our solution:

Interval Coverage and Partitioning #

  • Partitioning sequences or intervals using greedy local expansions

Problems #

  1. ⭐️ Partition Labels (763) :completed: This problem fundamentally relies on greedy expansion and two-pointer/caterpillar technique. It can also be viewed as interval partitioning, where intervals are letter occurrences. The solution merges overlapping intervals greedily.

    This is really a 2 pointer caterpillar type but we can make life easy by knowing when is the last idx when a particular character appears. This can help us determine when to segment (and how many within the segment).

    simpler than it looks actually, the main thing is the “greedy” part which is we need to expand then consume like a caterpiller, how far to expand? until all within this window don’t have their last occurrence somewhere outside of the window. It’s like some sort of linear dependency culling thing going on.

    Greedy framework:

    Greedy Choice: At any point, expand the partition to include the last occurrence of the current character.

    Feasibility Check: You can end a partition only when all letters inside it don’t appear beyond the partition end.

    Optimal Substructure: The global optimum (max partitions) arises from correctly identifying all partition boundaries greedily without backtracking.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    
        from typing import List
    
        class Solution:
            def partitionLabels(self, s: str) -> List[int]:
                # we can save time by doing the pre-calculations, this dictcomp will just keep getting updated and give us the last idx
                last_occurrence = {c: i for i, c in enumerate(s)}
                partitions = []
    
                # use two pointes, move like a caterpillar
                start = 0
                end = 0
    
                for i, c in enumerate(s):
                    end = max(end, last_occurrence[c])
                    if i == end: # no more to consider, we can segment this off
                        size = i - start + 1
                        partitions.append(size)
                        start = i + 1
    
                return partitions
    

Frequency Counting and Constructive Greedy (Group Formation, Coverage) #

  • Use frequency counts to form groups or satisfy constraints greedily

Problems #

  1. Hand of Straights (846) :completed:

    this is really just a grouping exercise and we use frequency counter to assist us. Frequency counting is sound because we have the “consecutive card” constraint, so we don’t need to do PQ-approaches, we just need to make sure that if card value of j has m freq, then card value of j + 1, (j + 2), (j + 3) + … must have at least m each, else we know that it won’t work.

    remember that the sorting helps us out for the consecutive argument.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    
    
        from collections import Counter
    
        class Solution:
            def isNStraightHand(self, hand: List[int], groupSize: int) -> bool:
    
                # early returns if nondivisible
                if len(hand) % groupSize != 0:
                    return False
    
                count = Counter(hand)
    
                for card in sorted(count):
                    # can start from this:
                    if count[card] > 0:
                        # if consecutive, then we at least need this many next cards:
                        required = count[card]
                        # since it's consecutive, we can do (card + groupSize) as the end range
                        for next_card in range(card, card + groupSize):
                            if count[next_card] < required:
                                return False
    
                            count[next_card] -= required
                            # this allows the next_card to be possibly reused as as starting card
    
                return True
    

    Framing into greedy context:

    Greedy decision: Always use the smallest available card to start a group of consecutive cards.

    Why smallest?

    Because completing groups starting from smaller cards prevents “blocking” larger cards later that must be consecutive.

    Local optimal choice:

    At any point, if you have leftover cards of the smallest value, you must include them in a sequence starting now to avoid leftover partial groups.

    Global optimality:

    This local choice guarantees no leftover cards prevent creating proper groups down the line.

    This fits the greedy choice property and optimal substructure:

    Greedily constructing consecutive sequences starting from the smallest card makes the remainder problem similar but smaller.

    The solution to the smaller problem leads to the global solution.

Merge and Element-wise Coverage #

  • Coverage checking and pruning in multidimensional greedy problems

Problems #

  1. Merge Triplets to Form Target Triplet (1899) :completed:

    Idea here is that we don’t need to simulate the merging, we just need to know if it’s feasible or not to reach the target (coverage). So we just need to comb through and consider merge candidates (which are just:

    1. ignore the entire triple if any of the individual values exceed the corresponding target value) \(\implies\) a strategy to prune choices

    2. mark 3 flags to check if we have reached out targets, and if so, to return early:

       1
       2
       3
       4
       5
       6
       7
       8
       9
      10
      11
      12
      13
      14
      15
      16
      17
      18
      19
      20
      21
      22
      23
      24
      
                 class Solution:
                     def mergeTriplets(self, triplets: List[List[int]], target: List[int]) -> bool:
                         tx, ty, tz = target
                         found_x, found_y, found_z = False, False, False
      
                         for x, y, z in triplets:
                             # If all coordinates are already covered, we can return early
                             if found_x and found_y and found_z:
                                 return True
      
                             # Skip triplets that exceed target in any coordinate
                             if x > tx or y > ty or z > tz:
                                 continue
      
                             # Mark coverage for coordinates matching target
                             if x == tx:
                                 found_x = True
                             if y == ty:
                                 found_y = True
                             if z == tz:
                                 found_z = True
      
                         # Return True only if all three target coordinates are covered
                         return found_x and found_y and found_z
      

      here’s an clean but slow version, but it’s slower becuase no early returns and has intermediate lists

       1
       2
       3
       4
       5
       6
       7
       8
       9
      10
      11
      12
      13
      
                 class Solution:
                     def mergeTriplets(self, triplets: List[List[int]], target: List[int]) -> bool:
                         filtered = [t for t in triplets if all(t[i] <= target[i] for i in range(3))]
                         covered = [False] * 3
      
                         for t in filtered:
                             if all(covered): return True
      
                             for i in range(3):
                                 if t[i] == target[i]:
                                     covered[i] = True
      
                         return all(covered)
      
      Framework StepExplanation
      Problem restatement:Can we get exactly the `target` triplet via coordinate-wise max merges of any triplets?
      Greedy choice property:We select triplets that do not exceed the target in any coordinate and cover each target coordinate exactly once.
      Optimal substructure:After choosing triplets covering each coordinate, the merged result is exactly the target.
      Algorithm:Filter triplets not exceeding target; check coverage for each coordinate by exact matches.
      Result:If all coordinates covered, answer true; else false.

Parentheses and String Validation via Range Counting or Stack-Based Greedy #

  • Validate balanced strings with wildcards using greedy interval counting or stack methods

Problems #

  1. Valid Parenthesis String (678) :completed:

    This is just 2 stacks: one for actual working for open parentheses, and the other is for wildcards (for making rectifications). we just need to know that the rectifications require us to keep track of when the wildcard was seen. We can only rectify if the wildcard came after the problem.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    
        class Solution:
            def checkValidString(self, s: str) -> bool:
                # wildcards allow us to make "mistakes"
                stack, wildcard = [], []
    
                for idx, char in enumerate(s):
                    if char == "(":
                        stack.append(idx)
    
                    if char == ")" and not stack and not wildcard:
                        return False
    
                    # rectifiable mistake
                    if char == ")" and not stack and wildcard:
                        wildcard.pop()
    
                    if char == ")" and stack:
                        stack.pop()
    
                    if char == "*":
                        wildcard.append(idx)
    
                if stack:
                    while is_correct_order:= stack and wildcard and stack[-1] < wildcard[-1]:
                        stack.pop()
                        wildcard.pop()
    
                return not stack
    

    An alternative way is to see it as greedy interval counting:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    
        class Solution:
            def checkValidString(self, s: str) -> bool:
                low = 0   # minimum unmatched '('
                high = 0  # maximum unmatched '('
    
                for ch in s:
                if ch == '(':
                        low += 1
                        high += 1
                elif ch == ')':
                        low = max(low - 1, 0)
                        high -= 1
                else:  # ch == '*'
                        low = max(low - 1, 0)  # if '*' acts as ')'
                        high += 1              # if '*' acts as '('
    
                if high < 0:
                        return False  # can't balance because too many ')'
    
                return low == 0  # balanced if minimal unmatched '(' is zero
    

    explanation:

    Intuition:

    The low and high represent a range of possible “open parenthesis counts” given the ambiguous ‘*’.

    While scanning, a ‘*’ increases uncertainty: it could be ‘(’, ‘)’, or `` (empty), so the actual open count can vary within this range.

    By maintaining these bounds, you efficiently track whether the string can possibly be balanced.

    Here’s what they mean precisely:

    low: The minimum number of unmatched open parentheses that you must have considering all interpretations of ‘’ so far (where ‘’ can act as ‘(’, ‘)’, or empty).

    It helps track the least possible open count because some ‘*’ may act as closing parentheses or empty strings.

    high: The maximum number of unmatched open parentheses that you could possibly have, again considering all ways to interpret ‘*’.

    It tracks the most open parentheses possible, if all ‘*’ are counted as ‘(’.

    How low and high get updated per character:

    When you see ‘(’:

    Both low and high increase by 1 (you definitely have one more unmatched open parenthesis).

    When you see ‘)’:

    Both low and high decrease by 1 (you close at least one open parenthesis, possibly more).

    When you see ‘*’:

    low decreases by 1 (if ‘*’ acts as ‘)’)

    high increases by 1 (if ‘*’ acts as ‘(’)

    After each step:

    If high becomes negative → too many closing parentheses → invalid → return False early.

    Make sure low never goes below zero (can’t have negative minimum unmatched opens).

    At the end, if low == 0, it means there is at least one interpretation of ‘*’ that makes the string balanced (all open parentheses matched).

    An alternative is the greedy interval counting approach:

    • Track a range [low, high] of the number of open parentheses possible at current character.

    • Update this range as you scan the string:

      • ( increases both low and high

      • ) decreases both low and high

      • * can either act as ‘(’, ‘)’, or empty, so adjust accordingly

    • If high < 0 at any point → invalid

    • At the end, if low = 0= → valid

    This greedy counting approach is harder to see at first but very elegant.

    Contextualising to the greedy framework:

    Greedy Algorithm Context & Framework

    • Problem Restatement:

      Check if a string with (, ), and * can be interpreted as a valid parentheses string considering * as either (, ), or empty.

    • Key insight:

      The * can flexibly act to correct mismatches locally.

    • Greedy property:

      Keep track of possible open parenthesis counts (range [low, high]).

    • Local decision:

      Adjust low and high based on current character assuming minimal and maximal open parentheses.

    • Global optimality:

      If at any point, it’s impossible to balance parentheses (e.g., too many )), fail early.

    • Final check: If after processing all characters, the minimal count of open parentheses can reach zero (balanced), string is valid.

Activity Selection and Interval Scheduling #

  • Select maximum number of non-overlapping intervals/events

TODO Problems #

  • (to be added)

Coin Change (Greedy Cases) #

  • Use greedy when canonical coin denominations permit optimal greedy solutions

TODO Problems #

  • (to be added)

Huffman Coding and Greedy Tree Construction #

  • Build optimal prefix codes or trees using greedy merges

TODO Problems #

  • (to be added)

SOE #

  • Incorrect greedy choice proof or counterexamples
  • Wrong tie-breaker decisions leading to suboptimal solutions
  • In cases where we are greedily rectifying problems, e.g. the “Valid Parenthesis String (678)” we know that wildcards are to be used to rectify problems, but we need to ensure that the relative ordering is kept so that the attempt at rectifying is legitimate. (we can’t use a wildcard to rectify if it came earlier).

Decision Flow #

  • Multiple feasible solutions? → Attempt greedy with sorting
  • Resource or capacity constraints? → Greedy exchange principle

Style, Tricks and Boilerplate #

Formalised Algos #

Kadane’s Algorithm for subarray sums/etc (\(O(n)\) single sweep, greedy approach) #

Kadane’s Algorithm is a dynamic programming technique and is the gold standard for maximum subarray sum problems.

A subarray is continguous region of an array. We wish to find out the max subarray based on some scoring system.

We keep a global max_so_far and a local one, max_ending_here. This is because for each element we encounter, we have 2 choices:

  • A: continue extending the current subarray
  • B: start a new subarray

how to make the choice? suppose the scoring was a positive score.

If choice A makes the running sum negative then its effect is worse than if it was a 0. So we should pick choice B at that point.

This gives us the following style, Kadane’s Algorithm:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
def max_subarray_sum(nums):
    max_so_far = float('-inf')
    max_ending_here = 0

    for num in nums:
        max_ending_here += num
        max_so_far = max(max_so_far, max_ending_here)
        if max_ending_here < 0:
            max_ending_here = 0

    return max_so_far