Skip to main content
  1. References/
  2. Algo Reference/
  3. Canonicals/

Topic 11: Graphs

··10725 words·51 mins

Graph algorithms are largely categorized into traversal, shortest paths, connectivity components, and cycle/dependency detection.

Relations, when well represented as graphs can be exploited to solve some common problems.

Canonical Question Types #

Graph Traversal (DFS & BFS) #

  • Explore graph components, find paths, count connected or island areas. Traversal could require some basic statistics / counting or something.
    • Flood filling is a great mental model to keep in mind for this.
    • Keep in mind which direction to traverse, sometimes the easier way is to do a reverse-reachability approach.

Problems #

  1. Number of Islands (200) ;; DONE

    It’s a flood fill from all possible islands and trying to reach the others. As a memory hack, just reuse the input for visited.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    53
    
        # BFS approach to flood filling
        from collections import deque
    
        class Solution:
            def numIslands(self, grid: List[List[str]]) -> int:
                ROWS, COLS = len(grid), len(grid[0])
                DIRS = [(-1, 0), (1, 0), (0, 1), (0, -1)]
    
                if not ROWS or not COLS:
                    return 0
    
                count = 0
                # space trick: reuse input grid for visited tracking
                # helper BFS:
                def bfs(r, c):
                    queue = deque([(r,c)])
                    grid[r][c] = '0' # mark first as visited
    
                    while queue:
                        row, col = queue.popleft()
                        for dr, dc in DIRS:
                            nr, nc = row + dr, col + dc
                            if is_valid_nei:=(0 <= nr < ROWS and 0 <= nc < COLS and grid[nr][nc] == '1'):
                                grid[nr][nc] = '0' # preemptively mark to reduce search spaces, before adding to queue
                                queue.append((nr, nc))
    
                    return
    
                for r in range(ROWS):
                    for c in range(COLS):
                        if is_island:=(grid[r][c] == '1'):
                            bfs(r, c)
                            count += 1
    
                return count
    
        # DFS approach to flood filling: this is a recursive DFS that we've done here
        class Solution:
            def numIslands(self, grid: List[List[str]]) -> int:
                ROWS, COLS = len(grid), len(grid[0])
                def dfs(r, c):
                    if not (0 <= r < ROWS and 0 <= c < COLS) or grid[r][c] == '0':
                        return
                    grid[r][c] = '0'
                    for dr, dc in [ (0,1), (0,-1), (1,0), (-1,0) ]:
                        dfs(r+dr, c+dc)
                count = 0
                for r in range(ROWS):
                    for c in range(COLS):
                        if grid[r][c] == '1':
                            count += 1
                            dfs(r, c)
                return count
    

    There’s a few viable alternatives to solving this, such as using a union find approach (note the use of a dict because we wanna track tuples (for the cell coordinate)). For this, note how we only move RIGHT/DOWN for each land cell because we want to process each pair of adjacent land cells (basically each edge) once. By restricting it to only right and down neighbours, each pair is processed exactly once from one side, preventing redundant union operations and improving efficiency.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    
        class Solution:
            def numIslands(self, grid: List[List[str]]) -> int:
                rows, cols = len(grid), len(grid[0])
                parent = {}
                rank = {}
    
                def find(x):
                    if parent[x] != x:
                        parent[x] = find(parent[x])  # path compression
                    return parent[x]
    
                def union(x, y):
                    rootX, rootY = find(x), find(y)
                    if rootX != rootY:
                        if rank[rootX] > rank[rootY]:
                            parent[rootY] = rootX
                        elif rank[rootX] < rank[rootY]:
                            parent[rootX] = rootY
                        else:
                            parent[rootY] = rootX
                            rank[rootX] += 1
    
                # Initialize parent and rank for land cells
                for r in range(rows):
                    for c in range(cols):
                        if grid[r][c] == '1':
                            parent[(r, c)] = (r, c)
                            rank[(r, c)] = 0
    
                # Union adjacent land cells (right and down to avoid duplicates)
                for r in range(rows):
                    for c in range(cols):
                        if grid[r][c] == '1':
                            for dr, dc in [(0, 1), (1, 0)]:
                                nr, nc = r + dr, c + dc
                                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == '1':
                                    union((r, c), (nr, nc))
    
                # Count distinct roots representing islands
                return len(set(find(x) for x in parent))
    
  2. Number of Provinces (547)

    We can choose to solve this in 2 ways: a union find approach or a bfs-based flood fill approach for connectivity tracking.

    Here’s the union find approach:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    
       class Solution:
           def findCircleNum(self, isConnected: List[List[int]]) -> int:
               """
               We have to just do union find and get the total number of unique parents / roots.
    
               isConnected is about direct connections. This means direct edges between them.
               We can get started.
               """
               n = len(isConnected)
               parent = list(range(n))
               rank = [0] * n
    
               def find(x):
                   while parent[x] != x:
                       parent[x] = parent[parent[x]]
                       x = parent[x]
                   return x
    
               def union(x, y):
                   """
                   Only side-effects matter for us here.
                   """
                   rx, ry = find(x), find(y)
                   if rank[rx] < rank[ry]:
                       parent[rx] = ry
                   elif rank[ry] < rank[rx]:
                       parent[ry] = rx
                   else:
                       parent[ry] = rx
                       rank[rx] += 1
    
               for a in range(n):
                   for b in range(a + 1, n):
                       if isConnected[a][b]:
                           union(a, b)
    
               return len({find(x) for x in range(n)})
    

    Here’s a BFS approach, which is more performant:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    
       from collections import deque
       class Solution:
           def findCircleNum(self, isConnected: List[List[int]]) -> int:
               """
               We can do a connectivity traversal using a BFS flood fill approach, then we can just count the number of islands  / provinces.
               """
               n = len(isConnected)
               provinces = 0
               visited = [0] * n # 0: univisited, 1: visited
    
               # now we run the BFS originating from any unvisited:
               for i in range(n):
                   if not visited[i]:
                       provinces += 1
                       queue = deque([i])
                       visited[i] = 1
                       while queue:
                           node = queue.popleft()
                           for nei in (nei for nei in range(n) if isConnected[node][nei] and not visited[nei]):
                               queue.append(nei)
                               visited[nei] = 1
               return provinces
    
  3. Max Area of Island (695) ;; DONE This is just accumulating statistics on a flood fill approach to traversing islands. There’s a bunch of ways we can do this: flood fill (BFS/DFS) or via a union find. BFS is my favourite in general.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    53
    54
    55
    56
    57
    58
    59
    60
    61
    62
    63
    64
    65
    66
    67
    68
    69
    70
    71
    72
    73
    74
    75
    76
    77
    78
    79
    80
    81
    82
    83
    
       # M1: FLOOD FILL BFS
        from collections import deque
    
        class Solution:
            def maxAreaOfIsland(self, grid: List[List[int]]) -> int:
                if not grid or not grid[0]:
                    return 0
    
                area = 0
    
                ROWS, COLS = len(grid), len(grid[0])
                directions = [(0, 1), (0, -1), (1, 0), (-1, 0)]
    
                for r in range(ROWS):
                    for c in range(COLS):
                        if grid[r][c] == 1: # we start a BFS for more land-cells.
                            curr_area = 0
                            # use as root and carry on:
                            queue = deque([(r, c)])
                            grid[r][c] = 0 # space hack: use the input and mark as visited
                            curr_area += 1
                            while queue:
                                r, c = queue.popleft()
                                valid_children = ((r + dr, c + dc) for dr, dc in directions if 0 <= r + dr < ROWS and 0 <= c + dc < COLS and  grid[r + dr][c + dc] == 1 )
    
                                for child_row, child_col in valid_children:
                                    # update accumulators (curr_area, visited) before adding to the queue
                                    grid[child_row][child_col] = 0
                                    curr_area += 1
    
                                    # enqueue
                                    queue.append((child_row, child_col))
    
                            area = max(area, curr_area) # helps us find the biggest island in the grid
    
                return area
    
        # M2: Flood fill DFS
        class Solution:
            def maxAreaOfIsland(self, grid: List[List[int]]) -> int:
                max_area = 0
                rows, cols = len(grid), len(grid[0])
                for r in range(rows):
                    for c in range(cols):
                        if grid[r][c] == 1:
                            stack = [(r, c)]
                            grid[r][c] = 0
                            area = 1 # local area, for this island's measure
                            while stack:
                                cr, cc = stack.pop()
                                for dr, dc in [(-1,0),(1,0),(0,-1),(0,1)]:
                                    nr, nc = cr+dr, cc+dc
                                    if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 1:
                                        grid[nr][nc] = 0
                                        stack.append((nr, nc))
                                        area += 1
                            max_area = max(max_area, area)
                return max_area
    
        # M3: union find but it's complex and ill-fitted for this solution
        class DSU:
            def __init__(self):
                self.parent = {}
                self.size = {} # same concept as rank, we use a dict here because the keys will be coordinate tuples
            def find(self, x):
                if self.parent[x] != x:
                    self.parent[x] = self.find(self.parent[x])
                return self.parent[x]
            def union(self, x, y):
                """
                Return value here doesn't matter, we care about the side-effect only.
                """
                xr, yr = self.find(x), self.find(y)
                if xr != yr:
                    self.parent[xr] = yr
                    self.size[yr] += self.size[xr]
            def add(self, x):
                if x not in self.parent:
                    self.parent[x] = x
                    self.size[x] = 1
    
        # Inside solution, union adjacent '1' cells as you read the grid,
        # then return max(dsu.size.values()) as the answer if any, else 0.
    
  4. Flood Fill (733)

    This is pretty straightforward easy question, just need to know the early return case when the source is already coloured and there’s no need to colour anything (including neis)

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    
        from collections import deque
    
        class Solution:
            def floodFill(self, image: List[List[int]], sr: int, sc: int, color: int) -> List[List[int]]:
                ROWS, COLS = len(image), len(image[0])
                DIRS = [(0, 1), (0, -1), (-1, 0), (1, 0)]
                start_color = image[sr][sc]
                if start_color == color:
                    return image
    
                queue = deque([(sr, sc)])
                while queue:
                    r, c = queue.popleft()
    
                    image[r][c] = color
                    for nr, nc in ((r + dr, c + dc) for dr, dc in DIRS if (0 <= (r + dr) < ROWS) and (0 <= (c + dc) < COLS)):
                        if image[nr][nc] != start_color:
                            continue
                        image[nr][nc] = color
                        queue.append((nr, nc))
                return image
    
  5. Walls and Gates (??) (neetcode version) ;; DONE

    This is just distance tracking, we can go layer by layer and get things done. The aux ds (queue / stack) can hold tuples to keep track of parent height. Oh and it’s a multi-source BFS that happens at the same time.

    This is a case of multi-source BFS really. Nothing special here, we just need to write it in-place and have nothing to return.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    
        from collections import deque
    
        class Solution:
            def islandsAndTreasure(self, grid: List[List[int]]) -> None:
                if not grid:
                    return
    
                queue = deque()
                ROWS, COLS = len(grid), len(grid[0])
                DIRS = [(0, -1), (0, 1), (1, 0), (-1, 0)]
                INF = 2147483647
    
                # 1: add all gates to queue at distance 0 (multiple source BFS, simultaneously run them)
                for r, c in ((r, c) for r in range(ROWS) for c in range(COLS) if grid[r][c] == 0):
                    # is treasure:
                    queue.append((r, c))
    
                while queue:
                    r, c = queue.popleft()
                    candidates = ((r + dr, c + dc) for dr, dc in DIRS if (0 <= r + dr < ROWS) and (0 <= c + dc < COLS) and grid[r + dr][c + dc] == INF)
                    for row, col in candidates:
                        grid[row][col] = grid[r][c] + 1
                        queue.append((row, col))
    

    Extension: if the cost of traversing an edge from node to its neighbors had a costs (positive) then we would have to consider doing some kinda of Dijkstra’s approach. In this case a simple BFS works perfectly.

  6. Rotting Oranges (994) ;; DONE

    This is a time-simulation of a flood-fill, where the current hop is the time elapsed.

    This is straightforward, we want to do multisource traversal (from each rotten orange) as a way of doing time-simulation and so we just use a simple BFS traversal. This also just keeps an accumulator aux value (fresh) collected via a pre-processing step for our ease.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    
        from collections import deque
        class Solution:
            def orangesRotting(self, grid: List[List[int]]) -> int:
                ROWS, COLS = len(grid), len(grid[0])
                queue = deque()
                fresh = 0
                # accumulate starts points for the BFS start as well as the aux variable, fresh
                for r in range(ROWS):
                    for c in range(COLS):
                        if grid[r][c] == 2:
                            queue.append((r, c, 0))
                        elif grid[r][c] == 1:
                            fresh += 1
                time = 0
                while queue:
                    r, c, t = queue.popleft()
                    for dr, dc in [(-1,0),(1,0),(0,-1),(0,1)]:
                        nr, nc = r+dr, c+dc
                        if 0 <= nr < ROWS and 0 <= nc < COLS and grid[nr][nc] == 1:
                            grid[nr][nc] = 2
                            fresh -= 1
                            queue.append((nr, nc, t+1))
                            time = t+1
                return -1 if fresh > 0 else time
    
  7. Pacific Atlantic Water Flow (417) ;; DONE

    This is a good reminder that traversal direction should be something that we should take a moment to think about before we make a conclusion. Here, we realise that the better approach is to go from outwards to in (instead of the actual flow of the water). This is similar to the point where we should consider simpler, complementary approaches to the problems.

    This is a reverse-reachability type. It’s also a multisource traversal!

    Also remember to mark things as visited ASAP so that we don’t introduce redundancies in the search space.

        from collections import deque
    
        class Solution:
            def pacificAtlantic(self, heights: List[List[int]]) -> List[List[int]]:
                # early returns for trivial edge case:
                if not heights or not heights[0]:
                    return []
    
                ROWS, COLS = len(heights), len(heights[0])
                DIRS = [(-1,0), (1,0), (0,-1), (0,1)]
    
                def bfs(starts):
                    visited = set(starts)
                    queue = deque(starts)
                    while queue:
                        r, c = queue.popleft()
                        # direct for loop is better (easier to read) than generator then for loop:
                        for dr, dc in DIRS:
                            nr, nc = r + dr, c + dc
                            # is reachable coordinate:
                            if (
                                0 <= nr < ROWS and
                                0 <= nc < COLS and
                                (nr, nc) not in visited and
                                heights[nr][nc] >= heights[r][c]
                            ):
                                # mark as visited before enqueing!
                                visited.add((nr, nc))
                                queue.append((nr, nc))
                    return visited
    
                pacific_starts = [(0, c) for c in range(COLS)] + [(r, 0) for r in range(ROWS)]
                atlantic_starts = [(ROWS-1, c) for c in range(COLS)] + [(r, COLS-1) for r in range(ROWS)]
    
                pacific_reach = bfs(pacific_starts)
                atlantic_reach = bfs(atlantic_starts)
    
                return [ [r, c] for r, c in (pacific_reach & atlantic_reach) ]
    
  8. Surrounded Regions (130) ;; DONE

    This one is about creating temporary marks as we navigate then restoring thing as part of a formatting step.

    This is also a reverse-reachability type of question because we actually find the inverse (safe cells) that we can reach from the border cells since these safe cells won’t be converted. This will help us find the actual unsafe cells that we will convert.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    
        from collections import deque
        from typing import List
    
        class Solution:
            def solve(self, board: List[List[str]]) -> None:
                if not board or not board[0]:
                    return
    
                ROWS, COLS = len(board), len(board[0])
                DIRECTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    
                def get_border_o_positions():
                    """
                    Returns a set of coordinates for all the Os that are at the borders
                    """
                    # All 'O's on the border (first/last row or column)
                    top_bottom = ((row, col) for col in range(COLS) for row in (0, ROWS-1) if board[row][col] == 'O')
                    left_right = ((row, col) for row in range(ROWS) for col in (0, COLS-1) if board[row][col] == 'O')
                    return set(top_bottom) | set(left_right)
    
                def mark_border_connected_region_as_safe():
                    border_o_positions = get_border_o_positions()
                    queue = deque()
                    for r, c in border_o_positions:
                        # mark as visited before enqueing the start coordinates:
                        board[r][c] = '*'
                        queue.append((r, c))
                    while queue:
                        r, c = queue.popleft()
                        for dr, dc in DIRECTIONS:
                            nr, nc = r + dr, c + dc
                            if (0 <= nr < ROWS and 0 <= nc < COLS and board[nr][nc] == 'O'):
                                board[nr][nc] = '*'
                                queue.append((nr, nc))
    
                mark_border_connected_region_as_safe()
    
                # List comprehensions for postprocessing
                for r, c in ((r, c) for r in range(ROWS) for c in range(COLS)):
                    if board[r][c] == 'O':
                        board[r][c] = 'X'
                    elif board[r][c] == '*':
                        board[r][c] = 'O'
    
  9. Minimum Jumps to Reach End via Prime Teleportation (3629)

    Main idea is that we need to do a bunch of preprocessing to make the BFS efficient. The idea therefore is to do both prime sieving AND keeping track of prime factors.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    53
    54
    55
    56
    57
    58
    59
    60
    61
    62
    63
    64
    65
    66
    67
    68
    69
    70
    71
    72
    73
    74
    75
    76
    77
    78
    79
    80
    81
    82
    83
    
        from math import sqrt
    
        class Solution:
            def minJumps(self, nums: List[int]) -> int:
                n = len(nums)
                max_val = max(nums)
    
                def sieve(limit):
                    """
                    Returns a list of prime numbers within limit (inclusive)
                    """
                    is_prime = [True] * (limit + 1) # effectively makes this 1-indexed
                    is_prime[0], is_prime[1] = False, False
                    # remember we only need to do till sqrt of the limit
                    for num in range(2, int(sqrt(limit) + 1)):
                        if is_prime[num]:
                            # negate its multiples:
                            for j in range(num * num, limit + 1, num):
                                is_prime[j] = False
    
                    return is_prime
    
                is_prime = sieve(max_val) # is the reference prime-sieve
    
                def prime_factors(num):
                    """
                    Returns a set of prime factors for a number:
                    """
                    factors = set()
                    temp = num
                    # prime factorisation
                    for i in range(2, int(sqrt(num)) + 1):
                        if not is_prime[i]:
                            continue
    
                        while temp % i == 0:
                            factors.add(i)
                            temp //= i
    
                        if temp == 1:
                            break
    
                    if temp > 1: # then temp is a prime too
                        factors.add(temp)
    
                    return factors
    
                # creates a map from prime factor to list of indices in nums with number divisible by that prime, these are the teleport indices
                prime_to_indices = defaultdict(list)
                for idx, num in enumerate(nums):
                    factors = prime_factors(num)
                    for f in factors:
                        prime_to_indices[f].append(idx)
    
    
                # now we do the BFS with hop counting:
                visited = [False] * n
                visited[0] = True
                visited_prime = set() # track prime whose teleportations are visited
                q = deque([(0, 0)]) # (idx, hops)
                while q:
                    idx, hops = q.popleft()
    
                    # reached the end:
                    if idx == n - 1:
                        return hops
    
                    # adjacent moves:
                    for nei in [idx - 1, idx + 1]:
                        if 0 <= nei < n and not visited[nei]:
                            visited[nei] = True
                            q.append((nei, hops + 1))
    
                    # teleports:
                    num = nums[idx]
                    if is_prime[num] and num not in visited_prime:
                        visited_prime.add(num)
                        for nei in prime_to_indices[num]:
                            if not visited[nei]:
                                visited[nei] = True
                                q.append((nei, hops + 1))
    
                return -1
    

    The main part is that if we have the possibilities lookup via something like this, then we can use it for our BFS and then the least hop reach to the final cell is our answer that we seek.

    1
    2
    3
    4
    5
    6
    
                # creates a map from prime factor to list of indices in nums with number divisible by that prime, these are the teleport indices
                prime_to_indices = defaultdict(list)
                for idx, num in enumerate(nums):
                    factors = prime_factors(num)
                    for f in factors:
                        prime_to_indices[f].append(idx)
    
  10. Find if Path Exists in Graph (1971)

    ended up being a basic traversal to check for reachability from source to destination, given edges.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    
        from collections import defaultdict, deque
        class Solution:
            def validPath(self, n: int, edges: List[List[int]], source: int, destination: int) -> bool:
                """
                We could directly do a graph traversal.
                """
                adj = defaultdict(list)
                for u, v in edges:
                    adj[u].append(v)
                    adj[v].append(u)
    
                visited = set()
                q = deque([source])
                while q:
                    node = q.popleft()
                    if node == destination:
                        return True
                    for nei in adj[node]:
                        if nei not in visited:
                            visited.add(nei)
                            q.append(nei)
    
                return False
    

Shortest Path (Unweighted Graph) #

  • BFS-based shortest path finding in unweighted graphs (fixed cost). If there were costs involved then we could explore a Bellman-Ford / Dijkstra’s approach.

Problems #

  1. Word Ladder (127)

    This is a shortest path problem that feels like an edit distance kind, but it’s a graph traversal. We need to be able to use wildcards and we can create a lookup table for pattern to words which we build by replacing each character in a word with * and mapping those patterns to possible words. Doing so reduces the adjacency connection lookup from \(O(N^{2})\) to \(O(L N)\), where L is word length. Also take note that we only want to reach each node (pattern) once in the overall procedure, so we silence it after we reach it.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    
        from collections import defaultdict, deque
    
        class Solution:
            def ladderLength(self, beginWord: str, endWord: str, wordList: List[str]) -> int:
                if endWord not in wordList:
                    return 0
    
                L = len(beginWord)
    
                # better mapping:
                pattern_to_words = defaultdict(list)
                for word in wordList:
                    for i in range(L):
                        pattern = word[:i] + "*" + word[i + 1:]
                        pattern_to_words[pattern].append(word)
    
                # now we run BFS:
                queue = deque([(beginWord, 1)]) # word, level
                visited = {beginWord}
                while queue:
                    word, level = queue.popleft()
                    for i in range(L):
                        pattern = word[:i] + "*" + word[i + 1:]
                        for nei in pattern_to_words[pattern]:
                            if nei == endWord:
                                return level + 1
                            if nei not in visited:
                                visited.add(nei)
                                queue.append((nei, level + 1))
    
                        # NB: prevents repeat lookups, we silence that pattern check so that each pattern is used only once per search.
                        pattern_to_words[pattern] = []
    
                return 0
    

    In some ways, this question involves a graph expansion / modification as well. This is implicit in our use of pattern_to_words.

Topological Sort & Cycle Detection #

  • Detect cycles in directed or undirected graphs and generate valid node orderings in DAGs

Problems #

  1. Course Schedule (207) ;; DONE

    This is a simpler version of the problem that just expects a boolean return on whether we can do a topo sort (i.e. whether it’s an acyclic DAG). There’s 2 ways to do this: graph colouring (to ensure no cycles) or just straight up doing kahn’s algo:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    53
    54
    55
    56
    57
    58
    59
    60
    61
    62
    63
    64
    65
    
        # M1: GRAPH COLORING DFS APPROACH (3-colour graph (0: unvisited: 1: visiting 2: visited))
        class Solution:
            def canFinish(self, numCourses: int, prerequisites: List[List[int]]) -> bool:
                # list of lists, access using relative idx
                adj = [[] for _ in range(numCourses)]
    
                # careful: we are ultimately trying to clear the topo sort, so we are building edge from a -> b if a is a prereq of b
                for course, pre in prerequisites:
                    adj[pre].append(course)
    
                # single visited with 3 states, space optimisation!
                visited = [0] * numCourses  # 0 = unvisited, 1 = visiting, 2 = visited
    
                def dfs(node):
                    if visited[node] == 1:  # Cycle!
                        return False
    
                    if visited[node] == 2: # done state, can return early
                        return True
    
                    visited[node] = 1
                    for nei in adj[node]:
                        if not dfs(nei):
                            return False
    
                    visited[node] = 2
    
                    return True
    
                for i in range(numCourses):
                    if not dfs(i):
                        return False
                return True
    
        # M2: Kahn's algo used to do cycle-checks:
        from collections import defaultdict, deque
    
        class Solution:
            def canFinish(self, numCourses: int, prerequisites: List[List[int]]) -> bool:
                # Build adjacency list and in-degree array
                adj = defaultdict(list)
                # in-degree: For each node, count how many edges point to it = how many prerequisites it has.
                in_degree = [0] * numCourses
    
                for course, pre in prerequisites:
                    adj[pre].append(course)      # prereq → course (edge creation)
                    in_degree[course] += 1       # course needs one more prereq
    
                # 1. Find all courses with no prerequisites. Any element in the queue is "ready to be taken"
                queue = deque(i for i in range(numCourses) if in_degree[i] == 0)
                visited = 0                      # How many courses we've "finished"
    
                # 2. Process all courses "ready" to be taken
                while queue:
                    node = queue.popleft()
                    # count as finished / visited
                    visited += 1
    
                    for neighbor in adj[node]:   # All courses that depend on this one
                        in_degree[neighbor] -= 1 # One prereq is now done!
                        if in_degree[neighbor] == 0: # this neighbor is not ready to be taken
                            queue.append(neighbor)
    
                # 3. If we managed to "take" all courses, there is no cycle!
                return visited == numCourses
    
  2. Course Schedule II (210) ;; DONE

    This is a natural extension where we need to generate the actual topologically sorted array of nodes. Similarly, we can use Kahn’s Algo or do a DFS-based graph colouring for our purpose (just do a post-order inclusion). For the DFS approach, remember that the way we choose to mark visited will end up affecting whether or not we might need to reverse the schedule. I just prefer that Kahn’s Algo.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    53
    54
    55
    56
    57
    58
    59
    60
    61
    62
    63
    64
    65
    66
    67
    68
    69
    70
    71
    72
    
        # M1: Using Kahn's Algo
        from collections import defaultdict
        class Solution:
            def findOrder(self, numCourses: int, prerequisites: List[List[int]]) -> List[int]:
                adj = defaultdict(list)
    
                indegree = [0] * numCourses
    
                # calculate the indegrees:
                for course, pre in prerequisites:
                    adj[pre].append(course)
                    indegree[course]+= 1
    
                # find no prereqs:
                # ready-queue
                queue = deque([course for course in range(numCourses) if indegree[course] == 0])
    
                schedule = [] # NB: we can use the length of this as "num visited" to check for cycle, no aux counter variable necessary.
    
                while queue:
                    node = queue.popleft()
                    schedule.append(node)
    
                    for neighbor in adj[node]:
                        indegree[neighbor] -= 1
                        # neighbour ready to be taken:
                        if indegree[neighbor] == 0:
                            queue.append(neighbor)
    
                # if cycle found:
                if not len(schedule) == numCourses:
                    return []
    
                return schedule
    
        # M2: here's the DFS graph colouring approach:
        class Solution:
            def findOrder(self, numCourses, prerequisites):
                # Preprep:
                # NOTE: maps prereq -> course
                adj = [[] for _ in range(numCourses)]
                for course, pre in prerequisites:
                    adj[pre].append(course)
                state = [0] * numCourses  # 0=unvisited, 1=visiting, 2=visited
                order = []
                def dfs(node):
                    # detected cycle in the same round!
                    if state[node] == 1:
                        return False
                    if state[node] == 2:
                        return True
    
                    # mark new as visiting:
                    state[node] = 1
                    for nei in adj[node]:
                        # explore depth in neighbours:
                        if not dfs(nei):
                            return False
                    # finally, mark as visited -- this node is ready to be taken
                    state[node] = 2
                    # mark this course as taken
                    order.append(node)
    
                    return True
    
                for i in range(numCourses):
                    # if unvisited:
                    if state[i] == 0:
                        if not dfs(i):
                            return []
    
                return order[::-1] # remember to reverse because it's a postorder accumulation that we have done here.
    
  3. Graph Valid Tree (??) (neetcode link) ;; DONE (connected + no cycle)]

    This is a traversal on an undirected graph to check if it’s a tree. We can use the basic math properties for early returns (n - 1 edges in a tree for a tree of n nodes). In this case, we also handle the undirected nature by splitting it into two directed edges (we also need to do the parent tracking for this!). This is really just a cycle check.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    
        from collections import defaultdict
    
        class Solution:
            def validTree(self, n: int, edges: List[List[int]]) -> bool:
                # early return based on math: there should be n - 1 edges for a tree with n nodes
                if len(edges) != n - 1:
                    return False    # must be exact number of edges
    
                adj = defaultdict(list)
                for a, b in edges:
                    adj[a].append(b)
                    adj[b].append(a)
    
                visited = set()
                stack = [(0, -1)]  # (current, parent)
    
                while stack:
                    node, parent = stack.pop()
                    if node in visited:
                        return False    # cycle detected
                    visited.add(node)
                    for nei in adj[node]:
                        if nei == parent:
                            continue    # don't backtrack
                        stack.append((nei, node))
    
                return len(visited) == n
    
  4. Sort Items by Groups Respecting Dependencies (1203)

    This is a higher order question whereby we need to topo sort by 2 dimensions: topo sorting within items, topo sorting of groups. This gives us a 2-phased approach.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    53
    54
    55
    56
    57
    58
    59
    60
    61
    62
    63
    
        from collections import defaultdict, deque
        from typing import List
    
        class Solution:
            def sortItems(self, n: int, m: int, group: List[int], beforeItems: List[List[int]]) -> List[int]:
                # Step 0: Assign unique group IDs to ungrouped items, keep them in unique groups
                new_gid = m
                for i in range(n):
                    if group[i] == -1:
                        group[i] = new_gid
                        new_gid += 1
    
                # Step 1: Prepare adjacency lists + indegrees
                group_adj, group_indeg = defaultdict(list), defaultdict(int)
                item_adj, item_indeg = defaultdict(list), defaultdict(int)
    
                # Step 2: Build both graphs
                for item in range(n):
                    for prev_item in beforeItems[item]:
                        # Build item dependency graph
                        item_adj[prev_item].append(item)
                        item_indeg[item] += 1
    
                        # Build group dependency graph, only if they belong to different groups
                        if group[prev_item] != group[item]:
                            src, dest = group[prev_item], group[item]
                            group_adj[src].append(dest)
                            group_indeg[dest] += 1
    
                # Step 3: Topological sort helper
                def topo(nodes, adj, indeg):
                    q = deque(x for x in nodes if indeg[x] == 0)
                    res = []
                    while q:
                        u = q.popleft()
                        res.append(u)
                        for nei in adj[u]:
                            indeg[nei] -= 1
                            if indeg[nei] == 0:
                                q.append(nei)
                    return res if len(res) == len(nodes) else []
    
                # Step 4: Get group order & item order
                group_nodes = list(range(new_gid))
                group_order = topo(group_nodes, group_adj, group_indeg.copy()) # GOTCHA: remember to copy over else the reference indegree may be mutated
                if not group_order:
                    return []
    
                item_nodes = list(range(n))
                items_order = topo(item_nodes, item_adj, item_indeg.copy())
                if not items_order:
                    return []
    
                # Step 5: Arrange items by group according to items_order
                group_to_items = defaultdict(list)
                for item in items_order:
                    group_to_items[group[item]].append(item)
    
                # Step 6: Build the final result
                res = []
                for g in group_order:
                    res.extend(group_to_items[g])
                return res
    

    Another big-brain approach is to use an augmented graph for this (so we have a supergraph) but this hurts my head.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    53
    54
    55
    56
    57
    58
    59
    60
    61
    62
    63
    64
    65
    66
    67
    68
    69
    70
    71
    72
    73
    74
    75
    76
    77
    78
    79
    80
    81
    82
    83
    
        from collections import defaultdict, deque
        from typing import List
    
        class Solution:
            def sortItems(self, n: int, m: int, group: List[int],
                          beforeItems: List[List[int]]) -> List[int]:
                """
                Single-graph approach for 1203.
                Instead of 2 separate topo sorts (items, then groups),
                we build one unified DAG containing:
    ​                - virtual group nodes
    ​                - item nodes
                and run ONE topological sort on all of them.
                """
    
                # --- STEP 1: Assign unique group IDs to ungrouped items ---
                # Problem states group[i] == -1 if no group. We give these items
                # their own 'singleton' pseudo-group to enforce adjacency.
                new_group_id = m
                for i in range(n):
                    if group[i] == -1:
                        group[i] = new_group_id
                        new_group_id += 1
                total_groups = new_group_id  # updated count after assignment
    
                # --- STEP 2: Node indexing scheme ---
                # 0..n-1       → item nodes
                # n..n+total_groups-1 → virtual group nodes
                def group_node(g_id: int) -> int:
                    """ Map a group ID to its virtual node index in the graph """
                    return n + g_id
    
                # --- Data structures for graph + indegree ---
                G = defaultdict(list)   # adjacency list
                indeg = defaultdict(int)  # indegree dictionary
    
                # Initialize indegrees to zero for all nodes
                all_nodes = list(range(n)) + [group_node(g) for g in range(total_groups)]
                for node in all_nodes:
                    indeg[node] = 0
    
                # --- STEP 3: Connect each group node to the items it contains ---
                # This ensures items of the same group appear *after* the group's
                # node in topo order, making them contiguous in the final output.
                for item in range(n):
                    gnode = group_node(group[item])
                    G[gnode].append(item)
                    indeg[item] += 1
    
                # --- STEP 4: Add edges for dependencies ---
                # For every beforeItems[i], item i must appear AFTER prev_item.
                for item in range(n):
                    for prev_item in beforeItems[item]:
                        if group[item] == group[prev_item]:
                            # Same group → direct dependency between items
                            G[prev_item].append(item)
                            indeg[item] += 1
                        else:
                            # Different groups → dependency between group nodes
                            G[group_node(group[prev_item])].append(group_node(group[item]))
                            indeg[group_node(group[item])] += 1
    
                # --- STEP 5: Kahn's Algorithm (BFS Topological Sort) ---
                q = deque([node for node in all_nodes if indeg[node] == 0])
                topo_order = []
    
                while q:
                    u = q.popleft()
                    topo_order.append(u)
                    for v in G[u]:
                        indeg[v] -= 1
                        if indeg[v] == 0:
                            q.append(v)
    
                # --- STEP 6: Check for cycles ---
                # A cycle anywhere (items or groups) means no valid order exists.
                if len(topo_order) != len(all_nodes):
                    return []
    
                # --- STEP 7: Extract only items from topo_order ---
                # Group nodes were just helpers to enforce contiguous grouping.
                result = [node for node in topo_order if node < n]
                return result
    

Union-Find Disjoint Sets #

  • Manage dynamic connectivity, detect cycles, count components efficiently

Problems #

  1. Number of Connected Components in an Undirected Graph (323) ;; DONE

    This is traditionally a union find question but we can also use a DFS approach for this.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    53
    54
    55
    56
    57
    
        # M1: Ideal, Union Find approach
        class Solution:
            def countComponents(self, n: int, edges: List[List[int]]) -> int:
                parent = list(range(n))
                rank = [0] * len(n) # should init to 0
    
                def find(x):
                    while parent[x] != x:
                        parent[x] = parent[parent[x]]  # path compression
                        x = parent[x]
                    return x
    
                def union(x, y):
                    rootX, rootY = find(x), find(y)
                    if rank[rootX] < rank[rootY]:
                        parent[rootX] = rootY
                        return
                    if rank[rootY] < rank[rootX]:
                        parent[rootY] = rootX
                        return
                    parent[rootX] = rootY
                    rank[rootY] += 1
    
                for a, b in edges:
                    union(a, b)
    
                return len({find(x) for x in range(n)})
    
        # M2: BFS approach:
        from collections import defaultdict, deque
    
        class Solution:
            def countComponents(self, n: int, edges: List[List[int]]) -> int:
                adj = defaultdict(list)
                for a, b in edges:
                    adj[a].append(b)
                    adj[b].append(a)
    
                visited = [False] * n
    
                def bfs(start):
                    queue = deque([start])
                    # FIX 1 : remember to mark the first entry node as visited
                    visited[start] = True
                    while queue:
                        node = queue.popleft()
                        for nei in adj[node]:
                            if not visited[nei]:
                                visited[nei] = True
                                queue.append(nei)
    
                comps = 0
                for node in range(n):
                    if not visited[node]:
                        comps += 1
                        bfs(node)
                        return comps
    

    I think it’s alright to put the bfs as a helper or within the function directly, either way works or default to what looks prettier.

  2. Redundant Connection (684) ;; DONE

    We traverse through the edges (since these are the connections) and try to union the nodes together. If they’re already joined then that edge we are currently entertaining is an extra edge.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    
        def findRedundantConnection(edges):
            n = len(edges)
            parent = [i for i in range(n+1)]
    
            def find(x):
                while parent[x] != x:
                    # set grandparent as parent
                    parent[x] = parent[parent[x]]  # path compression
                    # check grandparent
                    x = parent[x]
    
                return x
    
            def union(x, y):
                rootX, rootY = find(x), find(y)
    
                if rootX == rootY:
                    return False  # x and y are already connected
    
                parent[rootY] = rootX
                return True
    
            for u, v in edges:
                if not union(u, v):
                    return [u, v]
    

    This version is an improvement that considers ranking:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    
       class Solution:
           def findRedundantConnection(self, edges: List[List[int]]) -> List[int]:
               # we do a union find
               n = len(edges)
               parent = list(range(n + 1)) # since the nodes are 1-indexed
               rank = [0] * (n + 1)
    
               def find(u):
                   # compress parent
                   while parent[u] != u:
                       # path compress to its own parent
                       parent[u] = parent[parent[u]]
                       u = parent[u]
    
                   return u
    
               def union(x, y):
                   rx, ry = find(x), find(y)
                   if rx == ry:
                       return False # have been connected before, this is redundant
    
                   if rank[rx] < rank[ry]:
                       parent[rx] = ry
                   elif rank[ry] < rank[rx]:
                       parent[ry] = rx
                   else:
                       parent[ry] = rx
                       rank[rx] += 1
    
                   return True
    
               for u,v in edges:
                   if not union(u, v):
                       return [u, v]
    
  3. Number of Provinces (547)

    We can choose to solve this in 2 ways: a union find approach or a bfs-based flood fill approach for connectivity tracking.

    Here’s the union find approach:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    
       class Solution:
           def findCircleNum(self, isConnected: List[List[int]]) -> int:
               """
               We have to just do union find and get the total number of unique parents / roots.
    
               isConnected is about direct connections. This means direct edges between them.
               We can get started.
               """
               n = len(isConnected)
               parent = list(range(n))
               rank = [0] * n
    
               def find(x):
                   while parent[x] != x:
                       parent[x] = parent[parent[x]]
                       x = parent[x]
                   return x
    
               def union(x, y):
                   """
                   Only side-effects matter for us here.
                   """
                   rx, ry = find(x), find(y)
                   if rank[rx] < rank[ry]:
                       parent[rx] = ry
                   elif rank[ry] < rank[rx]:
                       parent[ry] = rx
                   else:
                       parent[ry] = rx
                       rank[rx] += 1
    
               for a in range(n):
                   for b in range(a + 1, n):
                       if isConnected[a][b]:
                           union(a, b)
    
               return len({find(x) for x in range(n)})
    

    Here’s a BFS approach, which is more performant:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    
       from collections import deque
       class Solution:
           def findCircleNum(self, isConnected: List[List[int]]) -> int:
               """
               We can do a connectivity traversal using a BFS flood fill approach, then we can just count the number of islands  / provinces.
               """
               n = len(isConnected)
               provinces = 0
               visited = [0] * n # 0: univisited, 1: visited
    
               # now we run the BFS originating from any unvisited:
               for i in range(n):
                   if not visited[i]:
                       provinces += 1
                       queue = deque([i])
                       visited[i] = 1
                       while queue:
                           node = queue.popleft()
                           for nei in (nei for nei in range(n) if isConnected[node][nei] and not visited[nei]):
                               queue.append(nei)
                               visited[nei] = 1
               return provinces
    

Graph Construction & Cloning #

  • Build or reconstruct graph structure from input data (edge lists, matrices)

Problems #

  1. Clone Graph (133)

    This is just a traversal (any kind will do) and keep track of a mapping between old and new nodes, that way we build a whole new graph with the correct node values. Also remember that this cloning only pertains to the nodes, the edges are just a happy side-effect.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    
        from collections import deque
        from typing import Optional
    
        class Solution:
            def cloneGraph(self, node: Optional['Node']) -> Optional['Node']:
                if not node:
                    return None
    
                queue = deque([node])
                old_to_new = {node: Node(val=node.val)}
    
                while queue:
                    cur = queue.popleft()
                    for nei in cur.neighbors:
                        if nei not in old_to_new:
                            old_to_new[nei] = Node(val=nei.val)
                            queue.append(nei)
                        old_to_new[cur].neighbors.append(old_to_new[nei])
    
                return old_to_new[node]
    
        # Recursive DFS solution for the kicks and giggles:
        class Solution:
            def cloneGraph(self, node: 'Node') -> 'Node':
                if not node:
                    return None
                old_to_new = {}
                def dfs(n):
                    if n in old_to_new:
                        return old_to_new[n]
                    copy = Node(n.val)
                    old_to_new[n] = copy
                    for nei in n.neighbors:
                        copy.neighbors.append(dfs(nei))
                    return copy
                return dfs(node)
    

Grid as Graph Modeling #

  • Represent problems on grids as graph problems focusing on connectivity, traversal, or shortest path.

Problems #

  1. Rotting Oranges (994) ;; DONE

    This is a time-simulation of a flood-fill, where the current hop is the time elapsed.

    This is straightforward, we want to do multisource traversal (from each rotten orange) as a way of doing time-simulation and so we just use a simple BFS traversal. This also just keeps an accumulator aux value (fresh) collected via a pre-processing step for our ease.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    
        from collections import deque
        class Solution:
            def orangesRotting(self, grid: List[List[int]]) -> int:
                ROWS, COLS = len(grid), len(grid[0])
                queue = deque()
                fresh = 0
                # accumulate starts points for the BFS start as well as the aux variable, fresh
                for r in range(ROWS):
                    for c in range(COLS):
                        if grid[r][c] == 2:
                            queue.append((r, c, 0))
                        elif grid[r][c] == 1:
                            fresh += 1
                time = 0
                while queue:
                    r, c, t = queue.popleft()
                    for dr, dc in [(-1,0),(1,0),(0,-1),(0,1)]:
                        nr, nc = r+dr, c+dc
                        if 0 <= nr < ROWS and 0 <= nc < COLS and grid[nr][nc] == 1:
                            grid[nr][nc] = 2
                            fresh -= 1
                            queue.append((nr, nc, t+1))
                            time = t+1
                return -1 if fresh > 0 else time
    
  2. Pacific Atlantic Water Flow (417) ;; DONE

    This is a good reminder that traversal direction should be something that we should take a moment to think about before we make a conclusion. Here, we realise that the better approach is to go from outwards to in (instead of the actual flow of the water). This is similar to the point where we should consider simpler, complementary approaches to the problems.

    This is a reverse-reachability type. It’s also a multisource traversal!

    Also remember to mark things as visited ASAP so that we don’t introduce redundancies in the search space.

        from collections import deque
    
        class Solution:
            def pacificAtlantic(self, heights: List[List[int]]) -> List[List[int]]:
                # early returns for trivial edge case:
                if not heights or not heights[0]:
                    return []
    
                ROWS, COLS = len(heights), len(heights[0])
                DIRS = [(-1,0), (1,0), (0,-1), (0,1)]
    
                def bfs(starts):
                    visited = set(starts)
                    queue = deque(starts)
                    while queue:
                        r, c = queue.popleft()
                        # direct for loop is better (easier to read) than generator then for loop:
                        for dr, dc in DIRS:
                            nr, nc = r + dr, c + dc
                            # is reachable coordinate:
                            if (
                                0 <= nr < ROWS and
                                0 <= nc < COLS and
                                (nr, nc) not in visited and
                                heights[nr][nc] >= heights[r][c]
                            ):
                                # mark as visited before enqueing!
                                visited.add((nr, nc))
                                queue.append((nr, nc))
                    return visited
    
                pacific_starts = [(0, c) for c in range(COLS)] + [(r, 0) for r in range(ROWS)]
                atlantic_starts = [(ROWS-1, c) for c in range(COLS)] + [(r, COLS-1) for r in range(ROWS)]
    
                pacific_reach = bfs(pacific_starts)
                atlantic_reach = bfs(atlantic_starts)
    
                return [ [r, c] for r, c in (pacific_reach & atlantic_reach) ]
    

Bipartite Graph Testing and Coloring #

  • Check if a graph can be colored with two colors with no adjacent vertices sharing the same color
  • Critical for identifying bipartite graphs and detecting odd cycles
  • When we want to track many:many relations between two dimensions (e.g. actors and movies )
  • Core algorithms: BFS or DFS based 2-coloring validation

Problems #

  1. Is Graph Bipartite? (785)

    We are given an adjacency matrix and asked to find out if it’s a bipartite graph, so we just colour it and attempt early returns on failure cases. DFS works well here, BFS will work too.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    53
    
        # DFS graph colouring implementation.
        class Solution:
            def isBipartite(self, graph: List[List[int]]) -> bool:
                # reminders: may be disconnected  ==> just attempt to go through the unvisited ones
                n = len(graph)
                visited = [0] * n # value meaning: 0: unvisited, 1: colour A, 2: colour B
    
                # do an iterative DFS for each unvisited:
                for idx in range(n):
                    if visited[idx] != 0: # visited already, carry on:
                        continue
                    stack = [idx]
                    visited[idx] = 1 # default colour?
    
                    while stack:
                        node_idx = stack.pop()
                        node_colour = visited[node_idx]
    
                        for nei in graph[node_idx]:
                            # case 1: same colour, violation:
                            if node_colour == visited[nei]:
                                return False
                            # case 3: uncoloured, start colouring
                            if visited[nei] == 0:
                                visited[nei] = 1 if node_colour == -1 else -1
                                stack.append(nei)
                            # case 2: different , carry on, do nothing
    
                return True
    
        # BFS implementation
        from collections import deque
    
        class Solution:
            def isBipartite(self, graph: List[List[int]]) -> bool:
                n = len(graph)
                color = [0] * n # 1: colour A, -1: colour B, 0: uncoloured
    
                for start in range(n):
                    if color[start] != 0:
                        continue
                    queue = deque([start])
                    color[start] = 1
    
                    while queue:
                        node = queue.popleft()
                        for nei in graph[node]:
                            if color[nei] == color[node]:
                                return False
                            if color[nei] == 0:
                                color[nei] = -color[node]
                                queue.append(nei)
                return True
    
  2. Possible Bipartition (886)

    This is a simpler version of the typical incompatibility graph colouring kind of question. We just colour and be happy about it, if there’s an attempt to colour something that has already been coloured, then we know that it’s not possible to have a bipartition. This is an undirected graph, so we pretend it’s two directed edges per undirected edge.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    
    
        # BFS check for bipartition (via graph coloring)
        from collections import defaultdict, deque
    
        class Solution:
            def possibleBipartition(self, n: int, dislikes: List[List[int]]) -> bool:
                # build adj list, disliker ==> disliked
                adj = defaultdict(list)
                for a, b in dislikes:
                    adj[a].append(b)
                    adj[b].append(a)
    
                # visited nodes, 0: unvisited, 1: colour A, -1: colour B
                visited = [0] * (n + 1) # just makes it easy to handle 1-indexing
    
                for start_idx in range(1, n + 1):
                    if visited[start_idx]:
                        continue
    
                    queue = deque([start_idx])
    
                    # colour start:
                    visited[start_idx] =  1
    
                    while queue:
                        curr = queue.popleft()
                        curr_color = visited[curr]
    
                        # colour neighbours:
                        for nei in adj[curr]:
                            if visited[nei] == curr_color:
                                return False
    
                            if not visited[nei]: # uncoloured
                                visited[nei] = 1 if curr_color == -1 else -1
                                queue.append(nei)
    
                return True
    

SOE #

  1. Concentration errors:

    • Forget to mark nodes as visited causing infinite loops
    • forget to add elements into aux ds-es
  2. TLE issues happen because of unacceptable search spaces:

    • when we don’t mark a cell as visited BEFORE adding it into the aux ds (stack / queue) because we introduce redundant things in the search space.
  3. Previous misconceptions:

    1. Not every DAG is a tree, a DAG is a superset of a tree. Therefore, to do things like tree-validity checks, we can’t do Kahn’s algo topo-sorting and all that.
  4. Comprehension errors from not slowing down and analysing the question properly

    • Not handling return types for trivial vs error cases properly. They might be different (e.g. empty vs impossible grid)

    • It seems that in most questions, we have to slow down and ask the basic questions. E.g. if graph, what is a node, what is an edge. If directed edge, what does a \(\rightarrow\) b mean.

      this is necessary for the correct parsing of our question information. Else we might make silly errors.

      Take the Kahn’s Algo from the “Course Schedule I” as an example.

      As usual, we need to care about the direction of the edge within the DAG. In this, the topo sort a \(\rightarrow\) b means that we do a then we do b. Therefore, if the prereq is given such that a \(\rightarrow\) b meaning that b needs to be done BEFORE a, then the adjacency list we build (which represents the graph) should be b \(\rightarrow\) a since b has to be taken before a is taken.

      the learning here is that as usual, keep asking about the basics “what’s the source, what’s the destination, what is the adj list for (for topo) and what is the indegrees for (for guiding our order of traversal)”

  5. Incorrect calculation of entry/exit indegrees in topo sort

  6. Union-find path compression and rank updates errors

  7. For Topo sorting, remember that the adjacency list is built such that a \(\rightarrow\) b means that we do a first then b. Don’t mistake the directionality of prerequisites form the directionality of our topo sorting / graph represented as an adj list.

  8. Forgetting to do cycle detection on graphs (e.g. when we apply kahn’s for topo sorting or other things) don’t just assume that it’s acyclic.

  9. when adding in 2 directed edges for an undirected graph’s adj list, there’s a need to check if we need to do an explicit parent tracking or not. This matters especially in the case when we do DFSes. Here’s an example of when we do this:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    
        from collections import defaultdict
    
        class Solution:
            def validTree(self, n: int, edges: List[List[int]]) -> bool:
                # early return based on math: there should be n - 1 edges for a tree with n nodes
                if len(edges) != n - 1:
                    return False    # must be exact number of edges
    
                adj = defaultdict(list)
                for a, b in edges:
                    adj[a].append(b)
                    adj[b].append(a)
    
                visited = set()
                stack = [(0, -1)]  # (current, parent)
    
                while stack:
                    node, parent = stack.pop()
                    if node in visited:
                        return False    # cycle detected
                    visited.add(node)
                    for nei in adj[node]:
                        if nei == parent:
                            continue    # don't backtrack
                        stack.append((nei, node))
    
                return len(visited) == n
    
  10. Union find boiler plate’s find implementation can be recursively or iteratively written, don’t mix the two.

    Wrong:

    1
    2
    3
    4
    5
    
        def find(x):
            """Wrong because this mixes both while loop and a recursive call to find()."""
            while parent[x] != x:
                parent[x] = find(parent[x])
            return parent[x]
    

    instead, either of these approaches would work:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    
             def find(self, x: int) -> int:
                 """
                 Find the root of x with path compression.
                 This implementation of the find function is recursive in nature for the path compression.
                 """
                 if self.parent[x] != x:
                     # Path compression: set parent[x] directly to the root
                     self.parent[x] = self.find(self.parent[x])
                 return self.parent[x]
    
             def find_(self, x: int) -> int:
                 """
                 Iterative implementation for the find function with path compression:
                 """
                 while parent[x] != x:
                     parent[x] = parent[parent[x]]
                     x = parent[x]
    
                return x
    

Decision Flow #

  • Need connectivity components? → DFS/BFS or Union-Find
  • Find orderings under dependencies? → Topological sort
  • Find shortest path unweighted? → BFS

Style, Tricks and Boilerplate #

Boilerplate #

Union Find, Disjoint Sets #

  • For parent and rank tracking, we could use a dictionary as well if we might have different ways to track a single element (e.g. a tuple of (row, col) for coordinates in a grid).

  • essential optimisations are the path compression step when implementing the find function and the rank-based “merging”.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    
       class UnionFind:
           def __init__(self, n: int):
               # Initially, each node is its own parent and its rank (size) is 1
               self.parent = [i for i in range(n)]
               self.rank = [1] * n  # can also track size instead of rank (then just put 0)
    
           def find(self, x: int) -> int:
               """
               Find the root of x with path compression.
               This implementation of the find function is recursive in nature for the path compression.
               """
               if self.parent[x] != x:
                   # Path compression: set parent[x] directly to the root
                   self.parent[x] = self.find(self.parent[x])
               return self.parent[x]
    
           def find_(self, x: int) -> int:
               """
               Iterative implementation for the find function with path compression:
               """
               while parent[x] != x:
                   parent[x] = parent[parent[x]]
                   x = parent[x]
    
              return x
    
           def union(self, x: int, y: int) -> bool:
               """
               Union the sets containing x and y.
               Returns True if merged, False if already in the same set.
               """
               rootX = self.find(x)
               rootY = self.find(y)
    
               if rootX == rootY:
                   return False  # already connected
    
               # Union by rank (attach smaller tree to bigger tree)
               if self.rank[rootX] < self.rank[rootY]:
                   self.parent[rootX] = rootY
               elif self.rank[rootX] > self.rank[rootY]:
                   self.parent[rootY] = rootX
               else:
                   self.parent[rootY] = rootX
                   self.rank[rootX] += 1
    
               return True
    

Bipartite checks #

in this example, we treat it as a graph colouring problem with 2 colours and that’s sufficient for us, here’s a DFS approach:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
class Solution:
    def isBipartite(self, graph: List[List[int]]) -> bool:
        """
        This works for undirected graphs and also unconnected graphs because we start DFS from every node possible.
        """
        color = [0] * len(graph)  # Map node i -> odd=1, even=-1, 0 => unvisited
        stack = []

        for i in range(len(graph)):
            if color[i] != 0: # already coloured
                continue
            color[i] = -1 # set initial state, consider it as a source for the DFS

            stack.append(i)
            while stack: # DFS through
                node = stack.pop()
                for nei in graph[node]:
                    if color[node] == color[nei]: # clashing colours
                        return False
                    elif not color[nei]: # uncoloured, time to colour it
                        stack.append(nei)
                        color[nei] = -1 * color[node] # alternative it

        return True

Cycle Detection #

key problem to solve is: do I have a cycle in my graph?

  • Directed Graph Cycle detection

    • DFS with Recursion Stack Approach (graph coloring)

      • To check if there’s a cycle, we need to consider a general cycle (more than just immediate link-back).

      • This means that we’d need some way to encode 3 states for nodes.

        We can encode (colour) them in any way we want, just need 3 states:

        • Unvisited
        • Visiting (currently, in this branch exploration)
        • Visited (when leaving it)
      • During DFS, if you encounter a node marked as Visiting in the current path, a cycle exists.

      Course Schedule (207) offers an opportunity to solve this via colouring:

       1
       2
       3
       4
       5
       6
       7
       8
       9
      10
      11
      12
      13
      14
      15
      16
      17
      18
      19
      20
      21
      22
      23
      24
      25
      26
      27
      28
      29
      30
      31
      32
      33
      
       # M1: GRAPH COLORING DFS APPROACH (3-colour graph (0: unvisited: 1: visiting 2: visited))
       class Solution:
           def canFinish(self, numCourses: int, prerequisites: List[List[int]]) -> bool:
               # list of lists, access using relative idx
               adj = [[] for _ in range(numCourses)]
      
               # careful: we are ultimately trying to clear the topo sort, so we are building edge from a -> b if a is a prereq of b
               for course, pre in prerequisites:
                   adj[pre].append(course)
      
               # single visited with 3 states, space optimisation!
               visited = [0] * numCourses  # 0 = unvisited, 1 = visiting, 2 = visited
      
               def dfs(node):
                   if visited[node] == 1:  # Cycle!
                       return False
      
                   if visited[node] == 2: # done state, can return early
                       return True
      
                   visited[node] = 1
                   for nei in adj[node]:
                       if not dfs(nei):
                           return False
      
                   visited[node] = 2
      
                   return True
      
               for i in range(numCourses):
                   if not dfs(i):
                       return False
               return True
      
    • Coloring / Mark and Check Approach

      This is the DFS with Recursion stack, just that the mental model we’re using here is that we’re colouring nodes to signal the 3 states.

      Purely just a framing of the approach here.

    • Kahn’s Algorithm (BFS, topo sorting)

      If not all nodes are processed (cannot remove all nodes with in-degree 0), the remaining nodes form a cycle.

      The key idea here is that a Topo sort is only viable when we have a DAG, that’s why if not everything is processed, there must be a cycle.

      We’ve seen this used in Course Schedule I (207)

       1
       2
       3
       4
       5
       6
       7
       8
       9
      10
      11
      12
      13
      14
      15
      16
      17
      18
      19
      20
      21
      22
      23
      24
      25
      26
      27
      28
      29
      30
      
       from collections import defaultdict, deque
      
       class Solution:
           def canFinish(self, numCourses: int, prerequisites: List[List[int]]) -> bool:
               # Build adjacency list and in-degree array
               adj = defaultdict(list)
               # in-degree: For each node, count how many edges point to it = how many prerequisites it has.
               in_degree = [0] * numCourses
      
               for course, pre in prerequisites:
                   adj[pre].append(course)      # prereq → course (edge creation)
                   in_degree[course] += 1       # course needs one more prereq
      
               # 1. Find all courses with no prerequisites. Any element in the queue is "ready to be taken"
               queue = deque(i for i in range(numCourses) if in_degree[i] == 0)
               visited = 0                      # How many courses we've "finished"
      
               # 2. Process all courses "ready" to be taken
               while queue:
                   node = queue.popleft()
                   # count as finished / visited
                   visited += 1
      
                   for neighbor in adj[node]:   # All courses that depend on this one
                       in_degree[neighbor] -= 1 # One prereq is now done!
                       if in_degree[neighbor] == 0: # this neighbor is ready to be taken, we add it into the queue
                           queue.append(neighbor)
      
               # 3. If we managed to "take" all courses, there is no cycle!
               return visited == numCourses
      
  • Undirected Graph Cycle Detection

    • Union-Find Approach Disjoint Set Union

      • Each node starts in its own set.
      • For every edge (to be “added”), check if the two ends have the same parent:
        • If yes, adding the edge would create a cycle.
        • If no, unite their sets (union them).
      • Note: Standard Union-Find does not directly apply to directed graph cycle detection.
    • Back-Edge Checking Using Parent Pointers (DFS, Undirected Graphs)

      • Track the parent node on DFS traversal.
      • If you REVISIT a neighbor that is not the parent of the current node, you have a cycle.

Style #

  1. I like the idea of defining ROWS, COLS, DIRECTIONS at the top and consistently making references to that.
  2. Naming preferences:
    • nei and neis for “neighbour”, “neighbours”
    • careful about not mixing up american and british spellings
  3. Usually time simulations involve a tick-by-tick approach for which BFS is usually a good fit.
  4. when we have to deal with 1-index cases, we can just do dp-like inits with (n + 1) and ignore the 0th index to handle this. That is easier than trying to morph the indices and such.

Tricks #

  1. Traversal direction should be something that we should take a moment to think about. In cases such as the “Pacific Atlantic Water Flow” we realise that the better approach is to go from outwards to in (instead of the actual flow of the water). This is similar to the point where we should consider simpler, complementary approaches to the problems.

    This rings true for ANY graph question really because we should always think about how to represent the problem in the best way (what’s the best representation for nodes and edges…).

  2. TRICK: Memory hack: reuse input grids / spaces instead of spawning new ones.

    we can reuse the input grid for tracking (e.g. in the floodfill approach for “Number of Islands” problem)

  3. TRICK: reduce search space ASAP

    to reduce search spaces, we can ensure that any children added to queues / stacks get marked as visited as we insert into the ds, rather than when we retrieve from the DS.

  4. TRICK: to avoid last-scans for completion, use an aux counter number that we can decrement. IF fully decremented, then the overall state is complete, else it’s an incomplete state.

    e.g. Rotting Oranges problem.

  5. TRICK: intermediate safe states while doing the main job then having a formatting area of the code (like “Surround Regions (130)”)

    We can mark intermediate safe states to help us. Here’s it’s the ‘*’ that we temporarily tag.

    It’s interesting how there’s an intermediate marking state which we then change it back once we have our answers.

  6. we could do topo sorting on a combined supergraph, such as in “Sort Items by Groups Respecting Dependencies (1203)”.

    What we did here is that we created a whole new graph called G with it’s indegree. The nodes within these graph were from groups / items and it didn’t matter because we treated them as the same. This is what steps 3 and 4 are about in the solution below.

    Then we ran the same Kahn’s Algo (step 5) and checked for cycles. Finally we just filtered out the actual nodes (ignoring the groups).

    The supergraphs solution is just tedius, but not impossible to come up with.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    53
    54
    55
    56
    57
    58
    59
    60
    61
    62
    63
    64
    65
    66
    67
    68
    69
    70
    71
    72
    73
    74
    75
    76
    77
    78
    79
    80
    81
    82
    83
    84
    
        from collections import defaultdict, deque
        from typing import List
    
        class Solution:
            def sortItems(self, n: int, m: int, group: List[int],
                          beforeItems: List[List[int]]) -> List[int]:
                """
                Single-graph approach for 1203.
                Instead of 2 separate topo sorts (items, then groups),
                we build one unified DAG containing:
    ​                - virtual group nodes
    ​                - item nodes
                and run ONE topological sort on all of them.
                """
    
                # --- STEP 1: Assign unique group IDs to ungrouped items ---
                # Problem states group[i] == -1 if no group. We give these items
                # their own 'singleton' pseudo-group to enforce adjacency.
                new_group_id = m
                for i in range(n):
                    if group[i] == -1:
                        group[i] = new_group_id
                        new_group_id += 1
                total_groups = new_group_id  # updated count after assignment
    
                # --- STEP 2: Node indexing scheme ---
                # 0..n-1       → item nodes
                # n..n+total_groups-1 → virtual group nodes
                def group_node(g_id: int) -> int:
                    """ Map a group ID to its virtual node index in the graph """
                    return n + g_id
    
                # --- Data structures for graph + indegree ---
                G = defaultdict(list)   # adjacency list
                indeg = defaultdict(int)  # indegree dictionary
    
                # Initialize indegrees to zero for all nodes
                all_nodes = list(range(n)) + [group_node(g) for g in range(total_groups)]
                for node in all_nodes:
                    indeg[node] = 0
    
                # --- STEP 3: Connect each group node to the items it contains ---
                # This ensures items of the same group appear *after* the group's
                # node in topo order, making them contiguous in the final output.
                for item in range(n):
                    gnode = group_node(group[item])
                    G[gnode].append(item)
                    indeg[item] += 1
    
                # --- STEP 4: Add edges for dependencies ---
                # For every beforeItems[i], item i must appear AFTER prev_item if they're from the same group
                # if they're from different groups then the ordering reflects the ordering of their groups.
                for item in range(n):
                    for prev_item in beforeItems[item]:
                        if group[item] == group[prev_item]:
                            # Same group → direct dependency between items
                            G[prev_item].append(item)
                            indeg[item] += 1
                        else:
                            # Different groups → dependency between group nodes
                                G[group_node(group[prev_item])].append(group_node(group[item]))
                           indeg[group_node(group[item])] += 1
    
                # --- STEP 5: Kahn's Algorithm (BFS Topological Sort) ---
                q = deque([node for node in all_nodes if indeg[node] == 0])
                topo_order = []
    
                while q:
                    u = q.popleft()
                    topo_order.append(u)
                    for v in G[u]:
                        indeg[v] -= 1
                        if indeg[v] == 0:
                            q.append(v)
    
                # --- STEP 6: Check for cycles ---
                # A cycle anywhere (items or groups) means no valid order exists.
                if len(topo_order) != len(all_nodes):
                    return []
    
                # --- STEP 7: Extract only items from topo_order ---
                # Group nodes were just helpers to enforce contiguous grouping.
                result = [node for node in topo_order if node < n]
                return result
    

TODO KIV #

TODO questions from existing canonicals #

[ ] Traversal::Teleportation family #

TODO new canonicals #

[ ] Edit distance canonicals #