Topic 11: Graphs

Graph algorithms are largely categorized into traversal, shortest paths, connectivity components, and cycle/dependency detection.

Relations, when well represented as graphs can be exploited to solve some common problems.

Canonical Question Types #

Graph Traversal (DFS & BFS) #

Explore graph components, find paths, count connected or island areas. Traversal could require some basic statistics / counting or something.
- Flood filling is a great mental model to keep in mind for this.
- Keep in mind which direction to traverse, sometimes the easier way is to do a reverse-reachability approach.

Problems #

Number of Islands (200) ;; DONE

It’s a flood fill from all possible islands and trying to reach the others. As a memory hack, just reuse the input for visited.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
    # BFS approach to flood filling
    from collections import deque

    class Solution:
        def numIslands(self, grid: List[List[str]]) -> int:
            ROWS, COLS = len(grid), len(grid[0])
            DIRS = [(-1, 0), (1, 0), (0, 1), (0, -1)]

            if not ROWS or not COLS:
                return 0

            count = 0
            # space trick: reuse input grid for visited tracking
            # helper BFS:
            def bfs(r, c):
                queue = deque([(r,c)])
                grid[r][c] = '0' # mark first as visited

                while queue:
                    row, col = queue.popleft()
                    for dr, dc in DIRS:
                        nr, nc = row + dr, col + dc
                        if is_valid_nei:=(0 <= nr < ROWS and 0 <= nc < COLS and grid[nr][nc] == '1'):
                            grid[nr][nc] = '0' # preemptively mark to reduce search spaces, before adding to queue
                            queue.append((nr, nc))

                return

            for r in range(ROWS):
                for c in range(COLS):
                    if is_island:=(grid[r][c] == '1'):
                        bfs(r, c)
                        count += 1

            return count

    # DFS approach to flood filling: this is a recursive DFS that we've done here
    class Solution:
        def numIslands(self, grid: List[List[str]]) -> int:
            ROWS, COLS = len(grid), len(grid[0])
            def dfs(r, c):
                if not (0 <= r < ROWS and 0 <= c < COLS) or grid[r][c] == '0':
                    return
                grid[r][c] = '0'
                for dr, dc in [ (0,1), (0,-1), (1,0), (-1,0) ]:
                    dfs(r+dr, c+dc)
            count = 0
            for r in range(ROWS):
                for c in range(COLS):
                    if grid[r][c] == '1':
                        count += 1
                        dfs(r, c)
            return count

There’s a few viable alternatives to solving this, such as using a union find approach (note the use of a dict because we wanna track tuples (for the cell coordinate)). For this, note how we only move RIGHT/DOWN for each land cell because we want to process each pair of adjacent land cells (basically each edge) once. By restricting it to only right and down neighbours, each pair is processed exactly once from one side, preventing redundant union operations and improving efficiency.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
    class Solution:
        def numIslands(self, grid: List[List[str]]) -> int:
            rows, cols = len(grid), len(grid[0])
            parent = {}
            rank = {}

            def find(x):
                if parent[x] != x:
                    parent[x] = find(parent[x])  # path compression
                return parent[x]

            def union(x, y):
                rootX, rootY = find(x), find(y)
                if rootX != rootY:
                    if rank[rootX] > rank[rootY]:
                        parent[rootY] = rootX
                    elif rank[rootX] < rank[rootY]:
                        parent[rootX] = rootY
                    else:
                        parent[rootY] = rootX
                        rank[rootX] += 1

            # Initialize parent and rank for land cells
            for r in range(rows):
                for c in range(cols):
                    if grid[r][c] == '1':
                        parent[(r, c)] = (r, c)
                        rank[(r, c)] = 0

            # Union adjacent land cells (right and down to avoid duplicates)
            for r in range(rows):
                for c in range(cols):
                    if grid[r][c] == '1':
                        for dr, dc in [(0, 1), (1, 0)]:
                            nr, nc = r + dr, c + dc
                            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == '1':
                                union((r, c), (nr, nc))

            # Count distinct roots representing islands
            return len(set(find(x) for x in parent))

Number of Provinces (547)

We can choose to solve this in 2 ways: a union find approach or a bfs-based flood fill approach for connectivity tracking.

Here’s the union find approach:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
   class Solution:
       def findCircleNum(self, isConnected: List[List[int]]) -> int:
           """
           We have to just do union find and get the total number of unique parents / roots.

           isConnected is about direct connections. This means direct edges between them.
           We can get started.
           """
           n = len(isConnected)
           parent = list(range(n))
           rank = [0] * n

           def find(x):
               while parent[x] != x:
                   parent[x] = parent[parent[x]]
                   x = parent[x]
               return x

           def union(x, y):
               """
               Only side-effects matter for us here.
               """
               rx, ry = find(x), find(y)
               if rank[rx] < rank[ry]:
                   parent[rx] = ry
               elif rank[ry] < rank[rx]:
                   parent[ry] = rx
               else:
                   parent[ry] = rx
                   rank[rx] += 1

           for a in range(n):
               for b in range(a + 1, n):
                   if isConnected[a][b]:
                       union(a, b)

           return len({find(x) for x in range(n)})

Here’s a BFS approach, which is more performant:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
   from collections import deque
   class Solution:
       def findCircleNum(self, isConnected: List[List[int]]) -> int:
           """
           We can do a connectivity traversal using a BFS flood fill approach, then we can just count the number of islands  / provinces.
           """
           n = len(isConnected)
           provinces = 0
           visited = [0] * n # 0: univisited, 1: visited

           # now we run the BFS originating from any unvisited:
           for i in range(n):
               if not visited[i]:
                   provinces += 1
                   queue = deque([i])
                   visited[i] = 1
                   while queue:
                       node = queue.popleft()
                       for nei in (nei for nei in range(n) if isConnected[node][nei] and not visited[nei]):
                           queue.append(nei)
                           visited[nei] = 1
           return provinces

Max Area of Island (695) ;; DONE This is just accumulating statistics on a flood fill approach to traversing islands. There’s a bunch of ways we can do this: flood fill (BFS/DFS) or via a union find. BFS is my favourite in general.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
   # M1: FLOOD FILL BFS
    from collections import deque

    class Solution:
        def maxAreaOfIsland(self, grid: List[List[int]]) -> int:
            if not grid or not grid[0]:
                return 0

            area = 0

            ROWS, COLS = len(grid), len(grid[0])
            directions = [(0, 1), (0, -1), (1, 0), (-1, 0)]

            for r in range(ROWS):
                for c in range(COLS):
                    if grid[r][c] == 1: # we start a BFS for more land-cells.
                        curr_area = 0
                        # use as root and carry on:
                        queue = deque([(r, c)])
                        grid[r][c] = 0 # space hack: use the input and mark as visited
                        curr_area += 1
                        while queue:
                            r, c = queue.popleft()
                            valid_children = ((r + dr, c + dc) for dr, dc in directions if 0 <= r + dr < ROWS and 0 <= c + dc < COLS and  grid[r + dr][c + dc] == 1 )

                            for child_row, child_col in valid_children:
                                # update accumulators (curr_area, visited) before adding to the queue
                                grid[child_row][child_col] = 0
                                curr_area += 1

                                # enqueue
                                queue.append((child_row, child_col))

                        area = max(area, curr_area) # helps us find the biggest island in the grid

            return area

    # M2: Flood fill DFS
    class Solution:
        def maxAreaOfIsland(self, grid: List[List[int]]) -> int:
            max_area = 0
            rows, cols = len(grid), len(grid[0])
            for r in range(rows):
                for c in range(cols):
                    if grid[r][c] == 1:
                        stack = [(r, c)]
                        grid[r][c] = 0
                        area = 1 # local area, for this island's measure
                        while stack:
                            cr, cc = stack.pop()
                            for dr, dc in [(-1,0),(1,0),(0,-1),(0,1)]:
                                nr, nc = cr+dr, cc+dc
                                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 1:
                                    grid[nr][nc] = 0
                                    stack.append((nr, nc))
                                    area += 1
                        max_area = max(max_area, area)
            return max_area

    # M3: union find but it's complex and ill-fitted for this solution
    class DSU:
        def __init__(self):
            self.parent = {}
            self.size = {} # same concept as rank, we use a dict here because the keys will be coordinate tuples
        def find(self, x):
            if self.parent[x] != x:
                self.parent[x] = self.find(self.parent[x])
            return self.parent[x]
        def union(self, x, y):
            """
            Return value here doesn't matter, we care about the side-effect only.
            """
            xr, yr = self.find(x), self.find(y)
            if xr != yr:
                self.parent[xr] = yr
                self.size[yr] += self.size[xr]
        def add(self, x):
            if x not in self.parent:
                self.parent[x] = x
                self.size[x] = 1

    # Inside solution, union adjacent '1' cells as you read the grid,
    # then return max(dsu.size.values()) as the answer if any, else 0.

Flood Fill (733)

This is pretty straightforward easy question, just need to know the early return case when the source is already coloured and there’s no need to colour anything (including neis)

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
    from collections import deque

    class Solution:
        def floodFill(self, image: List[List[int]], sr: int, sc: int, color: int) -> List[List[int]]:
            ROWS, COLS = len(image), len(image[0])
            DIRS = [(0, 1), (0, -1), (-1, 0), (1, 0)]
            start_color = image[sr][sc]
            if start_color == color:
                return image

            queue = deque([(sr, sc)])
            while queue:
                r, c = queue.popleft()

                image[r][c] = color
                for nr, nc in ((r + dr, c + dc) for dr, dc in DIRS if (0 <= (r + dr) < ROWS) and (0 <= (c + dc) < COLS)):
                    if image[nr][nc] != start_color:
                        continue
                    image[nr][nc] = color
                    queue.append((nr, nc))
            return image

Walls and Gates (??) (neetcode version) ;; DONE

This is just distance tracking, we can go layer by layer and get things done. The aux ds (queue / stack) can hold tuples to keep track of parent height. Oh and it’s a multi-source BFS that happens at the same time.

This is a case of multi-source BFS really. Nothing special here, we just need to write it in-place and have nothing to return.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
    from collections import deque

    class Solution:
        def islandsAndTreasure(self, grid: List[List[int]]) -> None:
            if not grid:
                return

            queue = deque()
            ROWS, COLS = len(grid), len(grid[0])
            DIRS = [(0, -1), (0, 1), (1, 0), (-1, 0)]
            INF = 2147483647

            # 1: add all gates to queue at distance 0 (multiple source BFS, simultaneously run them)
            for r, c in ((r, c) for r in range(ROWS) for c in range(COLS) if grid[r][c] == 0):
                # is treasure:
                queue.append((r, c))

            while queue:
                r, c = queue.popleft()
                candidates = ((r + dr, c + dc) for dr, dc in DIRS if (0 <= r + dr < ROWS) and (0 <= c + dc < COLS) and grid[r + dr][c + dc] == INF)
                for row, col in candidates:
                    grid[row][col] = grid[r][c] + 1
                    queue.append((row, col))

Extension: if the cost of traversing an edge from node to its neighbors had a costs (positive) then we would have to consider doing some kinda of Dijkstra’s approach. In this case a simple BFS works perfectly.

Rotting Oranges (994) ;; DONE

This is a time-simulation of a flood-fill, where the current hop is the time elapsed.

This is straightforward, we want to do multisource traversal (from each rotten orange) as a way of doing time-simulation and so we just use a simple BFS traversal. This also just keeps an accumulator aux value (fresh) collected via a pre-processing step for our ease.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
    from collections import deque
    class Solution:
        def orangesRotting(self, grid: List[List[int]]) -> int:
            ROWS, COLS = len(grid), len(grid[0])
            queue = deque()
            fresh = 0
            # accumulate starts points for the BFS start as well as the aux variable, fresh
            for r in range(ROWS):
                for c in range(COLS):
                    if grid[r][c] == 2:
                        queue.append((r, c, 0))
                    elif grid[r][c] == 1:
                        fresh += 1
            time = 0
            while queue:
                r, c, t = queue.popleft()
                for dr, dc in [(-1,0),(1,0),(0,-1),(0,1)]:
                    nr, nc = r+dr, c+dc
                    if 0 <= nr < ROWS and 0 <= nc < COLS and grid[nr][nc] == 1:
                        grid[nr][nc] = 2
                        fresh -= 1
                        queue.append((nr, nc, t+1))
                        time = t+1
            return -1 if fresh > 0 else time

Pacific Atlantic Water Flow (417) ;; DONE

This is a good reminder that traversal direction should be something that we should take a moment to think about before we make a conclusion. Here, we realise that the better approach is to go from outwards to in (instead of the actual flow of the water). This is similar to the point where we should consider simpler, complementary approaches to the problems.

This is a reverse-reachability type. It’s also a multisource traversal!

Also remember to mark things as visited ASAP so that we don’t introduce redundancies in the search space.

    from collections import deque

    class Solution:
        def pacificAtlantic(self, heights: List[List[int]]) -> List[List[int]]:
            # early returns for trivial edge case:
            if not heights or not heights[0]:
                return []

            ROWS, COLS = len(heights), len(heights[0])
            DIRS = [(-1,0), (1,0), (0,-1), (0,1)]

            def bfs(starts):
                visited = set(starts)
                queue = deque(starts)
                while queue:
                    r, c = queue.popleft()
                    # direct for loop is better (easier to read) than generator then for loop:
                    for dr, dc in DIRS:
                        nr, nc = r + dr, c + dc
                        # is reachable coordinate:
                        if (
                            0 <= nr < ROWS and
                            0 <= nc < COLS and
                            (nr, nc) not in visited and
                            heights[nr][nc] >= heights[r][c]
                        ):
                            # mark as visited before enqueing!
                            visited.add((nr, nc))
                            queue.append((nr, nc))
                return visited

            pacific_starts = [(0, c) for c in range(COLS)] + [(r, 0) for r in range(ROWS)]
            atlantic_starts = [(ROWS-1, c) for c in range(COLS)] + [(r, COLS-1) for r in range(ROWS)]

            pacific_reach = bfs(pacific_starts)
            atlantic_reach = bfs(atlantic_starts)

            return [ [r, c] for r, c in (pacific_reach & atlantic_reach) ]

Surrounded Regions (130) ;; DONE

This one is about creating temporary marks as we navigate then restoring thing as part of a formatting step.

This is also a reverse-reachability type of question because we actually find the inverse (safe cells) that we can reach from the border cells since these safe cells won’t be converted. This will help us find the actual unsafe cells that we will convert.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
    from collections import deque
    from typing import List

    class Solution:
        def solve(self, board: List[List[str]]) -> None:
            if not board or not board[0]:
                return

            ROWS, COLS = len(board), len(board[0])
            DIRECTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]

            def get_border_o_positions():
                """
                Returns a set of coordinates for all the Os that are at the borders
                """
                # All 'O's on the border (first/last row or column)
                top_bottom = ((row, col) for col in range(COLS) for row in (0, ROWS-1) if board[row][col] == 'O')
                left_right = ((row, col) for row in range(ROWS) for col in (0, COLS-1) if board[row][col] == 'O')
                return set(top_bottom) | set(left_right)

            def mark_border_connected_region_as_safe():
                border_o_positions = get_border_o_positions()
                queue = deque()
                for r, c in border_o_positions:
                    # mark as visited before enqueing the start coordinates:
                    board[r][c] = '*'
                    queue.append((r, c))
                while queue:
                    r, c = queue.popleft()
                    for dr, dc in DIRECTIONS:
                        nr, nc = r + dr, c + dc
                        if (0 <= nr < ROWS and 0 <= nc < COLS and board[nr][nc] == 'O'):
                            board[nr][nc] = '*'
                            queue.append((nr, nc))

            mark_border_connected_region_as_safe()

            # List comprehensions for postprocessing
            for r, c in ((r, c) for r in range(ROWS) for c in range(COLS)):
                if board[r][c] == 'O':
                    board[r][c] = 'X'
                elif board[r][c] == '*':
                    board[r][c] = 'O'

Minimum Jumps to Reach End via Prime Teleportation (3629)

Main idea is that we need to do a bunch of preprocessing to make the BFS efficient. The idea therefore is to do both prime sieving AND keeping track of prime factors.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
    from math import sqrt

    class Solution:
        def minJumps(self, nums: List[int]) -> int:
            n = len(nums)
            max_val = max(nums)

            def sieve(limit):
                """
                Returns a list of prime numbers within limit (inclusive)
                """
                is_prime = [True] * (limit + 1) # effectively makes this 1-indexed
                is_prime[0], is_prime[1] = False, False
                # remember we only need to do till sqrt of the limit
                for num in range(2, int(sqrt(limit) + 1)):
                    if is_prime[num]:
                        # negate its multiples:
                        for j in range(num * num, limit + 1, num):
                            is_prime[j] = False

                return is_prime

            is_prime = sieve(max_val) # is the reference prime-sieve

            def prime_factors(num):
                """
                Returns a set of prime factors for a number:
                """
                factors = set()
                temp = num
                # prime factorisation
                for i in range(2, int(sqrt(num)) + 1):
                    if not is_prime[i]:
                        continue

                    while temp % i == 0:
                        factors.add(i)
                        temp //= i

                    if temp == 1:
                        break

                if temp > 1: # then temp is a prime too
                    factors.add(temp)

                return factors

            # creates a map from prime factor to list of indices in nums with number divisible by that prime, these are the teleport indices
            prime_to_indices = defaultdict(list)
            for idx, num in enumerate(nums):
                factors = prime_factors(num)
                for f in factors:
                    prime_to_indices[f].append(idx)


            # now we do the BFS with hop counting:
            visited = [False] * n
            visited[0] = True
            visited_prime = set() # track prime whose teleportations are visited
            q = deque([(0, 0)]) # (idx, hops)
            while q:
                idx, hops = q.popleft()

                # reached the end:
                if idx == n - 1:
                    return hops

                # adjacent moves:
                for nei in [idx - 1, idx + 1]:
                    if 0 <= nei < n and not visited[nei]:
                        visited[nei] = True
                        q.append((nei, hops + 1))

                # teleports:
                num = nums[idx]
                if is_prime[num] and num not in visited_prime:
                    visited_prime.add(num)
                    for nei in prime_to_indices[num]:
                        if not visited[nei]:
                            visited[nei] = True
                            q.append((nei, hops + 1))

            return -1

The main part is that if we have the possibilities lookup via something like this, then we can use it for our BFS and then the least hop reach to the final cell is our answer that we seek.

1
2
3
4
5
6
            # creates a map from prime factor to list of indices in nums with number divisible by that prime, these are the teleport indices
            prime_to_indices = defaultdict(list)
            for idx, num in enumerate(nums):
                factors = prime_factors(num)
                for f in factors:
                    prime_to_indices[f].append(idx)

Find if Path Exists in Graph (1971)

ended up being a basic traversal to check for reachability from source to destination, given edges.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
    from collections import defaultdict, deque
    class Solution:
        def validPath(self, n: int, edges: List[List[int]], source: int, destination: int) -> bool:
            """
            We could directly do a graph traversal.
            """
            adj = defaultdict(list)
            for u, v in edges:
                adj[u].append(v)
                adj[v].append(u)

            visited = set()
            q = deque([source])
            while q:
                node = q.popleft()
                if node == destination:
                    return True
                for nei in adj[node]:
                    if nei not in visited:
                        visited.add(nei)
                        q.append(nei)

            return False

Shortest Path (Unweighted Graph) #

BFS-based shortest path finding in unweighted graphs (fixed cost). If there were costs involved then we could explore a Bellman-Ford / Dijkstra’s approach.

Problems #

Word Ladder (127)

This is a shortest path problem that feels like an edit distance kind, but it’s a graph traversal. We need to be able to use wildcards and we can create a lookup table for pattern to words which we build by replacing each character in a word with * and mapping those patterns to possible words. Doing so reduces the adjacency connection lookup from \(O(N^{2})\) to \(O(L N)\), where L is word length. Also take note that we only want to reach each node (pattern) once in the overall procedure, so we silence it after we reach it.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
    from collections import defaultdict, deque

    class Solution:
        def ladderLength(self, beginWord: str, endWord: str, wordList: List[str]) -> int:
            if endWord not in wordList:
                return 0

            L = len(beginWord)

            # better mapping:
            pattern_to_words = defaultdict(list)
            for word in wordList:
                for i in range(L):
                    pattern = word[:i] + "*" + word[i + 1:]
                    pattern_to_words[pattern].append(word)

            # now we run BFS:
            queue = deque([(beginWord, 1)]) # word, level
            visited = {beginWord}
            while queue:
                word, level = queue.popleft()
                for i in range(L):
                    pattern = word[:i] + "*" + word[i + 1:]
                    for nei in pattern_to_words[pattern]:
                        if nei == endWord:
                            return level + 1
                        if nei not in visited:
                            visited.add(nei)
                            queue.append((nei, level + 1))

                    # NB: prevents repeat lookups, we silence that pattern check so that each pattern is used only once per search.
                    pattern_to_words[pattern] = []

            return 0

In some ways, this question involves a graph expansion / modification as well. This is implicit in our use of pattern_to_words.

Topological Sort & Cycle Detection #

Detect cycles in directed or undirected graphs and generate valid node orderings in DAGs

Problems #

Course Schedule (207) ;; DONE

This is a simpler version of the problem that just expects a boolean return on whether we can do a topo sort (i.e. whether it’s an acyclic DAG). There’s 2 ways to do this: graph colouring (to ensure no cycles) or just straight up doing kahn’s algo:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
    # M1: GRAPH COLORING DFS APPROACH (3-colour graph (0: unvisited: 1: visiting 2: visited))
    class Solution:
        def canFinish(self, numCourses: int, prerequisites: List[List[int]]) -> bool:
            # list of lists, access using relative idx
            adj = [[] for _ in range(numCourses)]

            # careful: we are ultimately trying to clear the topo sort, so we are building edge from a -> b if a is a prereq of b
            for course, pre in prerequisites:
                adj[pre].append(course)

            # single visited with 3 states, space optimisation!
            visited = [0] * numCourses  # 0 = unvisited, 1 = visiting, 2 = visited

            def dfs(node):
                if visited[node] == 1:  # Cycle!
                    return False

                if visited[node] == 2: # done state, can return early
                    return True

                visited[node] = 1
                for nei in adj[node]:
                    if not dfs(nei):
                        return False

                visited[node] = 2

                return True

            for i in range(numCourses):
                if not dfs(i):
                    return False
            return True

    # M2: Kahn's algo used to do cycle-checks:
    from collections import defaultdict, deque

    class Solution:
        def canFinish(self, numCourses: int, prerequisites: List[List[int]]) -> bool:
            # Build adjacency list and in-degree array
            adj = defaultdict(list)
            # in-degree: For each node, count how many edges point to it = how many prerequisites it has.
            in_degree = [0] * numCourses

            for course, pre in prerequisites:
                adj[pre].append(course)      # prereq → course (edge creation)
                in_degree[course] += 1       # course needs one more prereq

            # 1. Find all courses with no prerequisites. Any element in the queue is "ready to be taken"
            queue = deque(i for i in range(numCourses) if in_degree[i] == 0)
            visited = 0                      # How many courses we've "finished"

            # 2. Process all courses "ready" to be taken
            while queue:
                node = queue.popleft()
                # count as finished / visited
                visited += 1

                for neighbor in adj[node]:   # All courses that depend on this one
                    in_degree[neighbor] -= 1 # One prereq is now done!
                    if in_degree[neighbor] == 0: # this neighbor is not ready to be taken
                        queue.append(neighbor)

            # 3. If we managed to "take" all courses, there is no cycle!
            return visited == numCourses

Course Schedule II (210) ;; DONE

This is a natural extension where we need to generate the actual topologically sorted array of nodes. Similarly, we can use Kahn’s Algo or do a DFS-based graph colouring for our purpose (just do a post-order inclusion). For the DFS approach, remember that the way we choose to mark visited will end up affecting whether or not we might need to reverse the schedule. I just prefer that Kahn’s Algo.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
    # M1: Using Kahn's Algo
    from collections import defaultdict
    class Solution:
        def findOrder(self, numCourses: int, prerequisites: List[List[int]]) -> List[int]:
            adj = defaultdict(list)

            indegree = [0] * numCourses

            # calculate the indegrees:
            for course, pre in prerequisites:
                adj[pre].append(course)
                indegree[course]+= 1

            # find no prereqs:
            # ready-queue
            queue = deque([course for course in range(numCourses) if indegree[course] == 0])

            schedule = [] # NB: we can use the length of this as "num visited" to check for cycle, no aux counter variable necessary.

            while queue:
                node = queue.popleft()
                schedule.append(node)

                for neighbor in adj[node]:
                    indegree[neighbor] -= 1
                    # neighbour ready to be taken:
                    if indegree[neighbor] == 0:
                        queue.append(neighbor)

            # if cycle found:
            if not len(schedule) == numCourses:
                return []

            return schedule

    # M2: here's the DFS graph colouring approach:
    class Solution:
        def findOrder(self, numCourses, prerequisites):
            # Preprep:
            # NOTE: maps prereq -> course
            adj = [[] for _ in range(numCourses)]
            for course, pre in prerequisites:
                adj[pre].append(course)
            state = [0] * numCourses  # 0=unvisited, 1=visiting, 2=visited
            order = []
            def dfs(node):
                # detected cycle in the same round!
                if state[node] == 1:
                    return False
                if state[node] == 2:
                    return True

                # mark new as visiting:
                state[node] = 1
                for nei in adj[node]:
                    # explore depth in neighbours:
                    if not dfs(nei):
                        return False
                # finally, mark as visited -- this node is ready to be taken
                state[node] = 2
                # mark this course as taken
                order.append(node)

                return True

            for i in range(numCourses):
                # if unvisited:
                if state[i] == 0:
                    if not dfs(i):
                        return []

            return order[::-1] # remember to reverse because it's a postorder accumulation that we have done here.

Graph Valid Tree (??) (neetcode link) ;; DONE (connected + no cycle)]

This is a traversal on an undirected graph to check if it’s a tree. We can use the basic math properties for early returns (n - 1 edges in a tree for a tree of n nodes). In this case, we also handle the undirected nature by splitting it into two directed edges (we also need to do the parent tracking for this!). This is really just a cycle check.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
    from collections import defaultdict

    class Solution:
        def validTree(self, n: int, edges: List[List[int]]) -> bool:
            # early return based on math: there should be n - 1 edges for a tree with n nodes
            if len(edges) != n - 1:
                return False    # must be exact number of edges

            adj = defaultdict(list)
            for a, b in edges:
                adj[a].append(b)
                adj[b].append(a)

            visited = set()
            stack = [(0, -1)]  # (current, parent)

            while stack:
                node, parent = stack.pop()
                if node in visited:
                    return False    # cycle detected
                visited.add(node)
                for nei in adj[node]:
                    if nei == parent:
                        continue    # don't backtrack
                    stack.append((nei, node))

            return len(visited) == n

Sort Items by Groups Respecting Dependencies (1203)

This is a higher order question whereby we need to topo sort by 2 dimensions: topo sorting within items, topo sorting of groups. This gives us a 2-phased approach.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
    from collections import defaultdict, deque
    from typing import List

    class Solution:
        def sortItems(self, n: int, m: int, group: List[int], beforeItems: List[List[int]]) -> List[int]:
            # Step 0: Assign unique group IDs to ungrouped items, keep them in unique groups
            new_gid = m
            for i in range(n):
                if group[i] == -1:
                    group[i] = new_gid
                    new_gid += 1

            # Step 1: Prepare adjacency lists + indegrees
            group_adj, group_indeg = defaultdict(list), defaultdict(int)
            item_adj, item_indeg = defaultdict(list), defaultdict(int)

            # Step 2: Build both graphs
            for item in range(n):
                for prev_item in beforeItems[item]:
                    # Build item dependency graph
                    item_adj[prev_item].append(item)
                    item_indeg[item] += 1

                    # Build group dependency graph, only if they belong to different groups
                    if group[prev_item] != group[item]:
                        src, dest = group[prev_item], group[item]
                        group_adj[src].append(dest)
                        group_indeg[dest] += 1

            # Step 3: Topological sort helper
            def topo(nodes, adj, indeg):
                q = deque(x for x in nodes if indeg[x] == 0)
                res = []
                while q:
                    u = q.popleft()
                    res.append(u)
                    for nei in adj[u]:
                        indeg[nei] -= 1
                        if indeg[nei] == 0:
                            q.append(nei)
                return res if len(res) == len(nodes) else []

            # Step 4: Get group order & item order
            group_nodes = list(range(new_gid))
            group_order = topo(group_nodes, group_adj, group_indeg.copy()) # GOTCHA: remember to copy over else the reference indegree may be mutated
            if not group_order:
                return []

            item_nodes = list(range(n))
            items_order = topo(item_nodes, item_adj, item_indeg.copy())
            if not items_order:
                return []

            # Step 5: Arrange items by group according to items_order
            group_to_items = defaultdict(list)
            for item in items_order:
                group_to_items[group[item]].append(item)

            # Step 6: Build the final result
            res = []
            for g in group_order:
                res.extend(group_to_items[g])
            return res

Another big-brain approach is to use an augmented graph for this (so we have a supergraph) but this hurts my head.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
    from collections import defaultdict, deque
    from typing import List

    class Solution:
        def sortItems(self, n: int, m: int, group: List[int],
                      beforeItems: List[List[int]]) -> List[int]:
            """
            Single-graph approach for 1203.
            Instead of 2 separate topo sorts (items, then groups),
            we build one unified DAG containing:
​                - virtual group nodes
​                - item nodes
            and run ONE topological sort on all of them.
            """

            # --- STEP 1: Assign unique group IDs to ungrouped items ---
            # Problem states group[i] == -1 if no group. We give these items
            # their own 'singleton' pseudo-group to enforce adjacency.
            new_group_id = m
            for i in range(n):
                if group[i] == -1:
                    group[i] = new_group_id
                    new_group_id += 1
            total_groups = new_group_id  # updated count after assignment

            # --- STEP 2: Node indexing scheme ---
            # 0..n-1       → item nodes
            # n..n+total_groups-1 → virtual group nodes
            def group_node(g_id: int) -> int:
                """ Map a group ID to its virtual node index in the graph """
                return n + g_id

            # --- Data structures for graph + indegree ---
            G = defaultdict(list)   # adjacency list
            indeg = defaultdict(int)  # indegree dictionary

            # Initialize indegrees to zero for all nodes
            all_nodes = list(range(n)) + [group_node(g) for g in range(total_groups)]
            for node in all_nodes:
                indeg[node] = 0

            # --- STEP 3: Connect each group node to the items it contains ---
            # This ensures items of the same group appear *after* the group's
            # node in topo order, making them contiguous in the final output.
            for item in range(n):
                gnode = group_node(group[item])
                G[gnode].append(item)
                indeg[item] += 1

            # --- STEP 4: Add edges for dependencies ---
            # For every beforeItems[i], item i must appear AFTER prev_item.
            for item in range(n):
                for prev_item in beforeItems[item]:
                    if group[item] == group[prev_item]:
                        # Same group → direct dependency between items
                        G[prev_item].append(item)
                        indeg[item] += 1
                    else:
                        # Different groups → dependency between group nodes
                        G[group_node(group[prev_item])].append(group_node(group[item]))
                        indeg[group_node(group[item])] += 1

            # --- STEP 5: Kahn's Algorithm (BFS Topological Sort) ---
            q = deque([node for node in all_nodes if indeg[node] == 0])
            topo_order = []

            while q:
                u = q.popleft()
                topo_order.append(u)
                for v in G[u]:
                    indeg[v] -= 1
                    if indeg[v] == 0:
                        q.append(v)

            # --- STEP 6: Check for cycles ---
            # A cycle anywhere (items or groups) means no valid order exists.
            if len(topo_order) != len(all_nodes):
                return []

            # --- STEP 7: Extract only items from topo_order ---
            # Group nodes were just helpers to enforce contiguous grouping.
            result = [node for node in topo_order if node < n]
            return result

Union-Find Disjoint Sets #

Manage dynamic connectivity, detect cycles, count components efficiently

Problems #

Number of Connected Components in an Undirected Graph (323) ;; DONE

This is traditionally a union find question but we can also use a DFS approach for this.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
    # M1: Ideal, Union Find approach
    class Solution:
        def countComponents(self, n: int, edges: List[List[int]]) -> int:
            parent = list(range(n))
            rank = [0] * len(n) # should init to 0

            def find(x):
                while parent[x] != x:
                    parent[x] = parent[parent[x]]  # path compression
                    x = parent[x]
                return x

            def union(x, y):
                rootX, rootY = find(x), find(y)
                if rank[rootX] < rank[rootY]:
                    parent[rootX] = rootY
                    return
                if rank[rootY] < rank[rootX]:
                    parent[rootY] = rootX
                    return
                parent[rootX] = rootY
                rank[rootY] += 1

            for a, b in edges:
                union(a, b)

            return len({find(x) for x in range(n)})

    # M2: BFS approach:
    from collections import defaultdict, deque

    class Solution:
        def countComponents(self, n: int, edges: List[List[int]]) -> int:
            adj = defaultdict(list)
            for a, b in edges:
                adj[a].append(b)
                adj[b].append(a)

            visited = [False] * n

            def bfs(start):
                queue = deque([start])
                # FIX 1 : remember to mark the first entry node as visited
                visited[start] = True
                while queue:
                    node = queue.popleft()
                    for nei in adj[node]:
                        if not visited[nei]:
                            visited[nei] = True
                            queue.append(nei)

            comps = 0
            for node in range(n):
                if not visited[node]:
                    comps += 1
                    bfs(node)
                    return comps

I think it’s alright to put the bfs as a helper or within the function directly, either way works or default to what looks prettier.

Redundant Connection (684) ;; DONE

We traverse through the edges (since these are the connections) and try to union the nodes together. If they’re already joined then that edge we are currently entertaining is an extra edge.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
    def findRedundantConnection(edges):
        n = len(edges)
        parent = [i for i in range(n+1)]

        def find(x):
            while parent[x] != x:
                # set grandparent as parent
                parent[x] = parent[parent[x]]  # path compression
                # check grandparent
                x = parent[x]

            return x

        def union(x, y):
            rootX, rootY = find(x), find(y)

            if rootX == rootY:
                return False  # x and y are already connected

            parent[rootY] = rootX
            return True

        for u, v in edges:
            if not union(u, v):
                return [u, v]

This version is an improvement that considers ranking:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
   class Solution:
       def findRedundantConnection(self, edges: List[List[int]]) -> List[int]:
           # we do a union find
           n = len(edges)
           parent = list(range(n + 1)) # since the nodes are 1-indexed
           rank = [0] * (n + 1)

           def find(u):
               # compress parent
               while parent[u] != u:
                   # path compress to its own parent
                   parent[u] = parent[parent[u]]
                   u = parent[u]

               return u

           def union(x, y):
               rx, ry = find(x), find(y)
               if rx == ry:
                   return False # have been connected before, this is redundant

               if rank[rx] < rank[ry]:
                   parent[rx] = ry
               elif rank[ry] < rank[rx]:
                   parent[ry] = rx
               else:
                   parent[ry] = rx
                   rank[rx] += 1

               return True

           for u,v in edges:
               if not union(u, v):
                   return [u, v]

Number of Provinces (547)

We can choose to solve this in 2 ways: a union find approach or a bfs-based flood fill approach for connectivity tracking.

Here’s the union find approach:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
   class Solution:
       def findCircleNum(self, isConnected: List[List[int]]) -> int:
           """
           We have to just do union find and get the total number of unique parents / roots.

           isConnected is about direct connections. This means direct edges between them.
           We can get started.
           """
           n = len(isConnected)
           parent = list(range(n))
           rank = [0] * n

           def find(x):
               while parent[x] != x:
                   parent[x] = parent[parent[x]]
                   x = parent[x]
               return x

           def union(x, y):
               """
               Only side-effects matter for us here.
               """
               rx, ry = find(x), find(y)
               if rank[rx] < rank[ry]:
                   parent[rx] = ry
               elif rank[ry] < rank[rx]:
                   parent[ry] = rx
               else:
                   parent[ry] = rx
                   rank[rx] += 1

           for a in range(n):
               for b in range(a + 1, n):
                   if isConnected[a][b]:
                       union(a, b)

           return len({find(x) for x in range(n)})

Here’s a BFS approach, which is more performant:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
   from collections import deque
   class Solution:
       def findCircleNum(self, isConnected: List[List[int]]) -> int:
           """
           We can do a connectivity traversal using a BFS flood fill approach, then we can just count the number of islands  / provinces.
           """
           n = len(isConnected)
           provinces = 0
           visited = [0] * n # 0: univisited, 1: visited

           # now we run the BFS originating from any unvisited:
           for i in range(n):
               if not visited[i]:
                   provinces += 1
                   queue = deque([i])
                   visited[i] = 1
                   while queue:
                       node = queue.popleft()
                       for nei in (nei for nei in range(n) if isConnected[node][nei] and not visited[nei]):
                           queue.append(nei)
                           visited[nei] = 1
           return provinces

Graph Construction & Cloning #

Build or reconstruct graph structure from input data (edge lists, matrices)

Problems #

Clone Graph (133)

This is just a traversal (any kind will do) and keep track of a mapping between old and new nodes, that way we build a whole new graph with the correct node values. Also remember that this cloning only pertains to the nodes, the edges are just a happy side-effect.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
    from collections import deque
    from typing import Optional

    class Solution:
        def cloneGraph(self, node: Optional['Node']) -> Optional['Node']:
            if not node:
                return None

            queue = deque([node])
            old_to_new = {node: Node(val=node.val)}

            while queue:
                cur = queue.popleft()
                for nei in cur.neighbors:
                    if nei not in old_to_new:
                        old_to_new[nei] = Node(val=nei.val)
                        queue.append(nei)
                    old_to_new[cur].neighbors.append(old_to_new[nei])

            return old_to_new[node]

    # Recursive DFS solution for the kicks and giggles:
    class Solution:
        def cloneGraph(self, node: 'Node') -> 'Node':
            if not node:
                return None
            old_to_new = {}
            def dfs(n):
                if n in old_to_new:
                    return old_to_new[n]
                copy = Node(n.val)
                old_to_new[n] = copy
                for nei in n.neighbors:
                    copy.neighbors.append(dfs(nei))
                return copy
            return dfs(node)

Grid as Graph Modeling #

Represent problems on grids as graph problems focusing on connectivity, traversal, or shortest path.

Problems #

Rotting Oranges (994) ;; DONE

This is a time-simulation of a flood-fill, where the current hop is the time elapsed.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
    from collections import deque
    class Solution:
        def orangesRotting(self, grid: List[List[int]]) -> int:
            ROWS, COLS = len(grid), len(grid[0])
            queue = deque()
            fresh = 0
            # accumulate starts points for the BFS start as well as the aux variable, fresh
            for r in range(ROWS):
                for c in range(COLS):
                    if grid[r][c] == 2:
                        queue.append((r, c, 0))
                    elif grid[r][c] == 1:
                        fresh += 1
            time = 0
            while queue:
                r, c, t = queue.popleft()
                for dr, dc in [(-1,0),(1,0),(0,-1),(0,1)]:
                    nr, nc = r+dr, c+dc
                    if 0 <= nr < ROWS and 0 <= nc < COLS and grid[nr][nc] == 1:
                        grid[nr][nc] = 2
                        fresh -= 1
                        queue.append((nr, nc, t+1))
                        time = t+1
            return -1 if fresh > 0 else time

Pacific Atlantic Water Flow (417) ;; DONE

This is a reverse-reachability type. It’s also a multisource traversal!

Also remember to mark things as visited ASAP so that we don’t introduce redundancies in the search space.

    from collections import deque

    class Solution:
        def pacificAtlantic(self, heights: List[List[int]]) -> List[List[int]]:
            # early returns for trivial edge case:
            if not heights or not heights[0]:
                return []

            ROWS, COLS = len(heights), len(heights[0])
            DIRS = [(-1,0), (1,0), (0,-1), (0,1)]

            def bfs(starts):
                visited = set(starts)
                queue = deque(starts)
                while queue:
                    r, c = queue.popleft()
                    # direct for loop is better (easier to read) than generator then for loop:
                    for dr, dc in DIRS:
                        nr, nc = r + dr, c + dc
                        # is reachable coordinate:
                        if (
                            0 <= nr < ROWS and
                            0 <= nc < COLS and
                            (nr, nc) not in visited and
                            heights[nr][nc] >= heights[r][c]
                        ):
                            # mark as visited before enqueing!
                            visited.add((nr, nc))
                            queue.append((nr, nc))
                return visited

            pacific_starts = [(0, c) for c in range(COLS)] + [(r, 0) for r in range(ROWS)]
            atlantic_starts = [(ROWS-1, c) for c in range(COLS)] + [(r, COLS-1) for r in range(ROWS)]

            pacific_reach = bfs(pacific_starts)
            atlantic_reach = bfs(atlantic_starts)

            return [ [r, c] for r, c in (pacific_reach & atlantic_reach) ]

Bipartite Graph Testing and Coloring #

Check if a graph can be colored with two colors with no adjacent vertices sharing the same color
Critical for identifying bipartite graphs and detecting odd cycles
When we want to track many:many relations between two dimensions (e.g. actors and movies )
Core algorithms: BFS or DFS based 2-coloring validation

Problems #

Is Graph Bipartite? (785)

We are given an adjacency matrix and asked to find out if it’s a bipartite graph, so we just colour it and attempt early returns on failure cases. DFS works well here, BFS will work too.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
    # DFS graph colouring implementation.
    class Solution:
        def isBipartite(self, graph: List[List[int]]) -> bool:
            # reminders: may be disconnected  ==> just attempt to go through the unvisited ones
            n = len(graph)
            visited = [0] * n # value meaning: 0: unvisited, 1: colour A, 2: colour B

            # do an iterative DFS for each unvisited:
            for idx in range(n):
                if visited[idx] != 0: # visited already, carry on:
                    continue
                stack = [idx]
                visited[idx] = 1 # default colour?

                while stack:
                    node_idx = stack.pop()
                    node_colour = visited[node_idx]

                    for nei in graph[node_idx]:
                        # case 1: same colour, violation:
                        if node_colour == visited[nei]:
                            return False
                        # case 3: uncoloured, start colouring
                        if visited[nei] == 0:
                            visited[nei] = 1 if node_colour == -1 else -1
                            stack.append(nei)
                        # case 2: different , carry on, do nothing

            return True

    # BFS implementation
    from collections import deque

    class Solution:
        def isBipartite(self, graph: List[List[int]]) -> bool:
            n = len(graph)
            color = [0] * n # 1: colour A, -1: colour B, 0: uncoloured

            for start in range(n):
                if color[start] != 0:
                    continue
                queue = deque([start])
                color[start] = 1

                while queue:
                    node = queue.popleft()
                    for nei in graph[node]:
                        if color[nei] == color[node]:
                            return False
                        if color[nei] == 0:
                            color[nei] = -color[node]
                            queue.append(nei)
            return True

Possible Bipartition (886)

This is a simpler version of the typical incompatibility graph colouring kind of question. We just colour and be happy about it, if there’s an attempt to colour something that has already been coloured, then we know that it’s not possible to have a bipartition. This is an undirected graph, so we pretend it’s two directed edges per undirected edge.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38

    # BFS check for bipartition (via graph coloring)
    from collections import defaultdict, deque

    class Solution:
        def possibleBipartition(self, n: int, dislikes: List[List[int]]) -> bool:
            # build adj list, disliker ==> disliked
            adj = defaultdict(list)
            for a, b in dislikes:
                adj[a].append(b)
                adj[b].append(a)

            # visited nodes, 0: unvisited, 1: colour A, -1: colour B
            visited = [0] * (n + 1) # just makes it easy to handle 1-indexing

            for start_idx in range(1, n + 1):
                if visited[start_idx]:
                    continue

                queue = deque([start_idx])

                # colour start:
                visited[start_idx] =  1

                while queue:
                    curr = queue.popleft()
                    curr_color = visited[curr]

                    # colour neighbours:
                    for nei in adj[curr]:
                        if visited[nei] == curr_color:
                            return False

                        if not visited[nei]: # uncoloured
                            visited[nei] = 1 if curr_color == -1 else -1
                            queue.append(nei)

            return True

SOE #

Concentration errors:
- Forget to mark nodes as visited causing infinite loops
- forget to add elements into aux ds-es
TLE issues happen because of unacceptable search spaces:
- when we don’t mark a cell as visited BEFORE adding it into the aux ds (stack / queue) because we introduce redundant things in the search space.
Previous misconceptions:
1. Not every DAG is a tree, a DAG is a superset of a tree. Therefore, to do things like tree-validity checks, we can’t do Kahn’s algo topo-sorting and all that.
Comprehension errors from not slowing down and analysing the question properly
- Not handling return types for trivial vs error cases properly. They might be different (e.g. empty vs impossible grid)
- It seems that in most questions, we have to slow down and ask the basic questions. E.g. if graph, what is a node, what is an edge. If directed edge, what does a \(\rightarrow\) b mean.
  this is necessary for the correct parsing of our question information. Else we might make silly errors.
  Take the Kahn’s Algo from the “Course Schedule I” as an example.
  As usual, we need to care about the direction of the edge within the DAG. In this, the topo sort a \(\rightarrow\) b means that we do a then we do b. Therefore, if the prereq is given such that a \(\rightarrow\) b meaning that b needs to be done BEFORE a, then the adjacency list we build (which represents the graph) should be b \(\rightarrow\) a since b has to be taken before a is taken.
  the learning here is that as usual, keep asking about the basics “what’s the source, what’s the destination, what is the adj list for (for topo) and what is the indegrees for (for guiding our order of traversal)”
Incorrect calculation of entry/exit indegrees in topo sort
Union-find path compression and rank updates errors
For Topo sorting, remember that the adjacency list is built such that a \(\rightarrow\) b means that we do a first then b. Don’t mistake the directionality of prerequisites form the directionality of our topo sorting / graph represented as an adj list.
Forgetting to do cycle detection on graphs (e.g. when we apply kahn’s for topo sorting or other things) don’t just assume that it’s acyclic.

when adding in 2 directed edges for an undirected graph’s adj list, there’s a need to check if we need to do an explicit parent tracking or not. This matters especially in the case when we do DFSes. Here’s an example of when we do this:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
    from collections import defaultdict

    class Solution:
        def validTree(self, n: int, edges: List[List[int]]) -> bool:
            # early return based on math: there should be n - 1 edges for a tree with n nodes
            if len(edges) != n - 1:
                return False    # must be exact number of edges

            adj = defaultdict(list)
            for a, b in edges:
                adj[a].append(b)
                adj[b].append(a)

            visited = set()
            stack = [(0, -1)]  # (current, parent)

            while stack:
                node, parent = stack.pop()
                if node in visited:
                    return False    # cycle detected
                visited.add(node)
                for nei in adj[node]:
                    if nei == parent:
                        continue    # don't backtrack
                    stack.append((nei, node))

            return len(visited) == n

Union find boiler plate’s find implementation can be recursively or iteratively written, don’t mix the two.

Wrong:

1
2
3
4
5
    def find(x):
        """Wrong because this mixes both while loop and a recursive call to find()."""
        while parent[x] != x:
            parent[x] = find(parent[x])
        return parent[x]

instead, either of these approaches would work:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
         def find(self, x: int) -> int:
             """
             Find the root of x with path compression.
             This implementation of the find function is recursive in nature for the path compression.
             """
             if self.parent[x] != x:
                 # Path compression: set parent[x] directly to the root
                 self.parent[x] = self.find(self.parent[x])
             return self.parent[x]

         def find_(self, x: int) -> int:
             """
             Iterative implementation for the find function with path compression:
             """
             while parent[x] != x:
                 parent[x] = parent[parent[x]]
                 x = parent[x]

            return x

Decision Flow #

Need connectivity components? → DFS/BFS or Union-Find
Find orderings under dependencies? → Topological sort
Find shortest path unweighted? → BFS

Style, Tricks and Boilerplate #

Boilerplate #

Union Find, Disjoint Sets #

For parent and rank tracking, we could use a dictionary as well if we might have different ways to track a single element (e.g. a tuple of (row, col) for coordinates in a grid).

essential optimisations are the path compression step when implementing the find function and the rank-based “merging”.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
   class UnionFind:
       def __init__(self, n: int):
           # Initially, each node is its own parent and its rank (size) is 1
           self.parent = [i for i in range(n)]
           self.rank = [1] * n  # can also track size instead of rank (then just put 0)

       def find(self, x: int) -> int:
           """
           Find the root of x with path compression.
           This implementation of the find function is recursive in nature for the path compression.
           """
           if self.parent[x] != x:
               # Path compression: set parent[x] directly to the root
               self.parent[x] = self.find(self.parent[x])
           return self.parent[x]

       def find_(self, x: int) -> int:
           """
           Iterative implementation for the find function with path compression:
           """
           while parent[x] != x:
               parent[x] = parent[parent[x]]
               x = parent[x]

          return x

       def union(self, x: int, y: int) -> bool:
           """
           Union the sets containing x and y.
           Returns True if merged, False if already in the same set.
           """
           rootX = self.find(x)
           rootY = self.find(y)

           if rootX == rootY:
               return False  # already connected

           # Union by rank (attach smaller tree to bigger tree)
           if self.rank[rootX] < self.rank[rootY]:
               self.parent[rootX] = rootY
           elif self.rank[rootX] > self.rank[rootY]:
               self.parent[rootY] = rootX
           else:
               self.parent[rootY] = rootX
               self.rank[rootX] += 1

           return True

Bipartite checks #

in this example, we treat it as a graph colouring problem with 2 colours and that’s sufficient for us, here’s a DFS approach:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
class Solution:
    def isBipartite(self, graph: List[List[int]]) -> bool:
        """
        This works for undirected graphs and also unconnected graphs because we start DFS from every node possible.
        """
        color = [0] * len(graph)  # Map node i -> odd=1, even=-1, 0 => unvisited
        stack = []

        for i in range(len(graph)):
            if color[i] != 0: # already coloured
                continue
            color[i] = -1 # set initial state, consider it as a source for the DFS

            stack.append(i)
            while stack: # DFS through
                node = stack.pop()
                for nei in graph[node]:
                    if color[node] == color[nei]: # clashing colours
                        return False
                    elif not color[nei]: # uncoloured, time to colour it
                        stack.append(nei)
                        color[nei] = -1 * color[node] # alternative it

        return True

Cycle Detection #

key problem to solve is: do I have a cycle in my graph?

Directed Graph Cycle detection

DFS with Recursion Stack Approach (graph coloring)

To check if there’s a cycle, we need to consider a general cycle (more than just immediate link-back).
This means that we’d need some way to encode 3 states for nodes.
We can encode (colour) them in any way we want, just need 3 states:
- Unvisited
- Visiting (currently, in this branch exploration)
- Visited (when leaving it)
During DFS, if you encounter a node marked as Visiting in the current path, a cycle exists.

Course Schedule (207) offers an opportunity to solve this via colouring:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
 # M1: GRAPH COLORING DFS APPROACH (3-colour graph (0: unvisited: 1: visiting 2: visited))
 class Solution:
     def canFinish(self, numCourses: int, prerequisites: List[List[int]]) -> bool:
         # list of lists, access using relative idx
         adj = [[] for _ in range(numCourses)]

         # careful: we are ultimately trying to clear the topo sort, so we are building edge from a -> b if a is a prereq of b
         for course, pre in prerequisites:
             adj[pre].append(course)

         # single visited with 3 states, space optimisation!
         visited = [0] * numCourses  # 0 = unvisited, 1 = visiting, 2 = visited

         def dfs(node):
             if visited[node] == 1:  # Cycle!
                 return False

             if visited[node] == 2: # done state, can return early
                 return True

             visited[node] = 1
             for nei in adj[node]:
                 if not dfs(nei):
                     return False

             visited[node] = 2

             return True

         for i in range(numCourses):
             if not dfs(i):
                 return False
         return True

Coloring / Mark and Check Approach
This is the DFS with Recursion stack, just that the mental model we’re using here is that we’re colouring nodes to signal the 3 states.
Purely just a framing of the approach here.

Kahn’s Algorithm (BFS, topo sorting)

If not all nodes are processed (cannot remove all nodes with in-degree 0), the remaining nodes form a cycle.

The key idea here is that a Topo sort is only viable when we have a DAG, that’s why if not everything is processed, there must be a cycle.

We’ve seen this used in Course Schedule I (207)

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
 from collections import defaultdict, deque

 class Solution:
     def canFinish(self, numCourses: int, prerequisites: List[List[int]]) -> bool:
         # Build adjacency list and in-degree array
         adj = defaultdict(list)
         # in-degree: For each node, count how many edges point to it = how many prerequisites it has.
         in_degree = [0] * numCourses

         for course, pre in prerequisites:
             adj[pre].append(course)      # prereq → course (edge creation)
             in_degree[course] += 1       # course needs one more prereq

         # 1. Find all courses with no prerequisites. Any element in the queue is "ready to be taken"
         queue = deque(i for i in range(numCourses) if in_degree[i] == 0)
         visited = 0                      # How many courses we've "finished"

         # 2. Process all courses "ready" to be taken
         while queue:
             node = queue.popleft()
             # count as finished / visited
             visited += 1

             for neighbor in adj[node]:   # All courses that depend on this one
                 in_degree[neighbor] -= 1 # One prereq is now done!
                 if in_degree[neighbor] == 0: # this neighbor is ready to be taken, we add it into the queue
                     queue.append(neighbor)

         # 3. If we managed to "take" all courses, there is no cycle!
         return visited == numCourses

Undirected Graph Cycle Detection
- Union-Find Approach Disjoint Set Union
  - Each node starts in its own set.
  - For every edge (to be “added”), check if the two ends have the same parent:
    - If yes, adding the edge would create a cycle.
    - If no, unite their sets (union them).
  - Note: Standard Union-Find does not directly apply to directed graph cycle detection.
- Back-Edge Checking Using Parent Pointers (DFS, Undirected Graphs)
  - Track the parent node on DFS traversal.
  - If you REVISIT a neighbor that is not the parent of the current node, you have a cycle.

Style #

I like the idea of defining ROWS, COLS, DIRECTIONS at the top and consistently making references to that.
Naming preferences:
- nei and neis for “neighbour”, “neighbours”
- careful about not mixing up american and british spellings
Usually time simulations involve a tick-by-tick approach for which BFS is usually a good fit.
when we have to deal with 1-index cases, we can just do dp-like inits with (n + 1) and ignore the 0th index to handle this. That is easier than trying to morph the indices and such.

Tricks #

Traversal direction should be something that we should take a moment to think about. In cases such as the “Pacific Atlantic Water Flow” we realise that the better approach is to go from outwards to in (instead of the actual flow of the water). This is similar to the point where we should consider simpler, complementary approaches to the problems.
This rings true for ANY graph question really because we should always think about how to represent the problem in the best way (what’s the best representation for nodes and edges…).
TRICK: Memory hack: reuse input grids / spaces instead of spawning new ones.
we can reuse the input grid for tracking (e.g. in the floodfill approach for “Number of Islands” problem)
TRICK: reduce search space ASAP
to reduce search spaces, we can ensure that any children added to queues / stacks get marked as visited as we insert into the ds, rather than when we retrieve from the DS.
TRICK: to avoid last-scans for completion, use an aux counter number that we can decrement. IF fully decremented, then the overall state is complete, else it’s an incomplete state.
e.g. Rotting Oranges problem.
TRICK: intermediate safe states while doing the main job then having a formatting area of the code (like “Surround Regions (130)”)
We can mark intermediate safe states to help us. Here’s it’s the ‘*’ that we temporarily tag.
It’s interesting how there’s an intermediate marking state which we then change it back once we have our answers.

we could do topo sorting on a combined supergraph, such as in “Sort Items by Groups Respecting Dependencies (1203)”.

What we did here is that we created a whole new graph called G with it’s indegree. The nodes within these graph were from groups / items and it didn’t matter because we treated them as the same. This is what steps 3 and 4 are about in the solution below.

Then we ran the same Kahn’s Algo (step 5) and checked for cycles. Finally we just filtered out the actual nodes (ignoring the groups).

The supergraphs solution is just tedius, but not impossible to come up with.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
    from collections import defaultdict, deque
    from typing import List

    class Solution:
        def sortItems(self, n: int, m: int, group: List[int],
                      beforeItems: List[List[int]]) -> List[int]:
            """
            Single-graph approach for 1203.
            Instead of 2 separate topo sorts (items, then groups),
            we build one unified DAG containing:
​                - virtual group nodes
​                - item nodes
            and run ONE topological sort on all of them.
            """

            # --- STEP 1: Assign unique group IDs to ungrouped items ---
            # Problem states group[i] == -1 if no group. We give these items
            # their own 'singleton' pseudo-group to enforce adjacency.
            new_group_id = m
            for i in range(n):
                if group[i] == -1:
                    group[i] = new_group_id
                    new_group_id += 1
            total_groups = new_group_id  # updated count after assignment

            # --- STEP 2: Node indexing scheme ---
            # 0..n-1       → item nodes
            # n..n+total_groups-1 → virtual group nodes
            def group_node(g_id: int) -> int:
                """ Map a group ID to its virtual node index in the graph """
                return n + g_id

            # --- Data structures for graph + indegree ---
            G = defaultdict(list)   # adjacency list
            indeg = defaultdict(int)  # indegree dictionary

            # Initialize indegrees to zero for all nodes
            all_nodes = list(range(n)) + [group_node(g) for g in range(total_groups)]
            for node in all_nodes:
                indeg[node] = 0

            # --- STEP 3: Connect each group node to the items it contains ---
            # This ensures items of the same group appear *after* the group's
            # node in topo order, making them contiguous in the final output.
            for item in range(n):
                gnode = group_node(group[item])
                G[gnode].append(item)
                indeg[item] += 1

            # --- STEP 4: Add edges for dependencies ---
            # For every beforeItems[i], item i must appear AFTER prev_item if they're from the same group
            # if they're from different groups then the ordering reflects the ordering of their groups.
            for item in range(n):
                for prev_item in beforeItems[item]:
                    if group[item] == group[prev_item]:
                        # Same group → direct dependency between items
                        G[prev_item].append(item)
                        indeg[item] += 1
                    else:
                        # Different groups → dependency between group nodes
                            G[group_node(group[prev_item])].append(group_node(group[item]))
                       indeg[group_node(group[item])] += 1

            # --- STEP 5: Kahn's Algorithm (BFS Topological Sort) ---
            q = deque([node for node in all_nodes if indeg[node] == 0])
            topo_order = []

            while q:
                u = q.popleft()
                topo_order.append(u)
                for v in G[u]:
                    indeg[v] -= 1
                    if indeg[v] == 0:
                        q.append(v)

            # --- STEP 6: Check for cycles ---
            # A cycle anywhere (items or groups) means no valid order exists.
            if len(topo_order) != len(all_nodes):
                return []

            # --- STEP 7: Extract only items from topo_order ---
            # Group nodes were just helpers to enforce contiguous grouping.
            result = [node for node in topo_order if node < n]
            return result

Topic 11: Graphs

Canonical Question Types #

Graph Traversal (DFS & BFS) #

Problems #

Shortest Path (Unweighted Graph) #

Problems #

Topological Sort & Cycle Detection #

Problems #

Union-Find Disjoint Sets #

Problems #

Graph Construction & Cloning #

Problems #

Grid as Graph Modeling #

Problems #

Bipartite Graph Testing and Coloring #

Problems #

SOE #

Decision Flow #

Style, Tricks and Boilerplate #

Boilerplate #

Union Find, Disjoint Sets #

Bipartite checks #

Cycle Detection #

Style #

Tricks #

TODO KIV #

TODO questions from existing canonicals #

[ ] Traversal::Teleportation family #

TODO new canonicals #

[ ] Edit distance canonicals #