Topic 8: Tries

Table of Contents

Canonical Question Types #

Trie Implementation & Basic Operations #

This is just about basic trie data structure operations
- Insert, search, and delete words in trie

Problems #

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
    class TrieNode:
        def __init__(self):
            self.children = {} # remember it's an edge-check we aren't really interested in storing a collection of nodes.
            self.terminal = False # or we can directly keep the word here to avoid needing to make a root to node walk for the cumulative value.

    class Trie:
        def __init__(self):
            self.root = TrieNode()

        def insert(self, word):
            node = self.root
            for ch in word:
                if ch not in node.children:
                    node.children[ch] = TrieNode()
                node = node.children[ch]
            node.terminal = True

        def search(self, word):
            node = self._traverse(word)
            return node is not None and node.terminal

        def startsWith(self, prefix):
            return self._traverse(prefix) is not None

        def _traverse(self, string):
            """
            This is a root to node path-walking.
            """
            node = self.root
            for ch in string:
                if ch not in node.children:
                    return None
                node = node.children[ch]
            return node

Wildcard Matching / Add & Search Word #

In general, this will affect how we do searching but the same prefix tree is what gets used.
Search with wildcards (’.’) and pattern matching

Problems #

Add & Search Word (211)

We can see this as a DFS with wildcard matching helping us out for the candidate selection. It’s not that complicated if we calm down about it.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
    from collections import defaultdict

    class WordDictionary:
        def __init__(self):
            self.root = defaultdict(dict)
            self.terminal = "#"

        def addWord(self, word: str) -> None:
            node = self.root
            for ch in word:
                if ch not in node:
                    node[ch] = defaultdict(dict)
                node = node[ch]
            node[self.terminal] = True

        def search(self, word: str) -> bool:
            def dfs(i, node):
                if i == len(word):
                    return self.terminal in node
                ch = word[i]
                if ch == ".":
                    return any(
                        dfs(i + 1, next_node) for key, next_node in node.items() if key != self.terminal
                    )
                if ch not in node:
                    return False

                return dfs(i + 1, node[ch])

            return dfs(0, self.root)

Alternatively, I had this approach, though I think the DFS approach is better because there’s no need to declare a custom TrieNode class of our own here.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
    class Node:
        def __init__(self, val=None, terminal=False):
            self.children = dict()
            self.val = val
            self.terminal = terminal

    class WordDictionary:
        def __init__(self):
            self.root = Node()
            self.wild = "."

        def addWord(self, word: str) -> None:
            ptr = self.root
            for char in word:
                if char not in ptr.children:
                    ptr.children[char] = Node(val=char)
                ptr = ptr.children[char]
            # terminate:
            ptr.terminal = True

            return
    def search(self, word: str) -> bool:
        options = [self.root]

        for idx, char in enumerate(word):
            next_options = []

            for node in options:
                if char == self.wild:
                    # Try all possible children
                    next_options.extend(node.children.values())
                else:
                    if char in node.children:
                        next_options.append(node.children[char])

            # No valid paths
            if not next_options:
                return False

            options = next_options

        # After processing all characters, check terminal flag
        return any(node.terminal for node in options)

TODO Implement Magic Dictionary (676)

Multiword Search (Word Search with Trie) #

here the objective is to use a try for a better representation of the search space. We can use tries as supporting, aux datastructures to optimise other problems.
In these cases, we should remember to consider “turning off” a terminal node once found to avoid infinite recursion / double counting cases.
- Use trie to speed up multiple word searches in 2D grids

Problems #

Word Search II (212)

This requires us to find so many words, we want to prune our actions by realising that we can avoid the repeated searches for the common prefixes over and over. So instead of a list, we preprocess as a trie and direct our efforts better. The rest is just DFS, guided by the existence in the trie.

We also see another optimisation where we can “prune the trie” after finding out a word, we need to effectively off it so that we don’t double count it again.

Also, we can ignore more dead-ends by removing the node if the child has no children.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
    from collections import defaultdict

    class Node:
        def __init__(self):
            self.children = defaultdict(Node)
            self.word = None

    class Solution:
        def findWords(self, board: List[List[str]], words: List[str]) -> List[str]:
            ROWS, COLS = len(board), len(board[0])
            DIRS = [(1, 0), (-1, 0), (0, 1), (0, -1)]
            VISITED = "*"

            res = []

            # we build the trie:
            root = Node()
            for w in words:
                ptr = root
                for c in w:
                    ptr = ptr.children[c]
                ptr.word = w

            def backtrack(r, c, node): # this is more of a DFS rather than "backtracking"
                cell = board[r][c]

                if cell not in node.children:
                    return

                child = node.children[cell]
                if child.word:
                    res.append(child.word)
                    child.word = None # set it off

                    # OPTIMISATION: remove if the child has no children, which prunes dead-ends:
                    if not child.children:
                        node.children.pop(cell)

                board[r][c] = VISITED

                for row, col in ((r + dr, c + dc) for dr, dc in DIRS if  (
                    0 <= r + dr < ROWS and 0 <= c + dc < COLS and board[r + dr][c + dc] != VISITED
                    )):
                    # choose it:
                    backtrack(row, col, child)

                board[r][c] = cell

                return

            for row, col in ((r, c) for r in range(ROWS) for c in range(COLS)):
                backtrack(row, col, root)

            return res

TODO Replace Words (648)

Autocomplete & Prefix Counting #

Count words with given prefix, autocomplete suggestions (extension)

Problems #

Prefix counting and autocomplete implementations
TODO Search suggestion systems (1268)
TODO Design Search Autocomplete System (642) (paid qns)

Longest Common Prefix & Suffix Queries #

Use tries (possibly reversed) to find longest common prefixes or suffixes among a set of strings efficiently.

Problems #

Longest Common Prefix (classic variations)

Longest Common Prefix (14)

My first-reach solution was to do a try-except as part of the main logic flow, which is kind of unideal.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
      class Solution:
          def longestCommonPrefix(self, strs: List[str]) -> str:
              if any(not s for s in strs):
                  return ""

              idx = 0
              while True:
                  try:
                      if len(set(s[idx] for s in strs)) == 1:
                          idx += 1
                      else:
                          return strs[0][:idx]

                  except:
                      return strs[0][:idx]

Should try not to use try-excepts as part of the primary logic.

In this case, we could have implemented a vertical scanning approach like so:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
      class Solution:
          def longestCommonPrefix(self, strs: List[str]) -> str:
              if not strs:
                  return ""
              for i in range(len(strs[0])):
                  char = strs[0][i]
                  for s in strs[1:]:
                      if i == len(s) or s[i] != char:
                          return strs[0][:i]
              return strs[0]

As trie-based solution is a little overkill, but here’s the solution for didactic purposes:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
      class TrieNode:
          def __init__(self):
              self.children = {}
              self.is_end_of_word = False

      class Trie:
          def __init__(self):
              self.root = TrieNode()

          def insert(self, word):
              node = self.root
              for char in word:
                  if char not in node.children:
                      node.children[char] = TrieNode()
                  node = node.children[char]
              node.is_end_of_word = True

          def longest_common_prefix(self):
              prefix = []
              node = self.root
              # Traverse while node has exactly one child and is not end of word
              while node and not node.is_end_of_word and len(node.children) == 1:
                  char = next(iter(node.children))  # get the single child key
                  prefix.append(char)
                  node = node.children[char]
              return "".join(prefix)


      class Solution:
          def longestCommonPrefix(self, strs: list[str]) -> str:
              if not strs:
                  return ""
              trie = Trie()
              for word in strs:
                  if word == "":
                      return ""
                  trie.insert(word)
              return trie.longest_common_prefix()

TODO short encoding of words (820) Shortest Unique Prefix for every word 820
TODO longest word in dictionary (720)

Distinct Substrings / Subsequence Count #

Use tries (or suffix tries/trees) to count distinct substrings or perform substring query-related tasks.

Problems #

Count distinct substrings of a string
Number of different sub-strings (variations in suffix trie)

IP Routing / Prefix Matching (Specialized Tries) #

Tries are used to implement longest prefix matching in networking (e.g., Routing tables).

Problems #

Not common on Leetcode, but classic applications in system design.

Compression & Dictionary Encoding #

Tries can be used for compression tasks like dictionary compression or encoding sets of strings.

Problems #

Remove Duplicate Letters (lex order variant that uses trie conceptually)
Optimal prefix codes (Huffman coding, conceptually related)
TODO String Compression (443)

⭐️ Bitwise or Binary Tries (for Integers) #

Binary tries store integers bitwise to answer max XOR queries, or minimum/maximum XOR computations.

Problems #

TODO Maximum XOR of Two Numbers in an Array (421)
TODO HARD Maximum XOR With an Element From Array (1707)

Dynamic Updates & Modification in Trie #

Insertions and deletions in dynamic sets of strings with efficient query support.

Problems #

Online dictionary updates with prefix queries
Implement data structure supporting add and remove of strings with prefix queries

SOE #

Mental Model issues:
1. seeing tries as a collection of children nodes instead of a map of maps that holds edges (correct)
Implementation:
1. NOT making the most out of using map of maps for this \(\implies\) easiest to just use nested maps here
2. Forgetting to mark end-of-word in nodes
Overusing memory leading to blowup (high branching factor)
Poor handling of edge cases such as empty strings or duplicate inserts

Decision Flow #

Need prefix query or inserts? → Trie basic operations
Wildcard or pattern search? → DFS combined with trie
Multiple word queries over grids? → Trie + backtracking
Large dictionaries? → Optimize with compressed tries or suffix trees

Style, Tricks, Boilerplate #

Style #

I prefer using a dict instead of children array for this, the key itself can be the edge label and the value can be the TrieNode.
encode the terminus correctly, a flag or the actual work works well. If actual word, we can just get it directly without having to do the whole root to node walk.

Trick #

for tries being used as a input space, remember to prune when necessary e.g. in the Word Search II problem.

Boilerplate #

Classic trie implementation. Remember that it’s going to end up

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
 class TrieNode:
     def __init__(self):
         self.children = {}
         self.terminal = False # or we can directly keep the word here

 class Trie:
     def __init__(self):
         self.root = TrieNode()

     def insert(self, word):
         node = self.root
         for ch in word:
             if ch not in node.children:
                 node.children[ch] = TrieNode()
             node = node.children[ch]
         node.terminal = True

     def search(self, word):
         node = self._traverse(word)
         return node is not None and node.terminal

     def startsWith(self, prefix):
         return self._traverse(prefix) is not None

     def _traverse(self, string):
         node = self.root
         for ch in string:
             if ch not in node.children:
                 return None
             node = node.children[ch]
         return node

Also realise that if we don’t need to have a dedicated class for this, we could have chosen to just use a deeply-nested dictionary for this instead of a dedicate TrieNode class.

Topic 8: Tries

Canonical Question Types #

Trie Implementation & Basic Operations #

Problems #

Wildcard Matching / Add & Search Word #

Problems #

Multiword Search (Word Search with Trie) #

Problems #

Autocomplete & Prefix Counting #

Problems #

Longest Common Prefix & Suffix Queries #

Problems #

Distinct Substrings / Subsequence Count #

Problems #

IP Routing / Prefix Matching (Specialized Tries) #

Problems #

Compression & Dictionary Encoding #

Problems #

⭐️ Bitwise or Binary Tries (for Integers) #

Problems #

Dynamic Updates & Modification in Trie #

Problems #

SOE #

Decision Flow #

Style, Tricks, Boilerplate #

Style #

Trick #

Boilerplate #

TODO KIV #

Existing Canonicals #

⭐️binary tries:: TODO Maximum XOR of Two Numbers in an Array (421) #

⭐️binary tries:: TODO HARD Maximum XOR With an Element From Array (1707) #

dynamic modification in trie:: TODO online stock span (901) #

wildcard matching:: TODO Implement Magic Dictionary (676) #

longest common prefix/suffix queries: TODO short encoding of words (820) #

Multiword Search:: TODO Replace Words (648) #

autocomplete/prefix counting:: TODO Search suggestion systems (1268) #

autocomplete:: TODO Design Search Autocomplete System (642) (paid qns) #

compression & dictionary encoding: TODO String Compression (443) #