Skip to main content
  1. References/
  2. Algo Reference/
  3. Canonicals/

Topic 8: Tries

··2054 words·10 mins

Canonical Question Types #

Trie Implementation & Basic Operations #

  • This is just about basic trie data structure operations
    • Insert, search, and delete words in trie

Problems #

  1. Implement Trie (208)
     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    
        class TrieNode:
            def __init__(self):
                self.children = {} # remember it's an edge-check we aren't really interested in storing a collection of nodes.
                self.terminal = False # or we can directly keep the word here to avoid needing to make a root to node walk for the cumulative value.
    
        class Trie:
            def __init__(self):
                self.root = TrieNode()
    
            def insert(self, word):
                node = self.root
                for ch in word:
                    if ch not in node.children:
                        node.children[ch] = TrieNode()
                    node = node.children[ch]
                node.terminal = True
    
            def search(self, word):
                node = self._traverse(word)
                return node is not None and node.terminal
    
            def startsWith(self, prefix):
                return self._traverse(prefix) is not None
    
            def _traverse(self, string):
                """
                This is a root to node path-walking.
                """
                node = self.root
                for ch in string:
                    if ch not in node.children:
                        return None
                    node = node.children[ch]
                return node
    

Wildcard Matching / Add & Search Word #

  • In general, this will affect how we do searching but the same prefix tree is what gets used.
  • Search with wildcards (’.’) and pattern matching

Problems #

  1. Add & Search Word (211)

    We can see this as a DFS with wildcard matching helping us out for the candidate selection. It’s not that complicated if we calm down about it.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    
        from collections import defaultdict
    
        class WordDictionary:
            def __init__(self):
                self.root = defaultdict(dict)
                self.terminal = "#"
    
            def addWord(self, word: str) -> None:
                node = self.root
                for ch in word:
                    if ch not in node:
                        node[ch] = defaultdict(dict)
                    node = node[ch]
                node[self.terminal] = True
    
            def search(self, word: str) -> bool:
                def dfs(i, node):
                    if i == len(word):
                        return self.terminal in node
                    ch = word[i]
                    if ch == ".":
                        return any(
                            dfs(i + 1, next_node) for key, next_node in node.items() if key != self.terminal
                        )
                    if ch not in node:
                        return False
    
                    return dfs(i + 1, node[ch])
    
                return dfs(0, self.root)
    

    Alternatively, I had this approach, though I think the DFS approach is better because there’s no need to declare a custom TrieNode class of our own here.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    
        class Node:
            def __init__(self, val=None, terminal=False):
                self.children = dict()
                self.val = val
                self.terminal = terminal
    
        class WordDictionary:
            def __init__(self):
                self.root = Node()
                self.wild = "."
    
            def addWord(self, word: str) -> None:
                ptr = self.root
                for char in word:
                    if char not in ptr.children:
                        ptr.children[char] = Node(val=char)
                    ptr = ptr.children[char]
                # terminate:
                ptr.terminal = True
    
                return
        def search(self, word: str) -> bool:
            options = [self.root]
    
            for idx, char in enumerate(word):
                next_options = []
    
                for node in options:
                    if char == self.wild:
                        # Try all possible children
                        next_options.extend(node.children.values())
                    else:
                        if char in node.children:
                            next_options.append(node.children[char])
    
                # No valid paths
                if not next_options:
                    return False
    
                options = next_options
    
            # After processing all characters, check terminal flag
            return any(node.terminal for node in options)
    
  2. TODO Implement Magic Dictionary (676)

Multiword Search (Word Search with Trie) #

  • here the objective is to use a try for a better representation of the search space. We can use tries as supporting, aux datastructures to optimise other problems.

    In these cases, we should remember to consider “turning off” a terminal node once found to avoid infinite recursion / double counting cases.

    • Use trie to speed up multiple word searches in 2D grids

Problems #

  1. Word Search II (212)

    This requires us to find so many words, we want to prune our actions by realising that we can avoid the repeated searches for the common prefixes over and over. So instead of a list, we preprocess as a trie and direct our efforts better. The rest is just DFS, guided by the existence in the trie.

    We also see another optimisation where we can “prune the trie” after finding out a word, we need to effectively off it so that we don’t double count it again.

    Also, we can ignore more dead-ends by removing the node if the child has no children.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    53
    54
    
        from collections import defaultdict
    
        class Node:
            def __init__(self):
                self.children = defaultdict(Node)
                self.word = None
    
        class Solution:
            def findWords(self, board: List[List[str]], words: List[str]) -> List[str]:
                ROWS, COLS = len(board), len(board[0])
                DIRS = [(1, 0), (-1, 0), (0, 1), (0, -1)]
                VISITED = "*"
    
                res = []
    
                # we build the trie:
                root = Node()
                for w in words:
                    ptr = root
                    for c in w:
                        ptr = ptr.children[c]
                    ptr.word = w
    
                def backtrack(r, c, node): # this is more of a DFS rather than "backtracking"
                    cell = board[r][c]
    
                    if cell not in node.children:
                        return
    
                    child = node.children[cell]
                    if child.word:
                        res.append(child.word)
                        child.word = None # set it off
    
                        # OPTIMISATION: remove if the child has no children, which prunes dead-ends:
                        if not child.children:
                            node.children.pop(cell)
    
                    board[r][c] = VISITED
    
                    for row, col in ((r + dr, c + dc) for dr, dc in DIRS if  (
                        0 <= r + dr < ROWS and 0 <= c + dc < COLS and board[r + dr][c + dc] != VISITED
                        )):
                        # choose it:
                        backtrack(row, col, child)
    
                    board[r][c] = cell
    
                    return
    
                for row, col in ((r, c) for r in range(ROWS) for c in range(COLS)):
                    backtrack(row, col, root)
    
                return res
    
  2. TODO Replace Words (648)

Autocomplete & Prefix Counting #

  • Count words with given prefix, autocomplete suggestions (extension)

Problems #

  1. Prefix counting and autocomplete implementations
  2. TODO Search suggestion systems (1268)
  3. TODO Design Search Autocomplete System (642) (paid qns)

Longest Common Prefix & Suffix Queries #

  • Use tries (possibly reversed) to find longest common prefixes or suffixes among a set of strings efficiently.

Problems #

  1. Longest Common Prefix (classic variations)

    1. Longest Common Prefix (14)

      My first-reach solution was to do a try-except as part of the main logic flow, which is kind of unideal.

       1
       2
       3
       4
       5
       6
       7
       8
       9
      10
      11
      12
      13
      14
      15
      
            class Solution:
                def longestCommonPrefix(self, strs: List[str]) -> str:
                    if any(not s for s in strs):
                        return ""
      
                    idx = 0
                    while True:
                        try:
                            if len(set(s[idx] for s in strs)) == 1:
                                idx += 1
                            else:
                                return strs[0][:idx]
      
                        except:
                            return strs[0][:idx]
      

      Should try not to use try-excepts as part of the primary logic.

      In this case, we could have implemented a vertical scanning approach like so:

       1
       2
       3
       4
       5
       6
       7
       8
       9
      10
      
            class Solution:
                def longestCommonPrefix(self, strs: List[str]) -> str:
                    if not strs:
                        return ""
                    for i in range(len(strs[0])):
                        char = strs[0][i]
                        for s in strs[1:]:
                            if i == len(s) or s[i] != char:
                                return strs[0][:i]
                    return strs[0]
      

      As trie-based solution is a little overkill, but here’s the solution for didactic purposes:

       1
       2
       3
       4
       5
       6
       7
       8
       9
      10
      11
      12
      13
      14
      15
      16
      17
      18
      19
      20
      21
      22
      23
      24
      25
      26
      27
      28
      29
      30
      31
      32
      33
      34
      35
      36
      37
      38
      
            class TrieNode:
                def __init__(self):
                    self.children = {}
                    self.is_end_of_word = False
      
            class Trie:
                def __init__(self):
                    self.root = TrieNode()
      
                def insert(self, word):
                    node = self.root
                    for char in word:
                        if char not in node.children:
                            node.children[char] = TrieNode()
                        node = node.children[char]
                    node.is_end_of_word = True
      
                def longest_common_prefix(self):
                    prefix = []
                    node = self.root
                    # Traverse while node has exactly one child and is not end of word
                    while node and not node.is_end_of_word and len(node.children) == 1:
                        char = next(iter(node.children))  # get the single child key
                        prefix.append(char)
                        node = node.children[char]
                    return "".join(prefix)
      
      
            class Solution:
                def longestCommonPrefix(self, strs: list[str]) -> str:
                    if not strs:
                        return ""
                    trie = Trie()
                    for word in strs:
                        if word == "":
                            return ""
                        trie.insert(word)
                    return trie.longest_common_prefix()
      
  2. TODO short encoding of words (820) Shortest Unique Prefix for every word 820

  3. TODO longest word in dictionary (720)

Distinct Substrings / Subsequence Count #

  • Use tries (or suffix tries/trees) to count distinct substrings or perform substring query-related tasks.

Problems #

  1. Count distinct substrings of a string
  2. Number of different sub-strings (variations in suffix trie)

IP Routing / Prefix Matching (Specialized Tries) #

  • Tries are used to implement longest prefix matching in networking (e.g., Routing tables).

Problems #

  • Not common on Leetcode, but classic applications in system design.

Compression & Dictionary Encoding #

  • Tries can be used for compression tasks like dictionary compression or encoding sets of strings.

Problems #

  1. Remove Duplicate Letters (lex order variant that uses trie conceptually)
  2. Optimal prefix codes (Huffman coding, conceptually related)
  3. TODO String Compression (443)

⭐️ Bitwise or Binary Tries (for Integers) #

  • Binary tries store integers bitwise to answer max XOR queries, or minimum/maximum XOR computations.

Problems #

  1. TODO Maximum XOR of Two Numbers in an Array (421)
  2. TODO HARD Maximum XOR With an Element From Array (1707)

Dynamic Updates & Modification in Trie #

  • Insertions and deletions in dynamic sets of strings with efficient query support.

Problems #

  1. Online dictionary updates with prefix queries
  2. Implement data structure supporting add and remove of strings with prefix queries

SOE #

  • Mental Model issues:
    1. seeing tries as a collection of children nodes instead of a map of maps that holds edges (correct)
  • Implementation:
    1. NOT making the most out of using map of maps for this \(\implies\) easiest to just use nested maps here
    2. Forgetting to mark end-of-word in nodes
  • Overusing memory leading to blowup (high branching factor)
  • Poor handling of edge cases such as empty strings or duplicate inserts

Decision Flow #

  • Need prefix query or inserts? → Trie basic operations
  • Wildcard or pattern search? → DFS combined with trie
  • Multiple word queries over grids? → Trie + backtracking
  • Large dictionaries? → Optimize with compressed tries or suffix trees

Style, Tricks, Boilerplate #

Style #

  1. I prefer using a dict instead of children array for this, the key itself can be the edge label and the value can be the TrieNode.
  2. encode the terminus correctly, a flag or the actual work works well. If actual word, we can just get it directly without having to do the whole root to node walk.

Trick #

  1. for tries being used as a input space, remember to prune when necessary e.g. in the Word Search II problem.

Boilerplate #

Classic trie implementation. Remember that it’s going to end up

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
 class TrieNode:
     def __init__(self):
         self.children = {}
         self.terminal = False # or we can directly keep the word here

 class Trie:
     def __init__(self):
         self.root = TrieNode()

     def insert(self, word):
         node = self.root
         for ch in word:
             if ch not in node.children:
                 node.children[ch] = TrieNode()
             node = node.children[ch]
         node.terminal = True

     def search(self, word):
         node = self._traverse(word)
         return node is not None and node.terminal

     def startsWith(self, prefix):
         return self._traverse(prefix) is not None

     def _traverse(self, string):
         node = self.root
         for ch in string:
             if ch not in node.children:
                 return None
             node = node.children[ch]
         return node
  • Also realise that if we don’t need to have a dedicated class for this, we could have chosen to just use a deeply-nested dictionary for this instead of a dedicate TrieNode class.

TODO KIV #

Existing Canonicals #

⭐️binary tries:: TODO Maximum XOR of Two Numbers in an Array (421) #

⭐️binary tries:: TODO HARD Maximum XOR With an Element From Array (1707) #

dynamic modification in trie:: TODO online stock span (901) #

wildcard matching:: TODO Implement Magic Dictionary (676) #

longest common prefix/suffix queries: TODO short encoding of words (820) #

Multiword Search:: TODO Replace Words (648) #

autocomplete/prefix counting:: TODO Search suggestion systems (1268) #

autocomplete:: TODO Design Search Autocomplete System (642) (paid qns) #

compression & dictionary encoding: TODO String Compression (443) #