Skip to main content
  1. References/
  2. Algo Reference/
  3. Canonicals/

Topic 1: Arrays & Hashing

··3900 words·19 mins

This includes prefix-sum type questions as well

Canonical Questions #

String simulations #

These problems involve doing some operations on strings. The key idea here is that we should be exploiting string-specific APIs that python provides. Other things could be that instead of actually doing the hard operations, it might make sense to do soft simulations of the applications to solve some of these problems.

Problems #

  1. Process String with Special Operations I (3612)

    I have an on optimal solution that works well enough for the constraints by actually carrying out the processes.

    Optimisations would be amortising things (soft deletes, lazy reversal flags) OR some other way to simulate without the memory overheads.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    
       class Solution:
           def processStr(self, s: str) -> str:
               """
               It's a string state management exercise, might need to find ways to do this efficiently.
    
               Okay this works, a better approach is likely to amortise the expensive operations, keep a "is_reversed" flag and then try not to copy over on the reversed operations and just pop and all that when we need, I guess a soft delete would work as well, and we can filter it out at the end when joining the list into a string.
               """
               res = []
               for char in s:
                   if char == "#":
                       res += res
                       continue
                   if char == "%":
                       res = res[::-1]
                       continue
                   if char == "*":
                       res = res[:-1]
                       continue
    
                   res.append(char)
    
               return "".join(res)
    
  2. Process String with Special Operations II (3614)

    Here, we will observe that if we actually carry out the operations, they are too expensive. Therefore, our objective should be to do soft simulations on these operations. This is a reverse simulation that we’d need to do, so we can do a 2 pass approach on this. First one is to find out the final length of the output string, then we will use the inverse of each process and do some logic there.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    
       class Solution:
           def processStr(self, s: str, k: int) -> str:
               """
               This is virtual simulation of the string operations, instead of directly simulating it.
               We won't construct the string and operate on it, we will work backwards, doing the inverse operations.
               We shall do 2 passes:
               1. find the final length, early return if not satisfactory
               2. in reverse order of the input, we carry out inverse operations. we can early return when the char count matches k.
               """
               # first pass: final length calculation
               size = 0
               for c in s:
                   if c.islower():
                       size += 1
                   elif c == "*" and size > 0: # remover, pops 1 out
                       size -= 1
                   elif c == "#": # duplicator
                       size *= 2
    
               # early returns if out of bounds index
               if not (0 <= k < size):
                   return '.'
    
               # 2nd pass, inverse simulation:
               for c in reversed(s):
                   if c.islower():
                       size -= 1
                       # found the char:
                       if k == size:
                           return c
                   elif c == "*": # inverse of poping is pushing
                       size += 1
                   elif c == "#":  # inverse of doubling is halving, but need to be careful if we need to rotate k
                       size //= 2
                       if k >= size: # is in theright half
                           k -= size
                   elif c == "%": # need to flip it:
                       k = size - 1 - k
    
               return "."
    
  3. Sort Vowels in a String (2785)

    This is just a simple string manipulation, we’re just focused on constructing something efficiently. My solution uses a Counter and a generator (iterator) to spit out the next values for vowels, when we come across them. Some useful tips for this would be to directly use sets instead of lists for such direct membership checks with characters.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    
       class Solution:
           def sortVowels(self, s: str) -> str:
               vowels = set("aeiouAEIOU") # set allows for fast membership checks
               sorted_vowels = sorted([ch for ch in s if ch in vowels])
               res = []
               idx = 0
               for ch in s:
                   if ch in vowels:
                       res.append(sorted_vowels[idx])
                       idx += 1
                   else:
                       res.append(ch)
               return "".join(res)
    

Prefix sums / Subarray Sum = K (prefix sum + hashmap) #

  • The key idea for such problems is to rely on preprocessed prefix/suffix arrays to help make local decisions better.

    • These problems may also deal with contiguous segments (subarrays) and searching for some condition to be met.
    • The idea here is that we can create prefix sums and that can be our target that we might want to search for, so it’s also like a 2sum searchability problem
    • Also classic use when we care about range sums and the like.
  • this type of problem is actually also a DP problem. The main reason why is that the order of filling matters for us (so that we avoid double counting).

Problems #

  1. Product of Array Except Self (238)

    This is a classic problem. It’s just a prefix / suffix processing that we need to do. It’s just a 2 pass that we do here. The intuition is that we need to rely on partial running subs from left/right.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    
       class Solution:
           def productExceptSelf(self, nums: List[int]) -> List[int]:
               n = len(nums)
               output = [1] * n
    
               # handle the prefixes first
               prefix = 1
               for i in range(n):
                   output[i] = prefix
                   prefix *= nums[i]
    
               # now handlethe suffixes
               suffix = 1
               for i in range(n - 1, -1, -1):
                   output[i] *= suffix
                   suffix *= nums[i]
    
               return output
    
  2. Trionic Array I (3637)

    This just does a single pass of 3 monotonic regions and ensures other things such as the length of the segments are appropriate. The solution presented below just does a direct sweep for things instead of actually using prefix/suffix sums.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    
       class Solution:
           def isTrionic(self, nums: List[int]) -> bool:
               """
               3 partitions, they are strictly increasing, strictly decreasing, strictly increasing.
    
               consider the boundaries
               a, +, a, -/0,b, -,b, -, b, +/0, c, +, c
               boundaries can be 0 or n
    
               We also realise that we need to form 3 segments, we aren't looking for turning points necessarily, we are looking for 3 segments.
    
               m1 failed: if I just use triples
               """
               n = len(nums)
               if n < 4:
                   return False # can't form 2 segments with less than 4 elements
    
               # 1: find end of strictly increasing segment:
               p = 0
               while p < n - 1 and nums[p] < nums[p + 1]:
                   p += 1
               # check if failure:
               if p == 0:
                   return False
    
               # 2: find end of strictly decreasing:
               q = p
               while q < n - 1 and nums[q] > nums[q + 1]:
                   q += 1
    
               # for us to end, the final segment must have at least one element and q must have moved from p
               if q == p or q == n - 1:
                   return False
    
               # 3: check final segment is strictly decreasing:
               i = q
               while i < n - 1 and nums[i] < nums[i + 1]:
                   i += 1
    
               # i should be n - 1 (reached the end)
               return i == n - 1
    
  3. Trionic Array II (3640)

    This is a nifty use of prefix sums. I think this is one of the rare instances where most of the work is to determine what to precompute.

    We should think of it as a search for ALL the trionic subarrays, which we shall use precomputed prefix sums for. The rough idea is to look for 3 segments (formed by 2 turning points, one peak, one trough). We shall move a pointer representing the middle peak idx, then once we find it, we shall attempt to expand rightward to find the trough then we shall use the precomputed suffix array to get the best right hand side value.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    53
    54
    55
    56
    57
    58
    59
    
       class Solution:
           def maxSumTrionic(self, nums: List[int]) -> int:
               """
               Scan for 3 segments: A: strictly increasing prefix,
               B: strictly decreasing middle, C: strictly increasing suffix.
    
               Precompute for every index:
    ​               - inc_sum_left: max sum of a strictly increasing subarray ending at each position (from the left)
    ​               - inc_sum_right: max sum of a strictly increasing subarray starting at each position (from the right)
    
               The main pass looks for peaks, then greedily extends the decreasing segment as much as possible.
               Then checks that after the decreasing segment, another increase exists -- if so, computes the sum by
               joining the 3 regions.
               """
    
               n = len(nums)
    
               inc_sum_left = [0] * n  # max sum of strictly increasing subarray ending at each i (left-to-right)
               inc_sum_right = [0] * n # max sum of strictly increasing subarray starting at each i (right-to-left)
    
               inc_sum_left[0] = nums[0]
               inc_sum_right[-1] = nums[-1]
    
               # Precompute left-to-right strictly increasing sums (for segment A)
               for i in range(1, n):
                   if nums[i] > nums[i-1]:
                       inc_sum_left[i] = max(inc_sum_left[i-1] + nums[i], nums[i])
                   else:
                       inc_sum_left[i] = nums[i]
    
               # Precompute right-to-left strictly increasing sums (for segment C)
               for i in range(n-2, -1, -1):
                   if nums[i] < nums[i+1]:
                       inc_sum_right[i] = max(inc_sum_right[i+1] + nums[i], nums[i])
                   else:
                       inc_sum_right[i] = nums[i]
    
               i = 0
               ans = float("-inf")
    
               while i < n:
                   # Looking for a peak: left increasing, peak, then decreasing middle
                   if i+2 < n and nums[i] < nums[i+1] and nums[i+1] > nums[i+2]:
                       curr_sum = 0 # curr_sum shall hold the sum of the decreasing segment.
                       j = i+1
                       # Expand the strictly decreasing region (middle segment B)
                       while j+1 < n and nums[j] > nums[j+1]:
                           curr_sum += nums[j]
                           j += 1
    
                       # If another increasing segment exists after the trough
                       if j+1 < n and nums[j] < nums[j+1]:
                           curr_sum += nums[j]
                           trionic_subarray_sum = inc_sum_left[i] + curr_sum + inc_sum_right[j+1]
                           ans = max(ans, trionic_subarray_sum)
    
                   i += 1
    
               return ans
    
  4. Maximum Number of Subsequences After One Inserting (3628)

    We are asked for max number of subsequences wherein the elements don’t have to be contiguous but need to preserve relative rank order. This is what asks us to try doing a prefix / suffix counting approach.

    We keep prefix and suffix partial subsequences COUNTS for every index and then think how to use the insertion. For the insertion, we can insert L, C or T. For L and T there’s a single best case (add to leftmost and rightmost respectively) for C, there’s a need to comb through.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    53
    54
    55
    56
    57
    58
    59
    60
    61
    62
    63
    64
    65
    66
    67
    68
    
       class Solution:
           def numOfSubsequences(self, s: str) -> int:
               """
               For this, the first thought that comes to mind is to consider prefix counts of partial subsequences.
    
               So for each i, we should try count all
    ​           - # of "L"s until that point
    ​           - # of "LC"s until that point
    ​           - # of "LCT"s until that point
    
               These partial counts will be useful for us to get the final values without insertion.
    
               Now we consider the insertion. The question is probably, which partial sum is the highest jump so that we can add it there? Not too sure how to factor in the "at most one insertion" part.
    
               For the insertions, what we insert matters for the cases:
               1. L: should only be inserted at the start since it's the first (for max benefit)
               2. C: should be inserted at any points midway to do the L_T properly
               3. T: should only be inserted at the end (mirrors argument for case 1)
    
               We take the max gain from these 3 choices
               """
    
               n = len(s)
               # precompute prefix counts for L and LC subsequences
               preL = [0] * (n + 1)
               preLC = [0] * (n + 1)
               for i in range(n):
                   preL[i + 1] = preL[i] + (1 if s[i] == 'L' else 0)
                   preLC[i + 1] = preLC[i] + (preL[i] if s[i] == 'C' else 0)
    
               # precompute suffix counts for C, CT pairs, and T (the C and T help us form intermediates to get CT)
               sufC = [0] * (n + 1)
               sufCT = [0] * (n + 1)
               sufT = [0] * (n + 1)
               for i in reversed(range(n)):
                   sufC[i] = sufC[i + 1]
                   sufCT[i] = sufCT[i + 1]
                   sufT[i] = sufT[i + 1]
                   if s[i] == 'T':
                       sufT[i] += 1
                   elif s[i] == 'C':
                       sufC[i] += 1
                       sufCT[i] += sufT[i]
    
               # count initial number of "LCT" subsequences
               countL = countLC = countLCT = 0
               for c in s:
                   match c:
                       case 'L':
                           countL += 1
                       case 'C':
                           countLC += countL
                       case 'T':
                           countLCT += countLC
    
               res = countLCT
    
               # insert L at start: adds all "CT" pairs
               res = max(res, countLCT + sufCT[0])
    
               # insert T at end: adds all "LC" pairs
               res = max(res, countLCT + preLC[n])
    
               # insert C at every position: max over L before * T after
               for i in range(n + 1):
                   res = max(res, countLCT + preL[i] * sufT[i])
    
               return res
    

Sequences #

Problems #

  1. Longest Consecutive Sequence (128): it’s more about the elements forming an consecutive sequence

    this is more about finding out whether we have a “noop ladder” like in ROP hacking. Solution: create a set of nums (\(O(n)\)), then for each num, check if it’s the starting num (( n - 1 ) not in num_set) then just keep climbing up that ladder until no more, keep track of max length. Remember, only try to build sequences from the start of a sequence.

    This is such a simple way of looking at the problem, our first reach should NOT be some kind of interval merging or something. The input variables already hint that it’s a linear sweep of some sort.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    
       class Solution:
           def longestConsecutive(self, nums: List[int]) -> int:
               num_set = set(nums)
               longest = 0
    
               for n in num_set:
                   # if this n is the start of the "noop ladder":
                   if n - 1 not in num_set:
                       length = 1
    
                       # build sequences from a new start:
                       while n + length in num_set:
                           length += 1
    
                       longest = max(longest, length)
    
               return longest
    

NO Dynamic Range Querying using Fenwick Trees (Binary Index Trees) #

  • CLOSING NOTE [2025-12-29 Mon 11:10]
    I stopped prepping for it, I’ll just skip this task until I need to come back to interview prep again
  • this kind of problems deal with needing to do prefix sums and also updating them efficiently. They are also extremely niche and a bunch of hard questions in the more recent leetcode questions only accept such solutions. Typically, we can still achieve partial solutions using DP approaches, but will likely timeout unless fenwick trees are used.

Problems #

  1. Number of Integers with Popcount-Depth Equal to K II (3624)

    The key idea here is that we need to accumulate interval counts of the popcount depths for depths in range [0, 5], so we have 6 different fenwick trees to track this. Querying comes in 2 forms: range queries and update queries and both can be handled in \(O(log n)\) time thanks to the use of fenwick trees.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    53
    54
    55
    56
    57
    58
    59
    60
    61
    
        from typing import List
        from functools import cache
    
        class FenwickTree:
            def __init__(self, n):
                self.n = n
                self.fw = [0]*(n+1)
    
            def update(self, i, delta):
                i += 1
                while i <= self.n:
                    self.fw[i] += delta
                    i += i & (-i)
    
            def query(self, i):
                i += 1
                s = 0
                while i > 0:
                    s += self.fw[i]
                    i -= i & (-i)
                return s
    
            def range_query(self, l, r):
                return self.query(r) - (self.query(l-1) if l > 0 else 0)
    
        class Solution:
            def popcountDepth(self, nums: List[int], queries: List[List[int]]) -> List[int]:
                @cache
                def get_depth(x):
                    depth = 0
                    while x != 1:
                        x = x.bit_count()
                        depth += 1
                    return depth
    
                n = len(nums)
                max_depth = 5
    
                depths = [get_depth(x) for x in nums] # stores the current depths of each num, idxed by the nums idx
                # we build a fenwick tree, each tree will contain the prefix sum for the number of nums with depth = k (fixed) within
                fenwicks = [FenwickTree(n) for _ in range(max_depth+1)]
    
                # Initialize Fenwicks
                for i, d in enumerate(depths):
                    fenwicks[d].update(i, 1)
    
                ans = []
                for query in queries:
                    match query:
                        case 1, l, r, k:
                            num_indices = fenwicks[k].range_query(l, r)
                            ans.append(num_indices)
                        case 2, idx, val:
                            old_depth, new_depth = depths[idx], get_depth(val)
                            if old_depth == new_depth:
                                continue
                            fenwicks[old_depth].update(idx, -1)
                            fenwicks[new_depth].update(idx, 1)
                            depths[idx] = new_depth
    
                return ans
    
  2. Sum of Beautiful Subsequences (3671)

    To be honest, I don’t understand this. The DP version will pass 80% of the test cases, I most likely will end up resorting to that. The fenwick tree solutions, I don’t even have an intuition for it.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    53
    54
    55
    56
    57
    58
    59
    60
    61
    62
    63
    64
    65
    66
    67
    68
    69
    70
    71
    72
    73
    74
    75
    76
    77
    78
    79
    80
    81
    82
    83
    84
    85
    86
    87
    88
    89
    90
    91
    92
    93
    94
    
        # M1: Fenwick Tree solution (correct, passes all)
        class Fenwick:
            def __init__(self, n):
                self.arr = [0] * (n + 1)
    
            def sum(self, i):
                s = 0
                while i > 0:
                    s += self.arr[i]
                    i -= i & (-i)
                return s
    
            def add(self, i, x):
                while i < len(self.arr):
                    self.arr[i] += x
                    i += i & (-i)
    
        class Solution:
            def totalBeauty(self, nums: List[int]) -> int:
                MOD = 10**9 + 7
                max_val = max(nums) + 1
                locs = defaultdict(list)
                for idx, val in enumerate(nums):
                    locs[val].append(idx)
    
                F = [0] * max_val
                for d in range(1, max_val):
                    # indices of all numbers divisible by d
                    indices = sorted(pos for val in range(d, max_val, d) for pos in locs[val])
                    if len(indices) <= 1:
                        F[d] = len(indices)
                        continue
    
                    rank = {pos: r for r, pos in enumerate(indices, 1)}  # rank compress
                    fenw = Fenwick(len(indices))
                    for val in range(d, max_val, d):
                        for pos in reversed(locs[val]):
                            r = rank[pos]
                            new_subseq_count = 1 + fenw.sum(r - 1)
                            F[d] += new_subseq_count
                            fenw.add(r, new_subseq_count)
    
                for d in range(max_val - 1, 0, -1):
                    for multiple in range(d * 2, max_val, d):
                        F[d] -= F[multiple]
                    F[d] %= MOD
    
                return sum((d * F[d]) % MOD for d in range(1, max_val)) % MOD
    
        # M2::PARTIAL: dp solution:
        from math import gcd
        from collections import defaultdict
    
        class Solution:
           def totalBeauty(self, nums: List[int]) -> int:
               """
               I think this is a multi step question as well
    
               First we just find all the subsequences (and the GCDs within them)
    
               Then we combine via the GCD values and get the beauty scores
    
               == steps
               {problem decomposition} ==> naive
               1. enum strictly increasing subsequences in the array
               2. for each subsequence, find its gcd
               3. for each GCD, g, count subsequences with gcd = g
               4. calculate
    
               == key intuition for optimisation:
               1. directly enumerating is costly 2^n, so we have to build up the results.
               2. so we use dp 2D where dp[i][g] is the number of strictly increasing subsequences ending at idx i with gcd g
               """
    
               MOD = (10 ** 9) + 7
               n = len(nums)
               total_beauty = 0
    
               dp = [defaultdict(int) for _ in range(n)] # so dp[i][g] gives number of strictly increasing subsequences ending at index i with gcd g.
    
               for i in range(n):
                   dp[i][nums[i]] = 1 # init value, it's gcd 1 for itself
                   for j in range(i): # left pointer
                       if nums[j] < nums[i]: # then can accumulate more GCDs
                           for g, count in dp[j].items():
                               new_g = gcd(g, nums[i])
                               new_count = dp[i][new_g] + count
                               dp[i][new_g] = new_count % MOD
    
                   # now we calculate beauty values:
                   for g, count in dp[i].items():
                       total_beauty = (total_beauty + (g * count)) % MOD
    
               return total_beauty
    

N-Sum Family #

Problems #

  • N-Sum Family
    • 2-sum is about complement finding, pairwise just keep a seen index and use that when searching for complements further down

TODO Sudoku Family #

  • Sudoku Family
    • n-dimensional array access is the main learning point here. When we access blocks, then for each cell, we can get which block they’re from using block_idx = row_idx // 3, col_idx // 3.

TODO Stock Family #

SOE #

General #

  • Forgetting negatives / zeros in prefix sums \(\implies\) this affects the order of filling / presences of duplicate prefix sums in my list of prefix sums

Jumping the Gun to stdlib thoughts #

  • Off-by-one in subarray loops
  • Jumping straight to “pythonic” stdlib uses is NOT good. Just think about the minimal implementation then on the second look at it, think of stdlib uses. Need to avoid cases where we interpret the question wrongly. E.g. for “Top K Frequent Elements (347)”, there should be no reason to use a heapq, we can just directly sort based on counted values.

Disambiguating Sliding window vs prefix/suffix accumulation #

  • Sliding window vs prefix/suffix accumulation is a common pitfall
    • If the problem wants you to process variable subarrays, or is about max-min over all partitions (not just windows that move by 1), sliding window is probably not a fit.

Decision Flow #

  • Is it lookup/search? → HashMap

  • Is it range sum? → Prefix sums

  • Do we care about some properties of subsequences (where the contiguity not important BUT the relative order matters) -> consider using prefix/suffix counting, which may be a state-tracking kind of counting as well.

  • Is the problem about a ‘window’ (i.e. strict subarray of size K / variable) or is it accumulating something from start / end

    • if window: most likely Sliding window
    • if accumulation / arbitrary splits then it’s most likely: Prefix/Suffix tracking
  • Char counting / Freq counting \(\rightarrow\) use fixed array counter if is fixed space else use collections.Counter

Style, Recipes, Boilerplate #

  1. Better to early return instead of more code-golf like solutions
  2. love the use of defaultdict, careful on the choice of keys (must be hashable)
  3. use genexps for intermediate lists where possible