Topic 1: Arrays & Hashing

Table of Contents

This includes prefix-sum type questions as well

Canonical Questions #

String simulations #

These problems involve doing some operations on strings. The key idea here is that we should be exploiting string-specific APIs that python provides. Other things could be that instead of actually doing the hard operations, it might make sense to do soft simulations of the applications to solve some of these problems.

Problems #

Process String with Special Operations I (3612)

I have an on optimal solution that works well enough for the constraints by actually carrying out the processes.

Optimisations would be amortising things (soft deletes, lazy reversal flags) OR some other way to simulate without the memory overheads.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
   class Solution:
       def processStr(self, s: str) -> str:
           """
           It's a string state management exercise, might need to find ways to do this efficiently.

           Okay this works, a better approach is likely to amortise the expensive operations, keep a "is_reversed" flag and then try not to copy over on the reversed operations and just pop and all that when we need, I guess a soft delete would work as well, and we can filter it out at the end when joining the list into a string.
           """
           res = []
           for char in s:
               if char == "#":
                   res += res
                   continue
               if char == "%":
                   res = res[::-1]
                   continue
               if char == "*":
                   res = res[:-1]
                   continue

               res.append(char)

           return "".join(res)

Process String with Special Operations II (3614)

Here, we will observe that if we actually carry out the operations, they are too expensive. Therefore, our objective should be to do soft simulations on these operations. This is a reverse simulation that we’d need to do, so we can do a 2 pass approach on this. First one is to find out the final length of the output string, then we will use the inverse of each process and do some logic there.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
   class Solution:
       def processStr(self, s: str, k: int) -> str:
           """
           This is virtual simulation of the string operations, instead of directly simulating it.
           We won't construct the string and operate on it, we will work backwards, doing the inverse operations.
           We shall do 2 passes:
           1. find the final length, early return if not satisfactory
           2. in reverse order of the input, we carry out inverse operations. we can early return when the char count matches k.
           """
           # first pass: final length calculation
           size = 0
           for c in s:
               if c.islower():
                   size += 1
               elif c == "*" and size > 0: # remover, pops 1 out
                   size -= 1
               elif c == "#": # duplicator
                   size *= 2

           # early returns if out of bounds index
           if not (0 <= k < size):
               return '.'

           # 2nd pass, inverse simulation:
           for c in reversed(s):
               if c.islower():
                   size -= 1
                   # found the char:
                   if k == size:
                       return c
               elif c == "*": # inverse of poping is pushing
                   size += 1
               elif c == "#":  # inverse of doubling is halving, but need to be careful if we need to rotate k
                   size //= 2
                   if k >= size: # is in theright half
                       k -= size
               elif c == "%": # need to flip it:
                   k = size - 1 - k

           return "."

Sort Vowels in a String (2785)

This is just a simple string manipulation, we’re just focused on constructing something efficiently. My solution uses a Counter and a generator (iterator) to spit out the next values for vowels, when we come across them. Some useful tips for this would be to directly use sets instead of lists for such direct membership checks with characters.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
   class Solution:
       def sortVowels(self, s: str) -> str:
           vowels = set("aeiouAEIOU") # set allows for fast membership checks
           sorted_vowels = sorted([ch for ch in s if ch in vowels])
           res = []
           idx = 0
           for ch in s:
               if ch in vowels:
                   res.append(sorted_vowels[idx])
                   idx += 1
               else:
                   res.append(ch)
           return "".join(res)

Prefix sums / Subarray Sum = K (prefix sum + hashmap) #

The key idea for such problems is to rely on preprocessed prefix/suffix arrays to help make local decisions better.
- These problems may also deal with contiguous segments (subarrays) and searching for some condition to be met.
- The idea here is that we can create prefix sums and that can be our target that we might want to search for, so it’s also like a 2sum searchability problem
- Also classic use when we care about range sums and the like.
this type of problem is actually also a DP problem. The main reason why is that the order of filling matters for us (so that we avoid double counting).

Problems #

Product of Array Except Self (238)

This is a classic problem. It’s just a prefix / suffix processing that we need to do. It’s just a 2 pass that we do here. The intuition is that we need to rely on partial running subs from left/right.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
   class Solution:
       def productExceptSelf(self, nums: List[int]) -> List[int]:
           n = len(nums)
           output = [1] * n

           # handle the prefixes first
           prefix = 1
           for i in range(n):
               output[i] = prefix
               prefix *= nums[i]

           # now handlethe suffixes
           suffix = 1
           for i in range(n - 1, -1, -1):
               output[i] *= suffix
               suffix *= nums[i]

           return output

Trionic Array I (3637)

This just does a single pass of 3 monotonic regions and ensures other things such as the length of the segments are appropriate. The solution presented below just does a direct sweep for things instead of actually using prefix/suffix sums.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
   class Solution:
       def isTrionic(self, nums: List[int]) -> bool:
           """
           3 partitions, they are strictly increasing, strictly decreasing, strictly increasing.

           consider the boundaries
           a, +, a, -/0,b, -,b, -, b, +/0, c, +, c
           boundaries can be 0 or n

           We also realise that we need to form 3 segments, we aren't looking for turning points necessarily, we are looking for 3 segments.

           m1 failed: if I just use triples
           """
           n = len(nums)
           if n < 4:
               return False # can't form 2 segments with less than 4 elements

           # 1: find end of strictly increasing segment:
           p = 0
           while p < n - 1 and nums[p] < nums[p + 1]:
               p += 1
           # check if failure:
           if p == 0:
               return False

           # 2: find end of strictly decreasing:
           q = p
           while q < n - 1 and nums[q] > nums[q + 1]:
               q += 1

           # for us to end, the final segment must have at least one element and q must have moved from p
           if q == p or q == n - 1:
               return False

           # 3: check final segment is strictly decreasing:
           i = q
           while i < n - 1 and nums[i] < nums[i + 1]:
               i += 1

           # i should be n - 1 (reached the end)
           return i == n - 1

Trionic Array II (3640)

This is a nifty use of prefix sums. I think this is one of the rare instances where most of the work is to determine what to precompute.

We should think of it as a search for ALL the trionic subarrays, which we shall use precomputed prefix sums for. The rough idea is to look for 3 segments (formed by 2 turning points, one peak, one trough). We shall move a pointer representing the middle peak idx, then once we find it, we shall attempt to expand rightward to find the trough then we shall use the precomputed suffix array to get the best right hand side value.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
   class Solution:
       def maxSumTrionic(self, nums: List[int]) -> int:
           """
           Scan for 3 segments: A: strictly increasing prefix,
           B: strictly decreasing middle, C: strictly increasing suffix.

           Precompute for every index:
​               - inc_sum_left: max sum of a strictly increasing subarray ending at each position (from the left)
​               - inc_sum_right: max sum of a strictly increasing subarray starting at each position (from the right)

           The main pass looks for peaks, then greedily extends the decreasing segment as much as possible.
           Then checks that after the decreasing segment, another increase exists -- if so, computes the sum by
           joining the 3 regions.
           """

           n = len(nums)

           inc_sum_left = [0] * n  # max sum of strictly increasing subarray ending at each i (left-to-right)
           inc_sum_right = [0] * n # max sum of strictly increasing subarray starting at each i (right-to-left)

           inc_sum_left[0] = nums[0]
           inc_sum_right[-1] = nums[-1]

           # Precompute left-to-right strictly increasing sums (for segment A)
           for i in range(1, n):
               if nums[i] > nums[i-1]:
                   inc_sum_left[i] = max(inc_sum_left[i-1] + nums[i], nums[i])
               else:
                   inc_sum_left[i] = nums[i]

           # Precompute right-to-left strictly increasing sums (for segment C)
           for i in range(n-2, -1, -1):
               if nums[i] < nums[i+1]:
                   inc_sum_right[i] = max(inc_sum_right[i+1] + nums[i], nums[i])
               else:
                   inc_sum_right[i] = nums[i]

           i = 0
           ans = float("-inf")

           while i < n:
               # Looking for a peak: left increasing, peak, then decreasing middle
               if i+2 < n and nums[i] < nums[i+1] and nums[i+1] > nums[i+2]:
                   curr_sum = 0 # curr_sum shall hold the sum of the decreasing segment.
                   j = i+1
                   # Expand the strictly decreasing region (middle segment B)
                   while j+1 < n and nums[j] > nums[j+1]:
                       curr_sum += nums[j]
                       j += 1

                   # If another increasing segment exists after the trough
                   if j+1 < n and nums[j] < nums[j+1]:
                       curr_sum += nums[j]
                       trionic_subarray_sum = inc_sum_left[i] + curr_sum + inc_sum_right[j+1]
                       ans = max(ans, trionic_subarray_sum)

               i += 1

           return ans

Maximum Number of Subsequences After One Inserting (3628)

We are asked for max number of subsequences wherein the elements don’t have to be contiguous but need to preserve relative rank order. This is what asks us to try doing a prefix / suffix counting approach.

We keep prefix and suffix partial subsequences COUNTS for every index and then think how to use the insertion. For the insertion, we can insert L, C or T. For L and T there’s a single best case (add to leftmost and rightmost respectively) for C, there’s a need to comb through.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
   class Solution:
       def numOfSubsequences(self, s: str) -> int:
           """
           For this, the first thought that comes to mind is to consider prefix counts of partial subsequences.

           So for each i, we should try count all
​           - # of "L"s until that point
​           - # of "LC"s until that point
​           - # of "LCT"s until that point

           These partial counts will be useful for us to get the final values without insertion.

           Now we consider the insertion. The question is probably, which partial sum is the highest jump so that we can add it there? Not too sure how to factor in the "at most one insertion" part.

           For the insertions, what we insert matters for the cases:
           1. L: should only be inserted at the start since it's the first (for max benefit)
           2. C: should be inserted at any points midway to do the L_T properly
           3. T: should only be inserted at the end (mirrors argument for case 1)

           We take the max gain from these 3 choices
           """

           n = len(s)
           # precompute prefix counts for L and LC subsequences
           preL = [0] * (n + 1)
           preLC = [0] * (n + 1)
           for i in range(n):
               preL[i + 1] = preL[i] + (1 if s[i] == 'L' else 0)
               preLC[i + 1] = preLC[i] + (preL[i] if s[i] == 'C' else 0)

           # precompute suffix counts for C, CT pairs, and T (the C and T help us form intermediates to get CT)
           sufC = [0] * (n + 1)
           sufCT = [0] * (n + 1)
           sufT = [0] * (n + 1)
           for i in reversed(range(n)):
               sufC[i] = sufC[i + 1]
               sufCT[i] = sufCT[i + 1]
               sufT[i] = sufT[i + 1]
               if s[i] == 'T':
                   sufT[i] += 1
               elif s[i] == 'C':
                   sufC[i] += 1
                   sufCT[i] += sufT[i]

           # count initial number of "LCT" subsequences
           countL = countLC = countLCT = 0
           for c in s:
               match c:
                   case 'L':
                       countL += 1
                   case 'C':
                       countLC += countL
                   case 'T':
                       countLCT += countLC

           res = countLCT

           # insert L at start: adds all "CT" pairs
           res = max(res, countLCT + sufCT[0])

           # insert T at end: adds all "LC" pairs
           res = max(res, countLCT + preLC[n])

           # insert C at every position: max over L before * T after
           for i in range(n + 1):
               res = max(res, countLCT + preL[i] * sufT[i])

           return res

Sequences #

Problems #

Longest Consecutive Sequence (128): it’s more about the elements forming an consecutive sequence

this is more about finding out whether we have a “noop ladder” like in ROP hacking. Solution: create a set of nums (\(O(n)\)), then for each num, check if it’s the starting num (( n - 1 ) not in num_set) then just keep climbing up that ladder until no more, keep track of max length. Remember, only try to build sequences from the start of a sequence.

This is such a simple way of looking at the problem, our first reach should NOT be some kind of interval merging or something. The input variables already hint that it’s a linear sweep of some sort.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
   class Solution:
       def longestConsecutive(self, nums: List[int]) -> int:
           num_set = set(nums)
           longest = 0

           for n in num_set:
               # if this n is the start of the "noop ladder":
               if n - 1 not in num_set:
                   length = 1

                   # build sequences from a new start:
                   while n + length in num_set:
                       length += 1

                   longest = max(longest, length)

           return longest

NO Dynamic Range Querying using Fenwick Trees (Binary Index Trees) #

CLOSING NOTE [2025-12-29 Mon 11:10]
I stopped prepping for it, I’ll just skip this task until I need to come back to interview prep again
this kind of problems deal with needing to do prefix sums and also updating them efficiently. They are also extremely niche and a bunch of hard questions in the more recent leetcode questions only accept such solutions. Typically, we can still achieve partial solutions using DP approaches, but will likely timeout unless fenwick trees are used.

Problems #

Number of Integers with Popcount-Depth Equal to K II (3624)

The key idea here is that we need to accumulate interval counts of the popcount depths for depths in range [0, 5], so we have 6 different fenwick trees to track this. Querying comes in 2 forms: range queries and update queries and both can be handled in \(O(log n)\) time thanks to the use of fenwick trees.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
    from typing import List
    from functools import cache

    class FenwickTree:
        def __init__(self, n):
            self.n = n
            self.fw = [0]*(n+1)

        def update(self, i, delta):
            i += 1
            while i <= self.n:
                self.fw[i] += delta
                i += i & (-i)

        def query(self, i):
            i += 1
            s = 0
            while i > 0:
                s += self.fw[i]
                i -= i & (-i)
            return s

        def range_query(self, l, r):
            return self.query(r) - (self.query(l-1) if l > 0 else 0)

    class Solution:
        def popcountDepth(self, nums: List[int], queries: List[List[int]]) -> List[int]:
            @cache
            def get_depth(x):
                depth = 0
                while x != 1:
                    x = x.bit_count()
                    depth += 1
                return depth

            n = len(nums)
            max_depth = 5

            depths = [get_depth(x) for x in nums] # stores the current depths of each num, idxed by the nums idx
            # we build a fenwick tree, each tree will contain the prefix sum for the number of nums with depth = k (fixed) within
            fenwicks = [FenwickTree(n) for _ in range(max_depth+1)]

            # Initialize Fenwicks
            for i, d in enumerate(depths):
                fenwicks[d].update(i, 1)

            ans = []
            for query in queries:
                match query:
                    case 1, l, r, k:
                        num_indices = fenwicks[k].range_query(l, r)
                        ans.append(num_indices)
                    case 2, idx, val:
                        old_depth, new_depth = depths[idx], get_depth(val)
                        if old_depth == new_depth:
                            continue
                        fenwicks[old_depth].update(idx, -1)
                        fenwicks[new_depth].update(idx, 1)
                        depths[idx] = new_depth

            return ans

Sum of Beautiful Subsequences (3671)

To be honest, I don’t understand this. The DP version will pass 80% of the test cases, I most likely will end up resorting to that. The fenwick tree solutions, I don’t even have an intuition for it.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
    # M1: Fenwick Tree solution (correct, passes all)
    class Fenwick:
        def __init__(self, n):
            self.arr = [0] * (n + 1)

        def sum(self, i):
            s = 0
            while i > 0:
                s += self.arr[i]
                i -= i & (-i)
            return s

        def add(self, i, x):
            while i < len(self.arr):
                self.arr[i] += x
                i += i & (-i)

    class Solution:
        def totalBeauty(self, nums: List[int]) -> int:
            MOD = 10**9 + 7
            max_val = max(nums) + 1
            locs = defaultdict(list)
            for idx, val in enumerate(nums):
                locs[val].append(idx)

            F = [0] * max_val
            for d in range(1, max_val):
                # indices of all numbers divisible by d
                indices = sorted(pos for val in range(d, max_val, d) for pos in locs[val])
                if len(indices) <= 1:
                    F[d] = len(indices)
                    continue

                rank = {pos: r for r, pos in enumerate(indices, 1)}  # rank compress
                fenw = Fenwick(len(indices))
                for val in range(d, max_val, d):
                    for pos in reversed(locs[val]):
                        r = rank[pos]
                        new_subseq_count = 1 + fenw.sum(r - 1)
                        F[d] += new_subseq_count
                        fenw.add(r, new_subseq_count)

            for d in range(max_val - 1, 0, -1):
                for multiple in range(d * 2, max_val, d):
                    F[d] -= F[multiple]
                F[d] %= MOD

            return sum((d * F[d]) % MOD for d in range(1, max_val)) % MOD

    # M2::PARTIAL: dp solution:
    from math import gcd
    from collections import defaultdict

    class Solution:
       def totalBeauty(self, nums: List[int]) -> int:
           """
           I think this is a multi step question as well

           First we just find all the subsequences (and the GCDs within them)

           Then we combine via the GCD values and get the beauty scores

           == steps
           {problem decomposition} ==> naive
           1. enum strictly increasing subsequences in the array
           2. for each subsequence, find its gcd
           3. for each GCD, g, count subsequences with gcd = g
           4. calculate

           == key intuition for optimisation:
           1. directly enumerating is costly 2^n, so we have to build up the results.
           2. so we use dp 2D where dp[i][g] is the number of strictly increasing subsequences ending at idx i with gcd g
           """

           MOD = (10 ** 9) + 7
           n = len(nums)
           total_beauty = 0

           dp = [defaultdict(int) for _ in range(n)] # so dp[i][g] gives number of strictly increasing subsequences ending at index i with gcd g.

           for i in range(n):
               dp[i][nums[i]] = 1 # init value, it's gcd 1 for itself
               for j in range(i): # left pointer
                   if nums[j] < nums[i]: # then can accumulate more GCDs
                       for g, count in dp[j].items():
                           new_g = gcd(g, nums[i])
                           new_count = dp[i][new_g] + count
                           dp[i][new_g] = new_count % MOD

               # now we calculate beauty values:
               for g, count in dp[i].items():
                   total_beauty = (total_beauty + (g * count)) % MOD

           return total_beauty

N-Sum Family #

Problems #

N-Sum Family
- 2-sum is about complement finding, pairwise just keep a seen index and use that when searching for complements further down

TODO Sudoku Family #

Sudoku Family
- n-dimensional array access is the main learning point here. When we access blocks, then for each cell, we can get which block they’re from using block_idx = row_idx // 3, col_idx // 3.

TODO Stock Family #

SOE #

General #

Forgetting negatives / zeros in prefix sums \(\implies\) this affects the order of filling / presences of duplicate prefix sums in my list of prefix sums

Jumping the Gun to stdlib thoughts #

Off-by-one in subarray loops
Jumping straight to “pythonic” stdlib uses is NOT good. Just think about the minimal implementation then on the second look at it, think of stdlib uses. Need to avoid cases where we interpret the question wrongly. E.g. for “Top K Frequent Elements (347)”, there should be no reason to use a heapq, we can just directly sort based on counted values.

Disambiguating Sliding window vs prefix/suffix accumulation #

Sliding window vs prefix/suffix accumulation is a common pitfall
- If the problem wants you to process variable subarrays, or is about max-min over all partitions (not just windows that move by 1), sliding window is probably not a fit.

Decision Flow #

Is it lookup/search? → HashMap
Is it range sum? → Prefix sums
Do we care about some properties of subsequences (where the contiguity not important BUT the relative order matters) -> consider using prefix/suffix counting, which may be a state-tracking kind of counting as well.
Is the problem about a ‘window’ (i.e. strict subarray of size K / variable) or is it accumulating something from start / end
- if window: most likely Sliding window
- if accumulation / arbitrary splits then it’s most likely: Prefix/Suffix tracking
Char counting / Freq counting \(\rightarrow\) use fixed array counter if is fixed space else use collections.Counter

Style, Recipes, Boilerplate #

Better to early return instead of more code-golf like solutions
love the use of defaultdict, careful on the choice of keys (must be hashable)
use genexps for intermediate lists where possible