Topic 16: 2D DP

Key-Intuition: Two-dimensional dynamic programming handles problems with sequences compared across two indices, often strings or grids.

Canonical Question Types #

Expect to have some overlaps with DP1 topic.

String DP / Sequence Alignment (LCS, Substring, Palindrome) #

Compare strings for commonalities, optimal matchings

for the string DP family, here’s the neighbour map:

    Interleaving String:  reads ABOVE and LEFT       (two neighbours, no diagonal)
    Edit Distance:        reads ABOVE, LEFT, DIAGONAL (three neighbours)
    LCS:                  reads ABOVE, LEFT, DIAGONAL (three neighbours, max not sum)
    Regex Matching:       reads DIAGONAL, LEFT-LEFT,  ABOVE (three, non-standard)
    Distinct Subseq:      reads ABOVE and DIAGONAL    (two neighbours, no left)

Problems #

Longest Common Subsequence (1143) (a classic) We care about a statistic (length) of the solution, not the actual solution. Also the inputs suggest that it’s something that should be solvable in an \(O(n^{2})\) solution. It involves two strings, so their prefixes are what we should be trying to match.

recursive decomposition

The recursive decomposition is the cleanest entry point.

Compare the last characters of text1[:i] and text2[:j]:

If they match: both characters must be in the LCS (taking them is always at least as good as not taking them). LCS length = 1 + LCS(text1[:i-1], text2[:j-1]).
If they don’t match: at least one of them is not in the LCS. case 1: Try dropping from text1: LCS(text1[:i-1], text2[:j]).
case 2: Try dropping from text2: LCS(text1[:i], text2[:j-1]).
Take the max of the two cases.

This decomposition only refers to strict prefixes of both strings, so the subproblems are smaller and the recursion terminates.

Memoising on (i, j) gives \(O(m*n)\) states each computed in \(O(1)\): that’s the recursive solution.

State definition follows:

dp[i][j] = LCS length between text1[:i] and text2[:j].

Rows index into text1, columns into text2.

Base cases dp[0][*] = dp[*][0] = 0 (empty prefix has LCS 0 with anything).

Recurrence:

     if text1[i-1] == text2[j-1]:  dp[i][j] = 1 + dp[i-1][j-1]
     else:                          dp[i][j] = max(dp[i-1][j], dp[i][j-1])

deriving the state optimisation

Look at what dp[i][j] reads from the 2D table:

Match case: dp[i-1][j-1] — the diagonal cell
Mismatch case:
- dp[i-1][j] — the cell directly above
- dp[i][j-1] — the cell directly to the left

Rolling to one row (dp[j] represents the current row being built):

dp[j-1] is available immediately — just computed, it’s the left cell.
dp[j] before update is the above cell (previous row, same column).
dp[i-1][j-1] — the diagonal — is the tricky one. By the time we compute dp[j], we have already overwritten dp[j-1] with the current row’s value. So dp[i-1][j-1] is gone.
Fix: save dp[j] into a variable prev before overwriting it. After the update, shift prev = old dp[j]. This variable tracks the diagonal value one step behind.

The two-row version (prev_row, curr_row) avoids needing this prev-fix by keeping the old row intact. It’s easier to derive but uses 2x the space. The one-row version with prev is the optimal form — derive the two-row version first if you’re unsure, then collapse it.

This just deals with prefixes and we build up to the eventual strings that we are comparing. It’s definitely 2D because of the 2 Strings and dp[i][j] is the LCS between strings text1[:i] and text2[:j] and our answer will be at dp[len(text1) + 1][len(text2) + 1]. We can do this recursively or iteratively and the iterative solution also has a space-optimised 1D rolling approach we can take.

Recursive solution:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
   # M1: recursive solution:
   from functools import cache
   class Solution:
       def longestCommonSubsequence(self, text1: str, text2: str) -> int:

           @cache
           def helper(first_idx, second_idx):
               if first_idx < 0 or second_idx < 0:
                   return 0

               is_last_same = text1[first_idx] == text2[second_idx]
               if is_last_same:
                   return 1 + helper(first_idx - 1, second_idx - 1)
               else:
                   return max(helper(first_idx - 1, second_idx), helper(first_idx, second_idx - 1))

           return helper(len(text1) - 1, len(text2) - 1)

Iterative solution: table filling can be evolved from a 2D solution into a 1D rolled-up one

basic, 2D table:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
      class Solution:
          def longestCommonSubsequence(self, text1: str, text2: str) -> int:
          # dp[i][j] -> LCS between strings text1[:i] and text2[:j]
          # answer will be at dp[len(text1)][len(text2)]
          # dp = [[[0] * (len(text1) + 1)] for j in range(len(text2) + 1)]
          dp = [[0] * (len(text2) + 1) for _ in range(len(text1) + 1)]


          for i in range(len(text1)):
              for j in range(len(text2)):
                  # is same letter:
                  if text1[i] == text2[j]:
                      dp[i + 1][j + 1] = 1 + dp[i][j]

                  else: # get max of adjacent builds
                      dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])

          return dp[len(text1)][len(text2)]
          # this shows us how we could have just used a rolling 1D array because of the state transitions that we see above.

space optimisation: we can do a rollup by looking at how the state transitions happen, here’s a 2-array approach at the roll-up

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
      # M3: not perfect but 2 1D arrays for a 2D DP solution:
      class Solution:
          def longestCommonSubsequence(self, text1: str, text2: str) -> int:
              m, n = len(text1), len(text2)
              # Use only two rows
              prev = [0] * (n + 1)
              curr = [0] * (n + 1)
              for i in range(1, m + 1):
                  for j in range(1, n + 1):
                      if text1[i - 1] == text2[j - 1]:
                          curr[j] = prev[j - 1] + 1
                      else:
                          curr[j] = max(prev[j], curr[j - 1])
                  prev, curr = curr, prev  # Roll rows (no need to clear curr due to overwrite)
              return prev[n]

but a better way to do the optimisation would have been to have a single 1D array instead, we just need a single history variable, prev for it:
- 1 + dp[i][j] : prev value incremented by 1 \(\implies\) we can just store a prev var to store it
- max(dp[i][j + 1], dp[i + 1][j])

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
   # M2: most efficient, space optimised bottom up approach using a 1D rolling array
   class Solution:
       def longestCommonSubsequence(self, text1: str, text2: str) -> int:
           m, n = len(text1), len(text2)
           dp = [0] * (n + 1)  # dp[j] = LCS length for text1 prefix up to i and text2 prefix up to j

           for i in range(1, m + 1):
               prev = 0  # This will hold dp[j-1] from previous iteration (the previous diagonal)
               for j in range(1, n + 1):
                   temp = dp[j]  # Save current dp[j] which will become prev in next iteration
                   # case 1: the two are the same characters => both of them should be in
                   if text1[i - 1] == text2[j - 1]:
                       dp[j] = prev + 1
                   # case 2: the two are different, we either include this or exclude this, whichever is better
                   else:
                       dp[j] = max(dp[j], dp[j - 1])
                   prev = temp  # Update prev for next j
           return dp[n]

The space optimisations are nifty but we can pass test-cases without it, the 2D DP approach is sufficient.

Disambiguating this from LIS

LIS asks: within a single array, find the longest subsequence where values are strictly increasing. The constraint is on the values (monotone ordering).

LCS asks: across two strings, find the longest subsequence present in both. The constraint is on identity (same characters in same relative order). No ordering constraint on the values themselves.

The algorithms look superficially similar (both are 1D or 2D DP on indices) but the recurrences are different:

the state-transition:
- LIS has a transition that checks ordering (nums[i] > nums[j]);
- LCS has a transition that checks character identity (text1[i] == text2[j]).
existence of a sub-quadratic general solution
- LIS is also solvable in O(n log n) via patience sorting;
- LCS has no known sub-quadratic general solution.

extensions – e.g. tracing the actual answer

what changes if the problem asks for the actual LCS string rather than its length.

The answer: The DP table is the same, but you reconstruct by backtracking from dp[m][n] — at each cell, if the characters matched you go diagonal and prepend the character; if they didn’t match you go to whichever neighbour is larger.

This is a natural extension and worth a three-line note since LC variants (like LC 1092 Shortest Common Supersequence) require it.

Edit Distance (72)

Recognition signals:
- two strings, minimise cost of some edit operations
- operations defined as insert/delete/replace or subset thereof
- “convert”, “transform”, “minimum operations” language
Anti-signals:
- counting (not minimising) → different recurrence shape
- single string → likely 1D DP or interval DP
- subsequence (not full string) → LCS family not edit distance
speed layer:
- “I define dp[i][j] as the minimum edits to convert word1[:i] to word2[:j]. Each cell reads from three neighbours: left (insert), above (delete), diagonal (replace or free match). Base cases: first row and column are 0 to n deletions/insertions. Fill left-to-right, top-to-bottom. Answer at dp[m][n].”
- the state definition and filling order are fast once you’ve internalised the neighbour mapping. The implementation is boilerplate from LCS. The only slowdown risk is confusing which neighbour is insert vs delete.

We have a fixed set of operations that we can do (insert, delete, replace), with strings it’s always a good idea to just consider building up from smaller subproblems and the smaller subproblems is usually defined as string prefixes (and we build them up). This naturally gives a 2D table, which we can then look for space optimisations for.

Since we have two strings to compare, their prefixes end up being 2 dimensions hence we start off with a 2-D table.

Arriving at this intuition

question analysis:
- we care about minimising the number of ops – it’s an optimisation problem
- we’re working with 2 strings, so immedidately we consider their prefixes. 2 strings – 2 dimensions to consider first
- feels like 2D DP table to fill – with \(n \le 500\), 2D table seems reasonable of a first-attempt
DP table meaning:
- dp[i][j] min number of ops to go from word1[:i] to word2[:j].
- we do 1-indexed, so our indices will go till (m + 1) and (n + 1). This is because they are exclusive slice-params (pythonic)
- base cases:
  - all the [i][0] cells will just be i because it’s about just deletions from word1
  - all the [0][j] cells will just j because it’s just additions
- actual recurrence:
  - case 1: if characters are the same, then copy over from the diagonal
  - case 2: consider the 3 operations and figure out where to build it from
    1. insertion
    2. deletion
    3. replacement

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
      # without space optimsation, 2D table:
      class Solution:
          def minDistance(self, word1: str, word2: str) -> int:
              n, m = len(word1), len(word2)

              # dp[i][j] shows min number of operations to use to convert word1[:i] to word2[:j]
              dp = [[0] * (m + 1) for _ in range(n + 1)]

              for i in range(n + 1):
                  dp[i][0] = i # delete all from word1 to get empty word2

              for j in range(m + 1) :
                  dp[0][j] = j # insert all to convert empty word1 to word2[:j]

              for i in range(1, n + 1): # because these are exclusive slice params
                  for j in range(1, m + 1):
                      if word1[i - 1] == word2[j - 1]:
                          dp[i][j] = dp[i - 1][j - 1]
                      else:
                          insertion = dp[i][j - 1] + 1
                          deletion = dp[i - 1][j] + 1
                          replacement = dp[i - 1][j - 1] + 1
                          dp[i][j] = min(insertion, deletion, replacement)

              return dp[n][m]

On inspecting the filling order, which is apparent from the recurrence, we realise that we can just keep a snapshot row and a working row (so 2 1-D arrays) instead of the table and this roll-up would be an effective space-optimisation.

understanding the state derivations and filling-order for the 2D approach

The way to make the recurrence stick is to think operationally about what each cell represents. dp[i][j] is the minimum cost to convert word1[:i] into word2[:j]. Now you’re at position (i, j) and you want to fill it.

What’s your last operation? Table-filling intuition: how we map from the operations to the table filling data dependencies:

You insert word2[j-1] at the end of your working string. Before this insert, you had converted word1[:i] into word2[:j-1].
Cost: dp[i][j-1] + 1
You delete word1[i-1]. Before this delete, you had converted word1[:i-1] into word2[:j].
Cost: dp[i-1][j] + 1.
You replace word1[i-1] with word2[j-1]. Before this replace, you had converted word1[:i-1] into word2[:j-1].
Cost: dp[i-1][j-1] + 1.
If word1[i-1] = word2[j-1]=: no operation needed at this position.
Cost: dp[i-1][j-1] (the diagonal, free step).

intuiting this space roll-up optimisation: keeping a snapshot row

The idea for the space optimisation comes from the fact that we just need to keep track of 2 rows from our state transitions that we observe – pretty much a snapshot approach.

it’s from the recurrence relation from this part:

      insert:  come from the LEFT    dp[i][j-1]
      delete:  come from ABOVE       dp[i-1][j]
      replace: come from DIAGONAL    dp[i-1][j-1]
      match:   come from DIAGONAL    dp[i-1][j-1]  (free)

so we see how we just need two rows that we can keep alternating between building and reference row (that’s why we have the swapping done below). Also note that the implementation is cleaner because we don’t need explicit base case table-filling.

so we get:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
      # space optimisations with 2 rolling rows:
      class Solution:
          def minDistance(self, word1: str, word2: str) -> int:
              n, m = len(word1), len(word2)
              if n < m:
                  # ensure n >= m to use less space
                  return self.minDistance(word2, word1)

              previous_row = list(range(m + 1))
              current_row = [0] * (m + 1)

              for i in range(1, n + 1):
                  current_row[0] = i
                  for j in range(1, m + 1):
                      if word1[i - 1] == word2[j - 1]:
                          current_row[j] = previous_row[j - 1]
                      else:
                          insert_op = current_row[j - 1] + 1
                          delete_op = previous_row[j] + 1
                          replace_op = previous_row[j - 1] + 1
                          current_row[j] = min(insert_op, delete_op, replace_op)

                  # swap them
                  previous_row, current_row = current_row, previous_row


              return previous_row[m]

why it’s valid to do the swapping trick

the if n < m: swap trick in the space-optimised version is correct and elegant

we’re ensuring the shorter string is the one that defines the column dimension, minimising the array size.

This is valid because: edit distance is symmetric (minDistance(a, b) = minDistance(b, a)=), so the swap is safe.

problem transformations

Delete-only variant (LC 583 Delete Operation for Two Strings):
you can only insert/delete, not replace. Equivalent to: len(word1) + len(word2) - 2 * LCS(word1, word2).
The insight: the optimal strategy is to keep the LCS and delete everything else from both strings.
This connects edit distance directly to LCS — they’re dual views of the same state space.
One operation only:
if you restrict to only insertions (or only deletions), the problem reduces to LCS-based counting.
This is a useful mutation to know because it shows that LCS is embedded inside edit distance.
What breaks with large strings: \(O(mn)\) space for the 2D table.
The two-row rolling you already have solves this.
Further: can you get O(min(m,n)) space with one row?
Yes — the same single-row trick from LCS applies here, but you need a prev variable for the diagonal exactly as in LCS. The pattern is identical.

Regular Expression Matching (10)

Recognition signals:
- two strings, one is a pattern with special semantics
- boolean output (match or not)
- special character can match zero occurrences (forces DP, kills greedy)
Anti-signals:
- both strings are plain text → LCS or edit distance family
- counting matches (not boolean) → different recurrence shape
- only ‘.’ wildcard, no ‘*’ → trivial scan, no DP needed
speed layer:
- recognition heuristic: two strings, one is a regex-style pattern, boolean output → string alignment DP. The * character → you need the two-case split. That’s the 30-second tell.
- comms: “I define dp[i][j] as whether p[:j] matches s[:i]. For plain characters or ., I *advance both indices *— it’s the diagonal. For x*, two cases: zero occurrences skips the pair back by 2; one-or-more consumes s[i-1] and stays at j in the pattern, reading from above. Base case handles patterns that collapse to match the empty string via consecutive x* pairs.”
- INTERNALISE: state definition is fast, but the * one-or-more transition (stay at j, read above) needs to be internalised, not reconstructed.

Intuiting the solution

analysis:
- we output a boolean for satisfiability, whether it’s a full match or not
- we’re dealing with 2 strings, one is a pattern and one is a string – this means we should consider substring prefixes and build our way to the final string
- input constraints look generous, with n = 20 and possible 2D table-filling to be reasonable still
- implicit rule:
  - * never appears as the first character of a valid pattern (per problem constraints). Our code implicitly relies on this — p[j-2] would be index -1 if j=1 and p[0]='*'. We don’t guard against it because the constraints guarantee it won’t happen.
Understanding the recurrence cases: the operations guide our visualisation of the recurrence
- .: i.e. if p[j - 1] = s[i - 1] or p[j - 1] = '.' state change: dp[i][j] = dp[i - 1][j - 1]
- *: there’s 2 cases for this:
  1. it could be a zero occurrence:
    state change: dp[i][j] = dp[i][j - 2]
  2. it could be a one or more occurrence: i.e. if p[j - 2] = s[i - 1] or p[j - 2] = '
    state change: dp[i][j] = dp[i][j] or dp[i - 1][j]
    1. the dp[i - 1][j] is non-obvious: it means “pattern p[:j] matches s[:i-1]” which is the same pattern position, one fewer string character. If that’s true, and p[j - 2] matches s[i - 1] then the * is “consuming” s[i - 1] as one more match. WE don’t move the pattern index because x* can still match further characters to the right of s[i - 1].
      this is the key learning: staying at j allows the * to match arbitrarily many characters by repeatedly consuming from s without advancing in p.
- neighbour map:
```
        plain match / '.':   read from DIAGONAL    dp[i-1][j-1]
        '*' zero case:       read from LEFT-LEFT   dp[i][j-2]
        '*' one+ case:       read from ABOVE       dp[i-1][j]
```

base cases: the base cases are such:
```
      dp[0][0] = True # empty pattern will match empty string
      # NOTE: these "trivial" inputs are important. --> j is the pattern idx
      # inits for empty string and patterns like a*, a*b*, and so on
      for j in range(2, m + 1):
          if p[j - 1] == '*' and dp[0][j - 2]:
              dp[0][j] = True
```
- dp[0][j] asks “does p[:j] match the empty string?”
  Patterns like a*, a*b*, .* can match empty strings by using their * to mean zero occurrences. The condition dp[0][j-2] checks that the pattern up to two positions back (the character before ) also matched empty — meaning the entire left portion can be zero-collapsed. Why j-2 and not j-1? Because * always appears with its preceding character as a pair (x*), so skipping x* means *going back 2 positions, not 1. The pattern structure forces pairs.

Ruling out canonical candidates – why not greedy

Canonical candidates worth ruling out: greedy/two-pointer is tempting because regex matching feels like a scan.

It fails immediately when * appears — * can match zero occurrences, which means you might need to “skip” pattern characters without consuming string characters.

That lookahead-or-skip structure kills any greedy approach beacuse we need to consider all substructures, can’t make local decisions, forgoing future substructures. DP wins because every dp[i][j] is computed once and the overlapping subproblem structure is clear.

The * character is the entire difficulty. Without it, this is just a character-by-character scan (boring). With it, you get the two-case split that defines the whole recurrence.

the * cases, from first-principles

Here’s our decision-tree:

You are at dp[i][j] and p[j-1] = ‘*’=.

The pair x* means “zero or more occurrences of x.”

Case A — zero occurrences: pretend x* doesn’t exist.
This means p[:j] matching s[:i] is equivalent to p[:j-2] matching s[:i].
State Transition: dp[i][j] = dp[i][j-2]
Case B — one or more occurrences:
key question that unlocks Case B:
“why don’t I advance j?”
Answer:
because x* can still eat more characters. Staying at j lets the DP naturally handle two-occurrences, three-occurrences, etc. through repeated application of Case B as i decreases.
x* is consuming s[i-1].
For this to be valid, x must match s[i-1] (either x == s[i-1] or x == '.').
If valid, does p[:j] match s[:i]?
\(\implies\) does p[:j] match s[:i-1]? (same pattern, one fewer string char consumed)
\(\implies\) dp[i-1][j]
State Transition: dp[i][j] |= dp[i-1][j]

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
   class Solution:
       def isMatch(self, s, p):
           n, m = len(s), len(p)
           dp = [[False] * (m + 1) for _ in range(n + 1)]
           dp[0][0] = True

           # base case: pattern matches empty string only via x* pairs collapsing to zero
           for j in range(2, m + 1):
               if p[j-1] == '*' and dp[0][j-2]:  # x* collapses, check left portion also collapses
                   dp[0][j] = True

           for i in range(1, n + 1):
               for j in range(1, m + 1):
                   if p[j-1] == '.' or p[j-1] == s[i-1]:
                       dp[i][j] = dp[i-1][j-1]          # match: advance both
                   elif p[j-1] == '*':
                       dp[i][j] = dp[i][j-2]             # zero occurrences: skip x*
                       if p[j-2] == '.' or p[j-2] == s[i-1]:
                           dp[i][j] |= dp[i-1][j]        # one+ occurrences: consume s[i-1], stay at j
                   # else: mismatch, dp[i][j] stays False
           return dp[n][m]

problem extensions

TODO Wildcard Matching (LC 44): ? matches exactly one character (like .), * matches any sequence of characters including empty.
Simpler than LC 10 because * in LC 44 is standalone (not attached to a preceding character).
The zero-or-more transition becomes dp[i][j] |= dp[i-1][j] | dp[i][j-1] — * can expand into s or stay put. No j-2 jump because there’s no paired character.
What breaks with ** in the pattern:
the problem constraints disallow this (no two consecutive *).
If allowed, p[j-2] could itself be *, and the zero-occurrence case would still work but the one-or-more case would need guarding. The constraint is load-bearing.
Counting all matches instead of boolean:
replace |= with += and the False/True initialisation with 0/1. The recurrence shape is identical. This is how Distinct Subsequences (LC 115) relates — it’s a counting variant of the same two-string prefix DP with directional transitions.

Interleaving String (97)

Recognition signals:

three strings: two sources, one target
“interleave”, “merge”, “combine maintaining relative order”
boolean output
len(s1) + len(s2) == len(s3) check is the first guard

Anti-signals:

counting interleavings (not boolean) → same recurrence, replace boolean OR with integer addition
single source → not interleaving, different family
subsequence (not interleaving) → LCS / distinct subseq family

Analysis

Key insight that helps derive the recurrence:
- if s3[:i+j] is a valid interleaving of s1[:i] and s2[:j], then the last character of s3[:i+j] came from either s1[i-1] or s2[j-1].
- That binary choice is exhaustive — no other option exists given the interleaving constraint. The recurrence falls out immediately.

We need to look at prefix strings and ask if they can be used to form the prefix string for target. So dp[i][j] asks: “Can I form the prefix of s3 of length i + j using prefixes of s1 and s2 of lengths i and j respectively?” And we build small to big. Space optimisations are obvious when we consider the state transition equation and we realise that we can use a rolling 1D array.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
      # M1: 2D DP, no space optimisation
      class Solution:
          def isInterleave(self, s1: str, s2: str, s3: str) -> bool:
              m, n = len(s1), len(s2)
              if (m + n) != len(s3):
                  return False

              dp = [[False] * (n + 1) for _ in range(m + 1)]
              dp[0][0] = True

              # init first column (only s1):
              for i in range(1, m + 1):
                  dp[i][0] = dp[i - 1][0] and s1[i - 1] == s3[i - 1]

              # init first row (only s2):
              for j in range(1, n + 1):
                  dp[0][j] = dp[0][j - 1] and s2[j - 1] == s3[j - 1]

              # fill the rest:
              for i in range(1, m + 1):
                  for j in range(1, n + 1):
                      dp[i][j] = (
                          dp[i - 1][j]
                          and s1[i - 1] == s3[i + j - 1]
                      ) or (
                          dp[i][j - 1]
                          and s2[j - 1] == s3[i + j - 1]
                      )

              return dp[m][n]

base case derivation

The base cases follow from the same “last character came from s1 or s2” logic, degenerated:

First column (j=0): only s1 is available. dp[i][0] is true iff dp[i-1][0] is true and s1[i-1] = s3[i-1]=. The chain must hold all the way from the top — one mismatch anywhere kills it. This is correctly implemented.

First row (i=0): symmetric, only s2 available.

The key observation:

The base case initialisation is just the recurrence applied with one dimension fixed at zero. They’re the degenerate version of the main recurrence. This is worth noting because it means we can derive base cases on the fly rather than recalling them.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
      # M2: Space optimised, 1D rolling array
      class Solution:
          def isInterleave(self, s1: str, s2: str, s3: str) -> bool:
              m, n = len(s1), len(s2)
              if (m + n) != len(s3):
                  return False

              dp = [False] * (n + 1) # in terms of s2
              # trivial fill: nothingness is trivially true
              dp[0] = True

              # trivial fill first row, init first row values:
              # round 1 fill first row
              for j in range(1, n + 1):
                  dp[j] = dp[j - 1] and s2[j - 1] == s3[j - 1]

              # for prefixes of s1, this is what we are doing inclusions on
              for i in range(1, m + 1):

                  # updates dp[0] for column zero (only s1 to be used as prefix to form s3)
                  dp[0] = dp[0] and s1[i - 1] == s3[i - 1]

                  for j in range(1, n + 1):
                      dp[j] = (
                              ( # case A: last letter in s1 matches last letter in s3
                                  dp[j]
                                  and s1[i - 1] == s3[i + j - 1]
                              )
                              or
                              ( # case B: last letter in s2 matches last letter in s3
                                  dp[j - 1]
                                  and s2[j - 1] == s3[i + j - 1]
                              )
                      )

              return dp[n]

The flattening logic is important to understand:

from v1, we notice that dp[i][j] is caculated as such:
1
2
3
4
5
6
7
          dp[i][j] = (
              dp[i - 1][j]  # prev one matched
              and s1[i - 1] == s3[i + j - 1] # matching char
          ) or (
              dp[i][j - 1]  # prev one matched
              and s2[j - 1] == s3[i + j - 1] # maching char
          )
So it only depends on:
previous row, same column (we filled top to bottom)
same row, previous column (we fill left to right)
So we can stick to the inner loop being left to right.
Important: When updating dp, iterate over j from left to right (0 to n) because dp[j] corresponds to dp[i][j] and previous state dp[j-1] is still available in this iteration (the current fella).
So in the flattened version:
dp[j] in the inner loop corresponds to dp[i][j] in the 2D version.
Before the inner loop starts, dp[j] holds values from the previous row (i-1).
We update dp[j] using those old values and dp[j-1] (from current row so far).
This update order and rolling over 1D dp ensure the logic is preserved exactly.

problem extensions

Count distinct interleavings instead of boolean: replace or with + and True/False with 0/1. dp[i][j] becomes the number of ways to form s3[:i+j]. Base case dp[0][0] = 1. The rest is identical.
What if relative order within each source doesn’t need to be preserved? Then it degenerates to a simple character count check — just verify s1 + s2 has the same character frequencies as s3. O(n) no DP needed. Recognising this degeneration is a good mutation to have in your head.
What if there are k source strings instead of 2? The state space becomes k-dimensional and the recurrence has k cases. For k=3 you’d have dp[i][j][k] with a similar neighbour structure. The length guard becomes len(s1) + len(s2) + ... = len(target)=.

Interval DP and Range Queries 💪 #

Subarray interval selection/chaining for optimal costs
Recognition signals:
- property of a substring (contiguous) is recursively defined on strictly smaller substrings
- need to answer substring-property queries later (not just once)
- “palindrome”, “valid”, “balanced”, “partitionable” applied to spans
Anti-signals:
- only need one answer, no future queries → expand-center or greedy
- subsequence (non-contiguous) → different DP family (LCS, LIS)
- the property isn’t recursively decomposable on inner substrings
Interval DP filling order matters, there’s a common insight on filling order.
- typically we do the outer iteration using increasing order of the window size (fill the smaller window first) – it’s not the table row/cols that we prioritise, it’s the window size we prioritise by choosing the filling order.

Problems #

Longest Palindromic Substring (5)

Pattern Recognition and Extraction

This is an interval DP question Recognition signals:

property of a substring (contiguous) is recursively defined on strictly smaller substrings
need to answer substring-property queries later (not just once)
“palindrome”, “valid”, “balanced”, “partitionable” applied to spans

Anti-signals:

only need one answer, no future queries → expand-center or greedy
subsequence (non-contiguous) → different DP family (LCS, LIS)
the property isn’t recursively decomposable on inner substrings

walk-through of intuition for the basic Interval 2D DP

analysis
1. we care about contiguous regions here AND they must be palindromic
  - out mind typically goes to start and end markers / window for contiguous spans
  - we also may try to consider just that palindromes have a center (even- and odd-lengthed palindromes, both have centers)
We determine what the substructure looks like based on the problem’s charateristics
- palindrome = <x = s[i]><palindrome><x = s[j]> ==> this is what my dp table needs to help me determine
- trivial cases for palindrome:
  1. when window size = 1 ==> it’s a palindrome
  2. when window size = 2 ==> check if s[i] = s[i + 1] for palindrome
now we figure out how to build the answer:
1. we need to know if s[i + 1] and s[j] is palindrome if s[i] == s[j]
2. to make this decision, we just need to store booleans
3. so, dp[i][j] answers (is start_idx = i, end_idx = j) a palindrome?
now we figure out how to get the answer:
- we just need to know what’s the max window size and corresponding start_idx for it.
- everytime it’s a valid True entry that we are writing:
  1. we can use the corresponding i as a start_idx
  2. we can update the max_window_size
- so the answer would just be: s[start: start + window_size]

Optimal solutions for this

interval DP – better if we need to reuse the boolean table as a reusable artefact

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
        # Interval DP — use when you need the boolean table as a reusable artifact
        def longestPalindrome(self, s):
            n = len(s)
            dp = [[False] * n for _ in range(n)]
            start, max_len = 0, 1
            for i in range(n):
                dp[i][i] = True
            for length in range(2, n + 1):
                for i in range(n - length + 1):
                    j = i + length - 1
                    if s[i] == s[j] and (length == 2 or dp[i+1][j-1]):
                        dp[i][j] = True
                        if length > max_len:
                            start, max_len = i, length
            return s[start:start + max_len]

Key implementation details

filling order matters here. We fill by increasing order of the window size. dp[i][j] reads dp[i+1][j-1], which is the inner substring of length window_size - 2.
- If you fill by increasing length, the inner substring has already been computed when you need it.
- If you fill by row or column (the natural 2D iteration), you’d read dp[i+1][j-1] before it’s filled.
  This is the canonical “interval DP filling order” insight and it’s worth making explicit because it applies to every problem in this family.

expand-center approach – \(O(n²)\) time, \(O(1)\) space faster, space-efficient use for the longest palindrome

core insight for expand-center

Every palindrome has exactly one center (a single character for odd length, a gap between two characters for even length).

Trying all \(n + (n-1) = 2^{n}-1\) centers and expanding outward while characters match gives you every palindrome.

\(O(n²)\) time, \(O(1)\) space.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
     # Expand-center — use when only the longest palindrome is needed
     def longestPalindrome(self, s):
         n = len(s)
         start, max_len = 0, 1
         def expand(l, r):
             while l >= 0 and r < n and s[l] == s[r]:
                 l -= 1; r += 1
             return l + 1, r - 1
         for i in range(n):
             for l, r in [expand(i, i), expand(i, i+1)]:
                 if r - l + 1 > max_len:
                     start, max_len = l, r - l + 1
         return s[start:start + max_len]

TODO Palindromic Substrings (647)
TODO 132: Palindrome partitioning

Burst Balloons (312)

This one was wild, probably because I hadn’t seen many interval DP questions prior to this. The rough idea is to think of complements: we try to see what should be the Last Balloon that should be burst within a particular interval. That should help us build up to an answer because then we manage to split the problem into left and right subintervals.

DP setup: dp[i][j] is the max number of coins obtained by bursting the balloons between idx i and j so nums[i + 1] to nums[j + 1], so we get the recurrence that we if we consider k as the best choice for the last to burst balloon, then we get the coins: dp[i][k] + dp[k][j] + (nums[i] * nums[k] * nums[j]) wherein dp[i][k] and dp[k][j] are the left and right subintervals’ answers. The table filling needs to happen from small to large intervals.

Some tricks: edge-padding is useful.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
    class Solution:
        def maxCoins(self, nums: List[int]) -> int:
            # apply a dp solution to count this.
            # insight: we consider contiguous slices using i, j. consider kth balloon as i < k < j so i burst within this rand and k is the LAST to be burst

            # NOTE: we do the edge padding to make life easy: TRICK: sentinel values get used as padding.
            nums = [1] + nums + [1]
            n = len(nums)

            # dp[i][j] is the max number of coins to get if we burst balloons ONLY within i, j range => nums[i + 1] to nums[j + 1]
            dp = [[0] * n for _ in range(n)]
            # answer will be at dp[0][n - 1]

            # trivial fills:
            # 2-fers have no answer because there's no middle, but we don't need to do anything there, it's already 0

            # start table fills. ORder matters, we proxy it using length (which goes from 3 to n)
            for length in range(2, n): # remember this is 0-indexed
                rightmost_idx = n - length - 1

                for left in range(rightmost_idx + 1):
                    right = left + length

                    # now we find ussing ks where k is the last pop index:
                    for k in range(left + 1, right):
                        coins = (
                            dp[left][k] # left interval
​                            + dp[k][right] # right interval
​                            + (nums[left] * nums[right] * nums[k]) # this burst
                        )
                        dp[left][right] = max(dp[left][right], coins) # keep max

            return dp[0][n - 1]

LC 1039: Polygon Triangulation

Subsequence Counting & Pattern Matching #

Counting how many ways a pattern appears as (possibly non-contiguous) subsequence

Problems #

Distinct Subsequences (115)

Question analysis, pattern matching

basic analysis:
- We have a source and target string, and we care about subsequences: relative order matters but no need to be contiguous.
- Also it’s a counting problem
- pedestrian approach: do prefix-string counting – 2D DP with dp[i][j] = counts of matches for s[:i] to t[:j], then we look for space optimisations
Patterns / Recognition Signals
- “how many ways” / “count distinct” + subsequence language
- source string, fixed target, relative order preserved
- counting, not optimising
anti-patterns / Anti-signals:
- longest (not count) → LCS family
  - contiguous → substring DP, not subsequence
  - boolean (exists or not) → regex matching family
canonicals class:
- this question is a little asymetric from it’s related ones (10:regex matching and 72 : edit distance) because we go from s to t. We don’t get to skip characters within t and that’s what influences our filling order for the 2D table – and also gives us inspiration to work from right to left (like 1/0 knapsack problem) to avoid overwriting.

pedestrian 2D-dp solution

We can initially think of this as a 2 pointer region and counting problem which would give us a 2D table (dp[i][j] is number of ways to make t[:j] from s[:i] (prefixes)), and then look for space optimisations.

My initial approach shown below has some issues:

layout issue:

the convention is to use rows = source and columns = target ! so my working version had the axes transposed.
this is actually alright – it’s cosmetic but it can cause confusion when we’re narrating it
fix: standardise it to rows = s prefixes and cols = t prefixes. So dp[i][j] is ways to form t[:j] from s[:i], rows go outer loop over s columns go inner loop over t

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
       # 2D-DP, no space optimisation
       class Solution:
           def numDistinct(self, s: str, t: str) -> int:
               n, m = len(s), len(t)

               if n < m: # trivial
                   return 0

               # dp[i][j] shall represent the number of ways to make t[:j] from s[:i]
               dp = [[0] * (n + 1) for _ in range(m + 1)]

               # all the 0 length can be fromed from any prefix of s
               for j in range(n + 1):
                   dp[0][j] = 1

               # for the target:
               for i in range(1, m + 1): # building up the "target window"
                   # for the source
                   for j in range(1, n + 1):
                       if s[j - 1] == t[i - 1]:
                           # on last char match, we take:
                           # A: ways without current char in s (dp[i][j-1])
                           # B: ways using the current matching char (dp[i-1][j-1])
                           dp[i][j] = dp[i][j - 1] +  \
                                      dp[i - 1][ j - 1] # ans to the prev letter
                       else:
                           dp[i][j] = dp[i][j - 1]

               return dp[m][n]

Intuition for the space roll-up and the filling order

NOTE: remember this derivation is done on the transposed version that is confusing (my initial approach). In the future we will stick to rows = source index and cols = target index to make things simpler to reason with

When filling dp[i][j], we only ever reference:

dp[i][j-1] → current row, previous column

dp[i-1][j-1] → previous row, previous column

the recurrence relation is really \[ dp[i][j] = \underbrace{dp[i][j-1]}_{\text{ignore } s[j-1]} + \underbrace{dp[i-1][j-1]}_{\text{use } s[j-1] \text{ to match } t[i-1]} \]

That means:

Column j depends only on column j-1 in this iteration (row i) and from the previous iteration (row i-1).

We can keep just one row and update it in place if we go backwards in i to avoid overwriting dp[i-1] before it’s used.

deriving the recurrence relation from a more traditional 2D table

The explanation above did the transposed version (row idx – target and col – source) which is atypical. Here’s the explanation of how we derive the rolled up 1D array solution a with the correct table orientation (that makes it easier to see).

dp[i][j] = ways to form t[:j] from s[:i]. Consider the last character of s[:i], which is s[i-1]:

Exclude s[i-1] from the subsequence:
- ways = dp[i-1][j]. The target hasn’t changed, we just reduced s.
Include s[i-1] to match t[j-1]: only valid if s[i-1] = t[j-1]=.
- Ways = dp[i-1][j-1]. We’ve matched one more target character using this s character.

Total when characters match: dp[i-1][j] + dp[i-1][j-1]. When characters don’t match: only exclusion is valid: dp[i-1][j].

Notice it reads from dp[i-1][j] and dp[i-1][j-1] — both from the previous row – that’s why we can rollup to 1D DP-array

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
    # space-optimised
    class Solution:
        def numDistinct(self, s, t):
           n, m = len(s), len(t)
           if n < m:
               return 0
           dp = [0] * (m + 1)
           dp[0] = 1           # one way to form empty target from any prefix of s
           for ch in s:        # outer: s characters (rolling over the s-prefix dimension)
               for j in range(m, 0, -1):   # inner: backward over t (0/1 knapsack: each s char used once)
                   if t[j-1] == ch:
                       dp[j] += dp[j-1]   # exclude ch: dp[j] unchanged; include ch: add dp[j-1]
           return dp[m]

NOTE: the link to 0/1 Knapsack Problem is one of the most important learnings here.

Problem Transformations

What if we want the actual subsequences, not just the count?
The DP table gives us counts; reconstruction requires backtracking from dp[n][m] — at each cell, if characters matched, you can go diagonal (dp[i-1][j-1]) or up (dp[i-1][j]); if they didn’t match, only up.
This is analogous to LCS reconstruction.
What if s and t can have duplicates and you want distinct subsequences of s itself (not matching a target)?
LC 940 Number of Distinct Subsequences II — different problem, but same counting mindset. The recurrence becomes about suffix character contributions and requires deduplication.
What breaks with large counts: values can exceed 64-bit integers for long strings.
The problem says return an unsigned 32-bit int, so Python’s arbitrary precision handles it silently. In other languages you’d need modular arithmetic.

Grid DP for Pathfinding and Counting #

Grid traversal, minimum/maximum cost, and counting approaches

Problems #

Unique Paths (62)

This is a classic grid traversal, so the boundaries of the grid give us trivial cases for which we can pre-fill our DP table. The rest of it is just adhering to the rules of traversal that is presented to us. In this case, we just need the number of paths, so we could do the math hack or we could fill thigns in and just take the number of paths. For the DP approach, after we define the naive DP (where dp[i][j] will contain the number of unique ways to get to cell (i, j)), we can observe that the state transition only relies on the cell above and the cell to the left, so we can just keep a 1D array for this and get a whole dimension-optimisation going.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
    # Row-Rolling Space (1D array) optimised DP for 2D DP solution
    class Solution:
        def uniquePaths(self, m: int, n: int) -> int:
            dp = [1] * n  # Initialize for the first row
            for _ in range(1, m):  # For each further row
                for j in range(1, n):
                    dp[j] += dp[j - 1]  # Current cell = left + above (dp[j] is above, dp[j-1] is left)
            return dp[-1]

    # M3: math hack:
    import math

    class Solution:
        def uniquePaths(self, m: int, n: int) -> int:
            return math.comb(m + n - 2, m - 1)

The build up is necessary that’s why we iterate j from left to right direction. Also, the grid can be viewed as a directed acyclic graph (DAG) where each cell points to its right and down neighbors. The number of unique paths is then the count of paths from the top-left node to the bottom-right node. The DP solution is effectively counting paths in this DAG, which confirms the validity of your approach.

The math hack is because \[ \text{Number of Paths} = \binom{m+n-2}{m-1} = \binom{m+n-2}{n-1} \]

Minimum Path Sum (64)

Advanced Grid DP with Obstacles/State #

Variations on grid DP with additional state (obstacles)

Problems #

Knapsack Variants (2D and above) #

Multi-resource or multi-choice formations (classic, bounded, unbounded)

we can summarise the family of knapsack problems like so:

Variant	Unlimited Supply?	DP Update Pattern	Example Problem
1. 0/1 Knapsack	Only once	for each `num` in `nums`, for `t` from target to num: `dp[t]	= dp[t-num]`	Subset sum; Partition problem (416)
2. Unbounded Knapsack	Infinite times	for each `num`, for `t` from num to target: `dp[t]	= dp[t-num]`	Coin change infinite supply
3. Bounded Knapsack	Up to limited times	“Decompose” item frequency with binary trick or bounded DP	If each num is present with count
4. Classic Value Knapsack	With values and weights	Use dp table `dp[i][w] = max value`	Standard 0/1/Unbounded knapsack

TODO Problems #

TODO Coin Change II (518) :: 2D in naive implementation

The key idea here is to solve an unbounded variant of the knapsack problem (unbounded because each coin (choice) can be used as many times as desired). The main part of this problem is to realise that it is indeed a 2D or whatever implementation, however, we can easily roll it up into a 1D array, we attempt to use each coin and update the partial sums we can create and in that way each coin can be used as many times as possible. dp[i] is the number of ways to make up that amount = i. Also realise that this a combination counting and we need to absolutely avoid double counting (and caring about permutations) that’s why the outer loop is on coins.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
   class Solution:
       def change(self, amount: int, coins: List[int]) -> int:
           dp = [0] * (amount + 1)

           # trivial base: 1 way to have empty combi
           dp[0] = 1

           # iterates through each coin denomination. This ordering ensures combinations are counted without double counting combinations (because they are permuted differently) (crucial for problems counting combinations).

           for coin in coins:
               # use the same denom as many times as possible:
               for amt in range(coin, amount + 1):
                   dp[amt] += dp[amt - coin]

           return dp[amount]

by the way, this is how we can see the non-flattened 2D approach to connect the dots on how to flatten it into 1D:

2D definition:
dp[i][j] = number of ways to make up amount j using the first i coins (i.e., coins ) (slice)
This is the traditional “table” version before you flatten.
So, base cases:
For any i, dp[i][0] = 1 (There is exactly one way to make amount 0: take nothing.)
For dp[0][j] = 0 when j > 0 (With 0 coins, can’t form any positive amount.)
So our state transitions look like this:
dp[i][j]=dp[i−1][j]+dp[i][j−coins[i−1]] if j≥coins[i−1]
dp[i][j]=dp[i−1][j] if j<coins[i−1]
which gives the following 2D version of the code
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
   n = len(coins)
   dp = [[0] * (amount + 1) for _ in range(n + 1)]
   for i in range(n + 1):
       dp[i][0] = 1

   for i in range(1, n + 1):          # using first i coins
       for j in range(amount + 1):     # for each target amount
           if j >= coins[i-1]:
               dp[i][j] = dp[i-1][j] + dp[i][j - coins[i-1]]
           else:
               dp[i][j] = dp[i-1][j]
and our answer would be at dp[n][target]
In the 1D flattening:
We roll up the i dimension (coins) into an update rule by ensuring we process coins one at a time and, for each coin, accumulate number of ways for all amounts in order.
The key is: for each coin, update dp[amt] += dp[amt - coin] iteratively for all amt ≥ coin.
That’s why we get:
1
2
3
4
5
   dp = [0] * (amount + 1)
   dp[0] = 1
   for coin in coins: # this is the "row" progression (when compared to the 2D version)
       for amt in range(coin, amount + 1):
           dp[amt] += dp[amt - coin]
The outer loop on coins is equivalent to the 2D DP’s row progression—each new coin can only build on results found so far, not reorder the combinations.

DP on Trees/Graphs #

DP that uses tree or graph structure where subproblem dependencies are on parent/child/descendants

TODO Problems #

TODO Number of Ways to Paint n × 3 Grid (1411)

Subsequence Matching & Counting with Constraints #

Count number of subsequences/paths in grid/sequence under rules

Problems #

Number of Distinct Subsequences II (940)
Tilings/partitionings (e.g., domino tiling, not covered directly in LC standard set)

Multi-dimensional/Space Reduction DP #

DP with flattening tricks, rolling arrays, or higher dimensions for optimal space

Problems #

Regular Expression Matching (10) :: rolling row/column

SOE #

Misaligned indices between strings or grids
- when doing prefix string matching between strings or whatever, we need to be careful about what our dp-table index means. To keep things simple, follow pythonic styles for slicing: s[:i] is the string slice, i could be the index within the dp-table – the corresponding string index that we’re referring to will be i - 1 though.
Incomplete or missing base row/column initialization
When considering questions like FSM-like questions, it’s important to know that not every characteristic will be explicitly stated. Therefore we should think about what rules (explicit and implicit) govern our system.
Consider the following from the “Best Time to Buy and Sell Stock with Cooldown (309)”
careful: the question might give some rules, but there’s a need to outline more than just that
in this case, the third rule is implicit but important for us:
1. You may complete as many transactions as you like (buy one and sell one share multiple times).
2. After you sell a stock, you cannot buy stock on the next day (cooldown one day).
3. You cannot hold more than one stock at a time (must sell before buying again).
Overlaps or double counting in subsequence problems
Subsequence == the relative order of the characters matter even if it’s NOT contiguous.
Subsequence of a string is a new string generated from the original string with some characters (can be none) deleted without changing the relative order of the remaining characters.

Decision Flow #

Comparing sequences? → 2D DP with state (i,j)
String transformation? → Edit distance DP
Grid traversal with constraints? → Grid DP

Style, Tricks and Boilerplate #

Style #

It seems that most of the time, contiguous sequences type questions, we can just attempt to think of prefix-collections of things. We see it here in the case of interleaving strings and a bunch of others as well.

I notice that when we do variable / array rollups for the space optimisations, many a time, to preserve the previous values, we need to do a snapshot. I’ll give some examples:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
  # this is from  problem 309 stock family with cooldown
  class Solution:
      def maxProfit(self, prices: List[int]) -> int:
          if not prices:
              return 0

          n = len(prices)

          # init 3-state dp:
          hold, sold, rest = -prices[0], 0, 0

          for i in range(1, n):
              # snapshot the old vals:
              p_hold, p_sold, p_rest = hold, sold, rest

              hold = max(
                  p_hold, # was holding yesterday also:
                  p_rest - prices[i] # cooldown ytd, can buy today
              )
              sold = p_hold + prices[i] # held until ytd, sold today
              rest = max(
                  p_rest, # rested ytd also
                  p_sold, # sold ytd, rested today
              )

          return max(
              sold, # sold on the last day
              rest # rested on the last day
          )

  # this is from problem 1143 ( longest common subsequence )
  class Solution:
      def longestCommonSubsequence(self, text1: str, text2: str) -> int:
          m, n = len(text1), len(text2)
          dp = [0] * (n + 1)  # dp[j] = LCS length for text1 prefix up to i and text2 prefix up to j

          for i in range(1, m + 1):
              # snapshot
              prev = 0  # This will hold dp[j-1] from previous iteration (the previous diagonal)
              for j in range(1, n + 1):
                  temp = dp[j]  # Save current dp[j] which will become prev in next iteration
                  # case 1: the two are the same characters => both of them should be in
                  if text1[i - 1] == text2[j - 1]:
                      dp[j] = prev + 1
                  # case 2: the two are different, we either include this or exclude this, whichever is better
                  else:
                      dp[j] = max(dp[j], dp[j - 1])
                  prev = temp  # Update prev for next j
          return dp[n]

Remember that for counting problems, we can MOD things without losing information whenever we detect and overflow:

just a note about the module and why it works: it’s because addition and multiplication commute with taking modulo; there’s no loss of information

  The reason you can keep taking modulo at every step (e.g., in `count_end_with_0 = (count_end_with_0 + count_end_with_1) % MOD`) without losing any information is due to a fundamental property of modular arithmetic: **addition and multiplication commute with taking modulo**.[1][2][3]

  ## Why Partial Sums Modulo Work

  For any modulus $$ m $$:

  - $$ (a + b) \bmod m = ((a \bmod m) + (b \bmod m)) \bmod m $$
  - $$ (a \times b) \bmod m = ((a \bmod m) \times (b \bmod m)) \bmod m $$

  This means that no matter how large the intermediate sum or product gets, *as long as you take modulo at each addition or multiplication, the final result modulo $$m$$ will be correct* — you never impact the remainder.[2][1]

  **Intuition:**
  Suppose you are adding up (potentially huge) numbers. All you care about is the remainder (modulo some value). Any excess above multiples of that modulus will always "wrap around" the modulus, so keeping only the current remainder is sufficient.[3][1][2]

  ### Application to Your Code

  The subsequence counting logic may generate astronomical numbers — but storing and updating only the current value modulo $$10^9 + 7$$ is enough, because modulo arithmetic guarantees correctness of the result's remainder. When you sum at the end, as long as each intermediate value was modulo'd, the final sum modulo $$10^9 + 7$$ matches what an exact (but potentially massive) calculation would yield.[1][2][3]

  **No information about the answer modulo $$m$$ is lost by discarding higher-order multiples of $$m$$ at each step.**

  This is what makes modulo arithmetic so robust for "counting" DP methods and combinatorial questions in programming contests and number theory.[2][3][1]

  [1](https://www2.math.upenn.edu/~mlazar/math170/notes06.pdf)
  [2](https://brilliant.org/wiki/modular-arithmetic/)
  [3](https://en.wikipedia.org/wiki/Modular_arithmetic)
  [4](https://www.youtube.com/watch?v=gJtw0c3Lo6E)
  [5](https://www.geeksforgeeks.org/engineering-mathematics/modular-arithmetic/)
  [6](https://davidaltizio.web.illinois.edu/ModularArithmetic.pdf)
  [7](https://calcworkshop.com/number-theory/modular-arithmetic/)
  [8](https://www.math.purdue.edu/~arapura/algebra/algebra8.pdf)
  [9](https://sites.math.rutgers.edu/~zeilberg/essays683/renault.html)