Delete Operation for Two Strings (Leetcode 583)

Problem Description

Given two strings word1 and word2, return the minimum number of steps required to make word1 and word2 the same.

In one step, you can delete exactly one character in either string.

Examples

Input: word1 = "sea", word2 = "eat"
Output: 2
Explanation: You need one step to make "sea" to "ea" and another step to make "eat" to "ea".
Input: word1 = "leetcode", word2 = "etco"
Output: 4

Constraints

  • 1 <= word1.length, word2.length <= 500
  • word1 and word2 consist of only lowercase English letters.

Solution

The problem might sound familiar to classic dynamic programming questions however we should focus that we can only delete the words. If we were allowed to update / insert / delete the characters then it would have become the Edit Distance Problem using Dynamic Programming. Because we are only allowed using the delete operation, the problem translates to finding the number of characters that are different in both the strings (or finding the common ones?). If We can easily find the common subsequence characters and subtract double the common length from the combined length, it would give us the letters we want to delete. We can use the Longest Common Subsequence for that and we will briefly discuss about the algorithm and I would put more references in case you want to learn more about the algorithm.

Longest Common subsequence

To determine the length of the LCS between all combinations of substrings of a and b, we can use a dynamic programming technique. Dynamic programming basically means that you compute all possibilities and store them inside a look-up table.

To find the lengths of all possible subsequences, we use a helper function, lcsLength(_:). This creates a matrix of size (n+1) by (m+1), where matrix[x][y] is the length of the LCS between the substrings a[0...x-1] and b[0...y-1].

Lets take an example for "ABCBX" and "ABDCAB". The following would look something like this:

|   | Ø | A | B | D | C | A | B |
| Ø | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| A | 0 | 1 | 1 | 1 | 1 | 1 | 1 |  
| B | 0 | 1 | 2 | 2 | 2 | 2 | 2 |
| C | 0 | 1 | 2 | 2 | 3 | 3 | 3 |
| B | 0 | 1 | 2 | 2 | 3 | 3 | 4 |
| X | 0 | 1 | 2 | 2 | 3 | 3 | 4 |

Note that the elements first row and the first column are 0 since we are comparing with the empty string. In this example, if we look at matrix[3][4] we find the value 3. This means the length of the LCS between a[0...2] and b[0...3], or between "ABC" and "ABDC", is 3.

Code

Swift

class Solution {
func minDistance(_ word1: String, _ word2: String) -> Int {
let longestCommonLength = lcsLength(word1, word2)
return word1.count + word2.count 2 * longestCommonLength
}
func lcsLength(_ s1: String,_ s2:String) -> Int {
var matrix = [[Int]](repeating: [Int](repeating: 0, count: s2.count+1), count: s1.count+1)
for (i, char1) in s1.enumerated() {
for (j, char2) in s2.enumerated() {
if char1 == char2 {
// Common char found, add 1 to highest lcs found so far.
matrix[i+1][j+1] = matrix[i][j] + 1
} else {
// Not a match, propagates highest lcs length found so far.
matrix[i+1][j+1] = max(matrix[i][j+1], matrix[i+1][j])
}
}
}
return matrix[s1.count][s2.count]
}
}
view raw min_distance.swift hosted with ❤ by GitHub

Python

class Solution:
def lcs(self,X , Y):
m = len(X)
n = len(Y)
L = [[None]*(n+1) for i in range(m+1)]
for i in range(m+1):
for j in range(n+1):
if i == 0 or j == 0 :
L[i][j] = 0
elif X[i1] == Y[j1]:
L[i][j] = L[i1][j1]+1
else:
L[i][j] = max(L[i1][j] , L[i][j1])
# L[m][n] contains the length of LCS of X[0..n-1] & Y[0..m-1]
return L[m][n]
def minDistance(self, word1: str, word2: str) -> int:
minCommonLength = self.lcs(word1,word2)
return len(word1) + len(word2) 2 * minCommonLength
view raw min_distance.py hosted with ❤ by GitHub

Java

class Solution {
int lcs( char[] X, char[] Y, int m, int n )
{
int L[][] = new int[m+1][n+1];
/* Following steps build L[m+1][n+1] in bottom up fashion. Note
that L[i][j] contains length of LCS of X[0..i-1] and Y[0..j-1] */
for (int i=0; i<=m; i++)
{
for (int j=0; j<=n; j++)
{
if (i == 0 || j == 0)
L[i][j] = 0;
else if (X[i1] == Y[j1])
L[i][j] = L[i1][j1] + 1;
else
L[i][j] = max(L[i1][j], L[i][j1]);
}
}
return L[m][n];
}
int max(int a, int b)
{
return (a > b)? a : b;
}
public int minDistance(String s1, String s2) {
char[] X=s1.toCharArray();
char[] Y=s2.toCharArray();
int m = X.length;
int n = Y.length;
int lcsLength = lcs(X,Y,m, n);
return m + n 2 *lcsLength;
}
}

Complexity Analysis

The time and space complexity would remain same as the LCS operation. We are using two nested loops to compute the DP array hence m * n operations where m, n are the length of the input strings. Space complexity would also be same for saving 2D matrix.

Time = O(M * N)

Space = O(M * N)

Links

Edit Distance | Wiki

Dynamic Programming | Wiki

Edit Distance using Dynamic Programming | GeeksForGeeks

Longest Common subsequence | GeeksForGeeks

Leave a Reply

Your email address will not be published. Required fields are marked *