Implement regular expression matching with support for
'.'
and '*'
.'.' Matches any single character.
'*' Matches zero or more of the preceding element.
The matching should cover the entire input string (not partial).
The function prototype should be:
bool isMatch(const char *s, const char *p)
Some examples:
isMatch("aa","a") → false
isMatch("aa","aa") → true
isMatch("aaa","aa") → false
isMatch("aa", "a*") → true
isMatch("aa", ".*") → true
isMatch("ab", ".*") → true
isMatch("aab", "c*a*b") → true
思路1: DP
关键在于如何处理这个'*'号。
状态:和Mininum Edit Distance这类题目一样。
dp[i][j]表示s[0:i-1]是否能和p[0:j-1]匹配。
递推公式:由于只有p中会含有regular expression,所以以p[j-1]来进行分类。
p[j-1] != '.' && p[j-1] != '*':dp[i][j] = dp[i-1][j-1] && (s[i-1] == p[j-1])
p[j-1] == '.':dp[i][j] = dp[i-1][j-1]
而关键的难点在于 p[j-1] = '*'。由于星号可以匹配0,1,乃至多个p[j-2]。
1. 匹配0个元素,即消去p[j-2],此时p[0: j-1] = p[0: j-3]
dp[i][j] = dp[i][j-2]
2. 匹配1个元素,此时p[0: j-1] = p[0: j-2]
dp[i][j] = dp[i][j-1]
3. 匹配多个元素,此时p[0: j-1] = { p[0: j-2], p[j-2], ... , p[j-2] }
dp[i][j] = dp[i-1][j] && (p[j-2]=='.' || s[i-2]==p[j-2])
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 | class Solution { public: bool isMatch(const char *s, const char *p) { int m = strlen(s), n = strlen(p); vector<vector<bool>> dp(m+1, vector<bool>(n+1,false)); dp[0][0] = true; for(int i=0; i<=m; i++) { for(int j=1; j<=n; j++) { if(p[j-1]!='.' && p[j-1]!='*') { if(i>0 && s[i-1]==p[j-1] && dp[i-1][j-1]) dp[i][j] = true; } else if(p[j-1]=='.') { if(i>0 && dp[i-1][j-1]) dp[i][j] = true; } else if(j>1) { //'*' cannot be the 1st element if(dp[i][j-1] || dp[i][j-2]) // match 0 or 1 preceding element dp[i][j] = true; else if(i>0 && (p[j-2]==s[i-1] || p[j-2]=='.') && dp[i-1][j]) // match multiple preceding elements dp[i][j] = true; } } } return dp[m][n]; } }; |
LeetCode作者给的解法,非常巧妙:
http://leetcode.com/2011/09/regular-expression-matching.html
http://leetcode.com/2011/09/regular-expression-matching.html
楼主您好!不太理解星号多匹配时的状态方程,您能仔细说说吗?谢谢!
ReplyDelete第三点 “3. 匹配多个元素” 有错误。“s[i-2]==p[j-2]” 应该是s[i-1]==p[j-2]。 是s当前的字符等于p当前(就是*号)的前面一个字符
ReplyDeleteThis is right
DeleteI think `if(dp[i][j-1] || dp[i][j-2])` is wrong, it should be `if(dp[i][j-2])`, which means p[j-1]p[j-2] is treated as ''.
DeleteThis comment has been removed by the author.
ReplyDelete其实匹配多个元素的转换方程里面包含了匹配1个元素的情况,不需要分开处理。如果当前p[j-1]是‘*’,我们可以先将dp[i][j]的值设为dp[i][j-2]然后去判断是否可以匹配一个或者多个。因为匹配0个或者多个(包括1个)是一个或的关系。
ReplyDelete"aa""aa" 跑不过
ReplyDelete我觉得递归比DP的方法更难理解一点
ReplyDelete