喜刷刷: [LeetCode] Regular Expression Matching

Saturday, November 29, 2014

[LeetCode] Regular Expression Matching

Implement regular expression matching with support for '.' and '*'.

'.' Matches any single character.
'*' Matches zero or more of the preceding element.

The matching should cover the entire input string (not partial).

The function prototype should be:
bool isMatch(const char *s, const char *p)

Some examples:
isMatch("aa","a") → false
isMatch("aa","aa") → true
isMatch("aaa","aa") → false
isMatch("aa", "a*") → true
isMatch("aa", ".*") → true
isMatch("ab", ".*") → true
isMatch("aab", "c*a*b") → true

思路1: DP

关键在于如何处理这个'*'号。

状态：和Mininum Edit Distance这类题目一样。

dp[i][j]表示s[0:i-1]是否能和p[0:j-1]匹配。

递推公式：由于只有p中会含有regular expression，所以以p[j-1]来进行分类。

p[j-1] != '.' && p[j-1] != '*'：dp[i][j] = dp[i-1][j-1] && (s[i-1] == p[j-1])

p[j-1] == '.'：dp[i][j] = dp[i-1][j-1]

而关键的难点在于 p[j-1] = '*'。由于星号可以匹配0，1，乃至多个p[j-2]。

1. 匹配0个元素，即消去p[j-2]，此时p[0: j-1] = p[0: j-3]

dp[i][j] = dp[i][j-2]

2. 匹配1个元素，此时p[0: j-1] = p[0: j-2]

dp[i][j] = dp[i][j-1]

3. 匹配多个元素，此时p[0: j-1] = { p[0: j-2], p[j-2], ... , p[j-2] }

dp[i][j] = dp[i-1][j] && (p[j-2]=='.' || s[i-2]==p[j-2])

class Solution {
public:
    bool isMatch(const char *s, const char *p) {
        int m = strlen(s), n = strlen(p);
        vector<vector<bool>> dp(m+1, vector<bool>(n+1,false));
        dp[0][0] = true;
        
        for(int i=0; i<=m; i++) {
            for(int j=1; j<=n; j++) {
                if(p[j-1]!='.' && p[j-1]!='*') {
                    if(i>0 && s[i-1]==p[j-1] && dp[i-1][j-1])
                        dp[i][j] = true;
                }
                else if(p[j-1]=='.') {
                    if(i>0 && dp[i-1][j-1])
                        dp[i][j] = true;
                }
                else if(j>1) {  //'*' cannot be the 1st element
                    if(dp[i][j-1] || dp[i][j-2])  // match 0 or 1 preceding element
                        dp[i][j] = true;
                    else if(i>0 && (p[j-2]==s[i-1] || p[j-2]=='.') && dp[i-1][j]) // match multiple preceding elements
                        dp[i][j] = true;
                }
            }
        }
        return dp[m][n];
    }
};

思路2: 双指针扫描

LeetCode作者给的解法，非常巧妙：

http://leetcode.com/2011/09/regular-expression-matching.html

8 comments:

UnknownJuly 4, 2015 at 1:11 AM
楼主您好！不太理解星号多匹配时的状态方程，您能仔细说说吗？谢谢！
ReplyDelete
Replies
UnknownFebruary 27, 2016 at 1:13 PM
第三点 “3. 匹配多个元素” 有错误。“s[i-2]==p[j-2]” 应该是s[i-1]==p[j-2]。是s当前的字符等于p当前(就是*号)的前面一个字符
ReplyDelete
Replies
UnknownDecember 9, 2016 at 6:48 PM
This comment has been removed by the author.
ReplyDelete
Replies
UnknownJuly 18, 2017 at 1:01 PM
其实匹配多个元素的转换方程里面包含了匹配1个元素的情况，不需要分开处理。如果当前p[j-1]是‘*’，我们可以先将dp[i][j]的值设为dp[i][j-2]然后去判断是否可以匹配一个或者多个。因为匹配0个或者多个（包括1个）是一个或的关系。
ReplyDelete
Replies
UnknownJanuary 8, 2018 at 10:52 PM
"aa""aa" 跑不过
ReplyDelete
Replies
UnknownMay 23, 2018 at 9:32 PM
我觉得递归比DP的方法更难理解一点
ReplyDelete
Replies

Add comment