喜刷刷: [LeetCode] Word Break I, II

Sunday, November 23, 2014

[LeetCode] Word Break I, II

Word Break I

Given a string s and a dictionary of words dict, determine if s can be segmented into a space-separated sequence of one or more dictionary words.

For example, given
s = "leetcode",
dict = ["leet", "code"].

Return true because "leetcode" can be segmented as "leet code".

Word Break II

Given a string s and a dictionary of words dict, add spaces in s to construct a sentence where each word is a valid dictionary word.

Return all such possible sentences.

For example, given
s = "catsanddog",
dict = ["cat", "cats", "and", "sand", "dog"].

A solution is ["cats and dog", "cat sand dog"].

思路：Word Break I

看到这题第一反应是DFS枚举查找，直到“探”到string尾部则算成功。但题目并不要求给出是如何break的，而只要判断是否能break。对这类判断“是”与“否”的可以用DFS暴力解决的题，可以尝试用DP做book keeping中间判断状态，避免DFS的反复搜索。比如这题，可以用一个数组dp记录S[0:i]是否能break。

dp[i]：S[0:i-1]是否能被break。例如dp[1]表示s[0]是否能被break。

dp[0] = true

dp[i] = true if and only if:

1. 存在一个i-1>=k>=0，使s[k:i-1]存在于dict中。

2. s[0: k-1]可以被break，即dp[k] = true。

class Solution {
public:
    bool wordBreak(string s, unordered_set<string> &dict) {
        vector<bool> dp(s.size()+1,false);
        dp[0] = true;
        for(int i=0; i<s.size(); i++) {
            for(int j=i; j>=0; j--) {
                if(dp[j] && dict.count(s.substr(j,i-j+1))!=0) {
                    dp[i+1] = true;
                    break;
                }
            }
        }
        return dp[s.size()];
    }
};

思路：Word Break II

要求所有word break的可能，自然想到的是用DFS来做。但是发现居然过不了大集合，代码如下：

class Solution {
public:
    vector<string> wordBreak(string s, unordered_set<string> &dict) {
        vector<string> sentences;
        vector<string> sol;
        findWordBreak(s, dict, 0, sol, sentences);
        return sentences;
    }
    
    void findWordBreak(string &s, unordered_set<string> &dict, int start, vector<string> &sol, vector<string> &sentences) {
        if(start==s.size() && !sol.empty()) {   // find word break
            string temp = sol[0];
            for(int i=1; i<sol.size(); i++) 
                temp.append(" "+sol[i]);
            sentences.push_back(temp);
        }
        
        string word;
        for(int i=start; i<s.size(); i++) {
            word.push_back(s[i]);
            if(dict.count(word)!=0) {
                sol.push_back(word);
                findWordBreak(s, dict, i+1, sol, sentences);
                sol.pop_back();
            }
        }
    }
};

分析：

要求所有可能的word break的解集，DFS是跑不了的。所以问题的关键在于如何剪枝。

5 comments:

UnknownMay 25, 2015 at 5:55 PM
如果明确的附加条件每个词只能用一次，那就比较麻烦了，似乎DP就不容易弄了。
ReplyDelete
Replies
Brian PinJune 14, 2016 at 2:39 PM
用BFS, 同時從source 和 target 起找
ReplyDelete
Replies
Hao WuOctober 17, 2016 at 9:31 AM
Word Break II 还是可以用DP，要一个vector >把对应的DP[i]的substrings存下来。
worst case的output就是2^n，大集合并不是worst case。可以先过一遍word break I，通过再word break II。
ReplyDelete
Replies

Add comment