30. Substring with Concatenation of All Words (Hard)

You are given a string, s, and a list of words, words, that are all of the same length. Find all starting indices of substring(s) in s that is a concatenation of each word in words exactly once and without any intervening characters.

For example, given: s: "barfoothefoobarman" words: ["foo", "bar"]

You should return the indices: [0,9]. (order does not matter).

Solution 1: Hash Table 195ms

这道题让我们求串联所有单词的子串，就是说给定一个长字符串，再给定几个长度相同的单词，让我们找出串联给定所有单词的子串的起始位置，还是蛮有难度的一道题。这道题我们需要用到两个哈希表，第一个哈希表先把所有的单词存进去，然后从开头开始一个个遍历，停止条件为当剩余字符个数小于单词集里所有字符的长度。这时候我们需要定义第二个哈希表，然后每次找出给定单词长度的子串，看其是否在第一个哈希表里，如果没有，则break，如果有，则加入第二个哈希表，但相同的词只能出现一次，如果多了，也break。如果正好匹配完给定单词集里所有的单词，则把i存入结果中，具体参见代码如下：

class Solution {
public:
    vector<int> findSubstring(string s, vector<string>& words) {
        unordered_map<string, int> m1;
        for (auto& w: words) ++m1[w];

        int n = words.size(), m = words[0].size();
        vector<int> res;
        cout << "hi\n";
        for (int i = 0; i <= (int)s.size()-m*n; ++i) {
            unordered_map<string, int> m2;
            int j = 0;
            for (j = 0; j < n; ++j) {
                string key = s.substr(i+j*m, m);
                if (m1.count(key) == 0) break;
                ++m2[key];
                if (m2[key] > m1[key]) break;
            }
            if (j == n) res.push_back(i);
        }
        return res;
    }
};

Solution 2: Hash Table, Two Pointers 32ms

Use sliding window Time Complexity: $$O(n)$$

// travel all the words combinations to maintain a window // there are wl(word len) times travel // each time, n/wl words, mostly 2 times travel for each word // one left side of the window, the other right side of the window // so, time complexity O(wl 2 N) = O(2N*wl)

这道题还有一种O(n)时间复杂度的解法，设计思路非常巧妙，但是感觉很难想出来，博主目测还未到达这种水平。这种方法不再是一个字符一个字符的遍历，而是一个词一个词的遍历，比如根据题目中的例子，字符串s的长度n为18，words数组中有两个单词(cnt=2)，每个单词的长度len均为3，那么遍历的顺序为0，3，6，8，12，15，然后偏移一个字符1，4，7，9，13，16，然后再偏移一个字符2，5，8，10，14，17，这样就可以把所有情况都遍历到，我们还是先用一个哈希表m1来记录words里的所有词，然后我们从0开始遍历，用left来记录左边界的位置，count表示当前已经匹配的单词的个数。然后我们一个单词一个单词的遍历，如果当前遍历的到的单词t在m1中存在，那么我们将其加入另一个哈希表m2中，如果在m2中个数小于等于m1中的个数，那么我们count自增1，如果大于了，那么需要做一些处理，比如下面这种情况, s = barfoofoo, words = {bar, foo, abc}, 我们给words中新加了一个abc，目的是为了遍历到barfoo不会停止，那么当遍历到第二foo的时候, m2[foo]=2, 而此时m1[foo]=1，这是后已经不连续了，所以我们要移动左边界left的位置，我们先把第一个词t1=bar取出来，然后将m2[t1]自减1，如果此时m2[t1]<m1[t1]了，说明一个匹配没了，那么对应的count也要自减1，然后左边界加上个len，这样就可以了。如果某个时刻count和cnt相等了，说明我们成功匹配了一个位置，那么将当前左边界left存入结果res中，此时去掉最左边的一个词，同时count自减1，左边界右移len，继续匹配。如果我们匹配到一个不在m1中的词，那么说明跟前面已经断开了，我们重置m2，count为0，左边界left移到j+len，参见代码如下：

class Solution {
public:
    vector<int> findSubstring(string s, vector<string>& words) {
        unordered_map<string,int> m;
        for (auto& w: words) ++m[w];

        int len = words[0].size(), n = s.size();
        vector<int> res;

        for (int i = 0; i < len; ++i) {
            int left = i, count = words.size();
            unordered_map<string, int> m2;
            for (int right = i; right < n; right += len) {
                string key = s.substr(right, len);
                if (m.count(key)) {
                    ++m2[key];
                    if (m2[key] <= m[key]) {
                        --count;
                    } else {
                        while (m2[key] > m[key]) {
                            string word = s.substr(left, len);
                            --m2[word];
                            if (m2[word] < m[word]) ++count;
                            left += len;
                        }
                    }

                    if (count == 0) {
                        res.push_back(left);
                        --m2[s.substr(left,len)];
                        ++count;
                        left += len;
                    }
                } else {
                    m2.clear();
                    count = words.size();
                    left = right+len;
                }
                //cout << left << " " << right << endl;
            }
        }
        return res;
    }
};

30. Substring with Concatenation of All Words (Hard)

30. Substring with Concatenation of All Words (Hard)

results matching ""

No results matching ""