快捷導(dǎo)航

C++實現(xiàn)LeetCode(692.前K個高頻詞)

更新時間：2021年08月09日 15:32:53 作者：Grandyang

這篇文章主要介紹了C++實現(xiàn)LeetCode(692.前K個高頻詞),本篇文章通過簡要的案例,講解了該項技術(shù)的了解與使用,以下就是詳細內(nèi)容,需要的朋友可以參考下

[LeetCode] 692.Top K Frequent Words 前K個高頻詞

Given a non-empty list of words, return the k most frequent elements.

Your answer should be sorted by frequency from highest to lowest. If two words have the same frequency, then the word with the lower alphabetical order comes first.

Example 1:

Input: ["i", "love", "leetcode", "i", "love", "coding"], k = 2
Output: ["i", "love"]
Explanation: "i" and "love" are the two most frequent words.
Note that "i" comes before "love" due to a lower alphabetical order.

Example 2:

Input: ["the", "day", "is", "sunny", "the", "the", "the", "sunny", "is", "is"], k = 4
Output: ["the", "is", "sunny", "day"]
Explanation: "the", "is", "sunny" and "day" are the four most frequent words,
with the number of occurrence being 4, 3, 2 and 1 respectively.

Note:

You may assume k is always valid, 1 ≤ k ≤ number of unique elements.
Input words contain only lowercase letters.

Follow up:

Try to solve it in O(n log k) time and O(n) extra space.
Can you solve it in O(n) time with only O(k) extra space?

這道題讓我們求前K個高頻詞，跟之前那道題 Top K Frequent Elements 極其類似，換了個數(shù)據(jù)類型就又是一道新題。唯一的不同就是之前那道題對于出現(xiàn)頻率相同的數(shù)字，沒有順序要求。而這道題對于出現(xiàn)頻率相同的單詞，需要按照字母順序來排。但是解法都一樣，還是用最小堆和桶排序的方法。首先來看最小堆的方法，思路是先建立每個單詞和其出現(xiàn)次數(shù)之間的映射，然后把單詞和頻率的pair放進最小堆，如果沒有相同頻率的單詞排序要求，我們完全可以讓頻率當作pair的第一項，這樣priority_queue默認是以pair的第一項為key進行從大到小的排序，而當?shù)谝豁椣嗟葧r，又會以第二項由大到小進行排序，這樣第一項的排序方式就與題目要求的相同頻率的單詞要按字母順序排列不相符，當然我們可以在存入結(jié)果res時對相同頻率的詞進行重新排序處理，也可以對priority_queue的排序機制進行自定義，這里我們采用第二種方法，我們自定義排序機制，我們讓a.second > b.second，讓小頻率的詞在第一位，然后當a.second == b.second時，我們讓a.first < b.first，這是讓字母順序大的排在前面（這里博主需要強調(diào)一點的是，priority_queue的排序機制的寫法和vector的sort的排序機制的寫法正好順序相反，同樣的寫法，用在sort里面就是頻率小的在前面，不信的話可以自己試一下）。定義好最小堆后，我們首先統(tǒng)計單詞的出現(xiàn)頻率，然后組成pair排序最小堆之中，我們只保存k個pair，超過了就把隊首的pair移除隊列，最后我們把單詞放入結(jié)果res中即可，參見代碼如下：

解法一：

class Solution {
public:
    vector<string> topKFrequent(vector<string>& words, int k) {
        vector<string> res(k);
        unordered_map<string, int> freq;
        auto cmp = [](pair<string, int>& a, pair<string, int>& b) {
            return a.second > b.second || (a.second == b.second && a.first < b.first);
        };
        priority_queue<pair<string, int>, vector<pair<string, int>>, decltype(cmp) > q(cmp);
        for (auto word : words) ++freq[word];
        for (auto f : freq) {
            q.push(f);
            if (q.size() > k) q.pop();
        }
        for (int i = res.size() - 1; i >= 0; --i) {
            res[i] = q.top().first; q.pop();
        }
        return res;
    }
};

下面這種解法還是一種堆排序的思路，這里我們用map，來建立次數(shù)和出現(xiàn)該次數(shù)所有單詞的集合set之間的映射，這里也利用了set能自動排序的特性，當然我們還是需要首先建立每個單詞和其出現(xiàn)次數(shù)的映射，然后將其組成pair放入map種，map是從小到大排序的，這樣我們從最后面取pair，就是次數(shù)最大的，每次取出一層中所有的單詞，如果此時的k大于該層的單詞個數(shù)，就將整層的單詞加入結(jié)果res中，否則就取前K個就行了，取完要更更新K值，如果K小于等于0了，就break掉，返回結(jié)果res即可，參見代碼如下：

解法二：

class Solution {
public:
    vector<string> topKFrequent(vector<string>& words, int k) {
        vector<string> res;
        unordered_map<string, int> freq;
        map<int, set<string>> m;
        for (string word : words) ++freq[word];
        for (auto a : freq) {
            m[a.second].insert(a.first);
        }
        for (auto it = m.rbegin(); it != m.rend(); ++it) {
            if (k <= 0) break;
            auto t = it->second;
            vector<string> v(t.begin(), t.end());
            if (k >= t.size()) {
                res.insert(res.end(), v.begin(), v.end());
            } else {
                res.insert(res.end(), v.begin(), v.begin() + k);
            }
            k -= t.size();
        }
        return res;
    }
};

下面這種解法是一種桶排序的思路，我們根據(jù)出現(xiàn)次數(shù)建立多個bucket，桶的個數(shù)不會超過單詞的個數(shù)，在每個桶中，我們對單詞按字符順序進行排序。我們可以用個數(shù)組來表示桶，每一層中放一個集合，利用set的自動排序的功能，使其能按字母順序排列。我們還是需要首先建立每個單詞和其出現(xiàn)次數(shù)的映射，然后將其組成pair放入map種，map是從小到大排序的，這樣我們倒序遍歷所有的桶，這樣取pair，就是次數(shù)最大的，每次取出一層中所有的單詞，如果此時的k大于該層的單詞個數(shù)，就將整層的單詞加入結(jié)果res中，否則就取前K個就行了，取完要更更新K值，如果K小于等于0了，就break掉，返回結(jié)果res即可，參見代碼如下：

解法三：

class Solution {
public:
    vector<string> topKFrequent(vector<string>& words, int k) {
        vector<string> res;
        unordered_map<string, int> freq;
        vector<set<string>> v(words.size() + 1, set<string>());
        for (string word : words) ++freq[word];
        for (auto a : freq) {
            v[a.second].insert(a.first);
        }
        for (int i = v.size() - 1; i >= 0; --i) {
            if (k <= 0) break;
            vector<string> t(v[i].begin(), v[i].end());
            if (k >= t.size()) {
                res.insert(res.end(), t.begin(), t.end());
            } else {
                res.insert(res.end(), t.begin(), t.begin() + k);
            }
            k -= t.size();
        }
        return res;
    }
};

類似題目：

Top K Frequent Elements

Design Search Autocomplete System

參考資料：

https://discuss.leetcode.com/topic/106861/o-nlog-k-priority-queue-c-code

https://discuss.leetcode.com/topic/106868/clean-heap-based-solution-o-nlogk-time-and-o-n-space-16ms

到此這篇關(guān)于C++實現(xiàn)LeetCode(692.前K個高頻詞)的文章就介紹到這了,更多相關(guān)C++實現(xiàn)前K個高頻詞內(nèi)容請搜索腳本之家以前的文章或繼續(xù)瀏覽下面的相關(guān)文章希望大家以后多多支持腳本之家！

您可能感興趣的文章: