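String Compression in C: The ZSTD Algorithm in Detail

 Updated: 2022-08-10 16:52:10   Author: T-BARBARIANS  
zstd (Zstandard) is a fast lossless compression algorithm open-sourced by Facebook. It targets real-time compression scenarios at the zlib level while achieving better compression ratios. This article walks through how to use ZSTD for readers who need it.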

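Preface

A recent project needed to keep a large amount of string data in memory for a certain period of time, so compressing the source strings before storing them was the natural idea. That prompted a survey of some of the better compression algorithms.

String compression usually comes with three requirements: a high compression ratio, high compression speed, and high decompression speed. A high compression ratio and high compression speed pull against each other (you cannot have both the fish and the bear's paw, as the saying goes), so good algorithms generally trade ratio against performance. Judged on compression ratio, compression speed, and decompression speed, zstd and lz4 both perform well, so these two were selected for study.

zstd is a fast compression algorithm with a high compression ratio, open-sourced by Facebook (see https://github.com/facebook/zstd), and I wanted to see how it actually performs at compression and decompression.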

1. Compressing and Decompressing with zstd

ZSTD_compress belongs to ZSTD's Simple API, where the compression level is the only setting.

The prototype of ZSTD_compress is:

size_t ZSTD_compress(void* dst, size_t dstCapacity, const void* src, size_t srcSize, int compressionLevel)

The prototype of ZSTD_decompress is:

size_t ZSTD_decompress(void* dst, size_t dstCapacity, const void* src, size_t compressedSize);

Let's start with an example of compressing and decompressing with zstd.

#include <stdio.h>
#include <string.h>
#include <sys/time.h>
#include <stdlib.h>
#include <zstd.h>
#include <iostream>

using namespace std;

int main()
{
    // compress
    size_t com_space_size;
    size_t peppa_pig_text_size;

    char *com_ptr = NULL;
    char peppa_pig_buf[2048] = "Narrator: It is raining today. So, Peppa and George cannot play outside.Peppa: Daddy, it's stopped raining. Can we go out to play?Daddy: Alright, run along you two.Narrator: Peppa loves jumping in muddy puddles.Peppa: I love muddy puddles.Mummy: Peppa. If you jumping in muddy puddles, you must wear your boots.Peppa: Sorry, Mummy.Narrator: George likes to jump in muddy puddles, too.Peppa: George. If you jump in muddy puddles, you must wear your boots.Narrator: Peppa likes to look after her little brother, George.Peppa: George, let's find some more pud dles.Narrator: Peppa and George are having a lot of fun. Peppa has found a lttle puddle. George hasfound a big puddle.Peppa: Look, George. There's a really big puddle.Narrator: George wants to jump into the big puddle first.Peppa: Stop, George. | must check if it's safe for you. Good. It is safe for you. Sorry, George. It'sonly mud.Narrator: Peppa and George love jumping in muddy puddles.Peppa: Come on, George. Let's go and show Daddy.Daddy: Goodness me.Peppa: Daddy. Daddy. Guess what we' ve been doing.Daddy: Let me think... Have you been wa tching television?Peppa: No. No. Daddy.Daddy: Have you just had a bath?Peppa: No. No.Daddy: | know. You've been jumping in muddy puddles.Peppa: Yes. Yes. Daddy. We've been jumping in muddy puddles.Daddy: Ho. Ho. And look at the mess you're in.Peppa: Oooh....Daddy: Oh, well, it's only mud. Let's clean up quickly before Mummy sees the mess.Peppa: Daddy, when we've cleaned up, will you and Mummy Come and play, too?Daddy: Yes, we can all play in the garden.Narrator: Peppa and George are wearing their boots. Mummy and Daddy are wearing their boots.Peppa loves jumping up and down in muddy puddles. Everyone loves jumping up and down inmuddy puddles.Mummy: Oh, Daddy pig, look at the mess you're in. .Peppa: It's only mud.";

    peppa_pig_text_size = strlen(peppa_pig_buf);
    com_space_size= ZSTD_compressBound(peppa_pig_text_size);
    com_ptr = (char *)malloc(com_space_size);
    if(NULL == com_ptr) {
        cout << "compress malloc failed" << endl;
        return -1;
    }

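    size_t com_size;
    // The last argument is a compression level (an int), not a ZSTD_strategy; 1 is the fastest standard level.
    com_size = ZSTD_compress(com_ptr, com_space_size, peppa_pig_buf, peppa_pig_text_size, 1);
    if(ZSTD_isError(com_size)) {
        cout << "compress failed:" << ZSTD_getErrorName(com_size) << endl;
        free(com_ptr);
        return -1;
    }
    cout << "peppa pig text size:" << peppa_pig_text_size << endl;
    cout << "compress text size:" << com_size << endl;
    cout << "compress ratio:" << (float)peppa_pig_text_size / (float)com_size << endl << endl;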


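    // decompress
    char* decom_ptr = NULL;
    unsigned long long decom_buf_size;
    // Read the original size from the frame header; it can be unknown or invalid.
    decom_buf_size = ZSTD_getFrameContentSize(com_ptr, com_size);
    if(ZSTD_CONTENTSIZE_ERROR == decom_buf_size || ZSTD_CONTENTSIZE_UNKNOWN == decom_buf_size) {
        cout << "cannot determine decompressed size" << endl;
        free(com_ptr);
        return -1;
    }

    decom_ptr = (char *)malloc((size_t)decom_buf_size);
    if(NULL == decom_ptr) {
        cout << "decompress malloc failed" << endl;
        free(com_ptr);
        return -1;
    }

    size_t decom_size;
    decom_size = ZSTD_decompress(decom_ptr, decom_buf_size, com_ptr, com_size);
    if(ZSTD_isError(decom_size)) {
        cout << "decompress failed:" << ZSTD_getErrorName(decom_size) << endl;
    } else {
        cout << "decompress text size:" << decom_size << endl;
    }

    // Compare lengths first, then bytes (memcmp does not stop at '\0').
    if(decom_size != peppa_pig_text_size || memcmp(peppa_pig_buf, decom_ptr, peppa_pig_text_size)) {
        cout << "decompress text is not equal peppa pig text" << endl;
    }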

    free(com_ptr);
    free(decom_ptr);
    return 0;
}

Output:

The output shows that the Peppa Pig text is 1827 bytes long before compression and 759 bytes after, a compression ratio of about 2.4, and that the decompressed length equals the original length.
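As a quick check of these figures, the ratio and the space saved can be computed directly from the two sizes. This is only a minimal sketch; the helper names compression_ratio and space_saving are illustrative and are not part of the zstd API.

```cpp
#include <cassert>
#include <cstddef>

// Ratio of original size to compressed size, e.g. 1827 / 759.
// Illustrative helper, not part of zstd.
double compression_ratio(std::size_t original_size, std::size_t compressed_size)
{
    assert(compressed_size > 0);
    return static_cast<double>(original_size) / static_cast<double>(compressed_size);
}

// Fraction of the original size that compression removed.
// Illustrative helper, not part of zstd.
double space_saving(std::size_t original_size, std::size_t compressed_size)
{
    assert(original_size > 0);
    return 1.0 - static_cast<double>(compressed_size) / static_cast<double>(original_size);
}
```

With the sizes reported here, compression_ratio(1827, 759) is about 2.41 and space_saving(1827, 759) is about 0.58, so the compressed text takes roughly 42% of the original space.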

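As mentioned above, the compression level passed to ZSTD_compress can be adjusted. The default level is ZSTD_CLEVEL_DEFAULT = 3 and the maximum is ZSTD_MAX_CLEVEL = 22; a level of 0 selects the default, and negative levels (down to ZSTD_minCLevel()) give up ratio for extra speed. zstd also defines compression strategies such as ZSTD_fast, ZSTD_greedy, ZSTD_lazy, ZSTD_lazy2, and ZSTD_btlazy2. These are a separate setting from the level and are applied through the ZSTD_c_strategy parameter of the advanced API rather than passed as the level argument. The higher the compression level, the higher the compression ratio but the lower the compression speed.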

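2. Exploring ZSTD Compression and Decompression Performance

With the basic compression and decompression calls covered, the next step is to get a feel for zstd's performance.

The test compresses the same text with ZSTD_compress in a loop for 10 seconds and then reports the average number of compressions per second. The code for the compression test is: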

#include <stdio.h>
#include <string.h>
#include <sys/time.h>
#include <stdlib.h>
#include <zstd.h>
#include <iostream>

using namespace std;

int main()
{
    int cnt = 0;

    size_t com_size;
    size_t com_space_size;
    size_t peppa_pig_text_size;

    char *com_ptr = NULL;
    char peppa_pig_buf[2048] = "Narrator: It is raining today. So, Peppa and George cannot play outside.Peppa: Daddy, it's stopped raining. Can we go out to play?Daddy: Alright, run along you two.Narrator: Peppa loves jumping in muddy puddles.Peppa: I love muddy puddles.Mummy: Peppa. If you jumping in muddy puddles, you must wear your boots.Peppa: Sorry, Mummy.Narrator: George likes to jump in muddy puddles, too.Peppa: George. If you jump in muddy puddles, you must wear your boots.Narrator: Peppa likes to look after her little brother, George.Peppa: George, let's find some more pud dles.Narrator: Peppa and George are having a lot of fun. Peppa has found a lttle puddle. George hasfound a big puddle.Peppa: Look, George. There's a really big puddle.Narrator: George wants to jump into the big puddle first.Peppa: Stop, George. | must check if it's safe for you. Good. It is safe for you. Sorry, George. It'sonly mud.Narrator: Peppa and George love jumping in muddy puddles.Peppa: Come on, George. Let's go and show Daddy.Daddy: Goodness me.Peppa: Daddy. Daddy. Guess what we' ve been doing.Daddy: Let me think... Have you been wa tching television?Peppa: No. No. Daddy.Daddy: Have you just had a bath?Peppa: No. No.Daddy: | know. You've been jumping in muddy puddles.Peppa: Yes. Yes. Daddy. We've been jumping in muddy puddles.Daddy: Ho. Ho. And look at the mess you're in.Peppa: Oooh....Daddy: Oh, well, it's only mud. Let's clean up quickly before Mummy sees the mess.Peppa: Daddy, when we've cleaned up, will you and Mummy Come and play, too?Daddy: Yes, we can all play in the garden.Narrator: Peppa and George are wearing their boots. Mummy and Daddy are wearing their boots.Peppa loves jumping up and down in muddy puddles. Everyone loves jumping up and down inmuddy puddles.Mummy: Oh, Daddy pig, look at the mess you're in. .Peppa: It's only mud.";

    timeval st, et;

    peppa_pig_text_size = strlen(peppa_pig_buf);
    com_space_size= ZSTD_compressBound(peppa_pig_text_size);

    gettimeofday(&st, NULL);
    while(1) {

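        // Note: malloc/free run inside the timed loop, so the result includes allocation cost.
        com_ptr = (char *)malloc(com_space_size);
        // Level 1 (the fastest standard level); ZSTD_fast is a strategy name, not a level.
        com_size = ZSTD_compress(com_ptr, com_space_size, peppa_pig_buf, peppa_pig_text_size, 1);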

        free(com_ptr);
        cnt++;

        gettimeofday(&et, NULL);
        if(et.tv_sec - st.tv_sec >= 10) {
            break;
        }
    }

    cout << "compress per second:" << cnt/10 << " times" << endl;
    return 0;
}

Output:

The result shows roughly 60,000 to 70,000 compressions per second, which is not especially impressive. Note that compression performance also depends on the length and content of the text being compressed.
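A whole-second comparison such as et.tv_sec - st.tv_sec >= 10 can end the loop up to a second early, while the count is still divided by a fixed 10. One possible alternative, sketched here under the assumption that C++11 is available, measures with a monotonic clock and divides by the time that actually elapsed. The helper name calls_per_second is hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Calls `work` repeatedly for about `seconds` of wall-clock time and returns
// the average number of calls per second. Hypothetical helper for illustration.
double calls_per_second(const std::function<void()>& work, double seconds)
{
    assert(seconds > 0.0);
    using clock = std::chrono::steady_clock;   // monotonic clock
    const auto start = clock::now();
    const std::chrono::duration<double> limit(seconds);

    long long calls = 0;
    std::chrono::duration<double> elapsed(0.0);
    do {
        work();
        ++calls;
        elapsed = clock::now() - start;
    } while (elapsed < limit);

    // Divide by the measured time rather than the nominal duration.
    return static_cast<double>(calls) / elapsed.count();
}
```

The compression call would go inside the callable, for example a lambda that calls ZSTD_compress on a buffer allocated once before the measurement, which also keeps malloc and free out of the timed region.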

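Now for decompression performance. Similarly to the test above, the text is compressed once, and then the same compressed data is decompressed in a loop for 10 seconds to obtain the average number of decompressions per second. The code for the decompression test is: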

#include <stdio.h>
#include <string.h>
#include <sys/time.h>
#include <stdlib.h>
#include <zstd.h>
#include <iostream>

using namespace std;

int main()
{
    int cnt = 0;

    size_t com_size;
    size_t com_space_size;
    size_t peppa_pig_text_size;

    timeval st, et;

    char *com_ptr = NULL;
    char peppa_pig_buf[2048] = "Narrator: It is raining today. So, Peppa and George cannot play outside.Peppa: Daddy, it's stopped raining. Can we go out to play?Daddy: Alright, run along you two.Narrator: Peppa loves jumping in muddy puddles.Peppa: I love muddy puddles.Mummy: Peppa. If you jumping in muddy puddles, you must wear your boots.Peppa: Sorry, Mummy.Narrator: George likes to jump in muddy puddles, too.Peppa: George. If you jump in muddy puddles, you must wear your boots.Narrator: Peppa likes to look after her little brother, George.Peppa: George, let's find some more pud dles.Narrator: Peppa and George are having a lot of fun. Peppa has found a lttle puddle. George hasfound a big puddle.Peppa: Look, George. There's a really big puddle.Narrator: George wants to jump into the big puddle first.Peppa: Stop, George. | must check if it's safe for you. Good. It is safe for you. Sorry, George. It'sonly mud.Narrator: Peppa and George love jumping in muddy puddles.Peppa: Come on, George. Let's go and show Daddy.Daddy: Goodness me.Peppa: Daddy. Daddy. Guess what we' ve been doing.Daddy: Let me think... Have you been wa tching television?Peppa: No. No. Daddy.Daddy: Have you just had a bath?Peppa: No. No.Daddy: | know. You've been jumping in muddy puddles.Peppa: Yes. Yes. Daddy. We've been jumping in muddy puddles.Daddy: Ho. Ho. And look at the mess you're in.Peppa: Oooh....Daddy: Oh, well, it's only mud. Let's clean up quickly before Mummy sees the mess.Peppa: Daddy, when we've cleaned up, will you and Mummy Come and play, too?Daddy: Yes, we can all play in the garden.Narrator: Peppa and George are wearing their boots. Mummy and Daddy are wearing their boots.Peppa loves jumping up and down in muddy puddles. Everyone loves jumping up and down inmuddy puddles.Mummy: Oh, Daddy pig, look at the mess you're in. .Peppa: It's only mud.";

    size_t decom_size;
    char* decom_ptr = NULL;
    unsigned long long decom_buf_size;

    peppa_pig_text_size = strlen(peppa_pig_buf);
    com_space_size= ZSTD_compressBound(peppa_pig_text_size);
    com_ptr = (char *)malloc(com_space_size);

    com_size = ZSTD_compress(com_ptr, com_space_size, peppa_pig_buf, peppa_pig_text_size, 1);

    gettimeofday(&st, NULL);
    decom_buf_size = ZSTD_getFrameContentSize(com_ptr, com_size);

    while(1) {

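        decom_ptr = (char *)malloc((size_t)decom_buf_size);

        // An error code from ZSTD_decompress is a very large size_t, so this check also catches failures.
        decom_size = ZSTD_decompress(decom_ptr, decom_buf_size, com_ptr, com_size);
        if(decom_size != peppa_pig_text_size) {

            cout << "decompress error" << endl;
            free(decom_ptr);
            break;
        }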

        free(decom_ptr);

        cnt++;
        gettimeofday(&et, NULL);
        if(et.tv_sec - st.tv_sec >= 10) {
            break;
        }
    }

    cout << "decompress per second:" << cnt/10 << " times" << endl;

    free(com_ptr);
    return 0;
}

Output:

The result shows roughly 120,000 decompressions per second; decompression is faster than compression.

3. Advanced zstd Usage

zstd ships a compression and decompression tool named PZSTD. PZSTD (parallel zstd) is a command-line tool that uses multiple threads to split the input into segments and compress them in parallel.
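The slicing idea can be illustrated with a small helper that divides an input of a given length into contiguous segments, one per worker. This is only a sketch of the general technique: it is not how PZSTD chooses its segment sizes, and the names Chunk and split_into_chunks are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One contiguous segment of the input. Hypothetical type for illustration.
struct Chunk {
    std::size_t offset;
    std::size_t size;
};

// Splits `total` bytes into at most `parts` contiguous chunks of near-equal size.
std::vector<Chunk> split_into_chunks(std::size_t total, std::size_t parts)
{
    assert(parts > 0);
    std::vector<Chunk> chunks;
    const std::size_t base = total / parts;
    const std::size_t extra = total % parts;   // the first `extra` chunks get one more byte
    std::size_t offset = 0;
    for (std::size_t i = 0; i < parts; ++i) {
        const std::size_t size = base + (i < extra ? 1 : 0);
        if (size == 0) break;                  // fewer bytes than parts
        chunks.push_back({offset, size});
        offset += size;
    }
    return chunks;
}
```

Each worker thread would then compress its own segment independently, and the compressed pieces would be written out in segment order.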

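Newer versions of zstd (v1.4.0 and later) also provide API functions for compressing with a specified number of threads, which is the advanced API usage covered in this section. Let's look at how to use zstd's multithreaded compression.

Multithreaded compression relies on two key APIs: one sets parameters, the other compresses.

The prototype of the parameter-setting API is:

size_t ZSTD_CCtx_setParameter(ZSTD_CCtx* cctx, ZSTD_cParameter param, int value)

The prototype of the compression API is:

size_t ZSTD_compress2(ZSTD_CCtx* cctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize)

The demo below compresses in parallel: ZSTD_CCtx_setParameter sets the number of worker threads to 3 through the ZSTD_c_nbWorkers parameter, and ZSTD_compress2 compresses the text. To show that zstd really does use multiple threads, the demo first reads a very large file as the compression source so that zstd runs for a reasonably long time.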

#include <stdio.h>
#include <string.h>
#include <sys/time.h>
#include <stdlib.h>
#include <zstd.h>
#include <iostream>

using namespace std;

int main()
{
    size_t com_size;
    size_t com_space_size;

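    FILE *fp = NULL;
    long file_len;   // ftell returns long; -1 signals an error

    char *com_ptr = NULL;
    char *file_text_ptr = NULL;

    // "rb": open in binary mode so the bytes are read unmodified
    fp = fopen("xxxxxx", "rb");
    if(NULL == fp){
         cout << "file open failed" << endl;
         return -1;
    }

    fseek(fp, 0, SEEK_END);
    file_len = ftell(fp);
    fseek(fp, 0, SEEK_SET);
    if(file_len < 0) {
        cout << "ftell failed" << endl;
        fclose(fp);
        return -1;
    }
    cout << "file length:" << file_len << endl;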

    // malloc space for file content
    file_text_ptr = (char *)malloc(file_len);
    if(NULL == file_text_ptr) {
        cout << "malloc failed" << endl;
        return -1;
    }

    // malloc space for compress space
    com_space_size = ZSTD_compressBound(file_len);
    com_ptr = (char *)malloc(com_space_size);
    if(NULL == com_ptr) {
        cout << "malloc failed" << endl;
        return -1;
    }

    // read text from source file
    fread(file_text_ptr, 1, file_len, fp);
    fclose(fp);

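    ZSTD_CCtx* cctx;
    cctx = ZSTD_createCCtx();
    if(NULL == cctx) {
        cout << "create cctx failed" << endl;
        free(com_ptr);
        free(file_text_ptr);
        return -1;
    }

    // set multi-thread parameter
    ZSTD_CCtx_setParameter(cctx, ZSTD_c_nbWorkers, 3);
    // ZSTD_c_compressionLevel takes a numeric level. ZSTD_btlazy2 is a ZSTD_strategy
    // value (equal to 6) and belongs with ZSTD_c_strategy, so the level is written out here.
    ZSTD_CCtx_setParameter(cctx, ZSTD_c_compressionLevel, 6);

    com_size = ZSTD_compress2(cctx, com_ptr, com_space_size, file_text_ptr, file_len);
    if(ZSTD_isError(com_size)) {
        cout << "compress failed:" << ZSTD_getErrorName(com_size) << endl;
    }

    ZSTD_freeCCtx(cctx);   // release the compression context
    free(com_ptr);
    free(file_text_ptr);
    return 0;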
}

Running the demo shows that zstd does start 3 threads to compress the text in parallel. Configuring more threads also shortened the compression time; the details are not shown here, and readers can experiment for themselves.
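Since this demo depends on loading a very large file, it helps to read it in binary mode, use a size type wide enough for the file, and check that the read completed. The following is one possible way to do that with standard C++ streams. It is a sketch, and read_whole_file is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Reads a whole file into memory in binary mode and throws on failure.
// Hypothetical helper for illustration.
std::vector<char> read_whole_file(const std::string& path)
{
    assert(!path.empty());
    // std::ios::ate positions the stream at the end so tellg() yields the size.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) throw std::runtime_error("cannot open " + path);

    const std::streamoff end = in.tellg();
    if (end < 0) throw std::runtime_error("cannot determine size of " + path);

    std::vector<char> data(static_cast<std::size_t>(end));
    in.seekg(0, std::ios::beg);
    if (!data.empty() &&
        !in.read(data.data(), static_cast<std::streamsize>(data.size()))) {
        throw std::runtime_error("short read on " + path);
    }
    return data;
}
```

The returned buffer's data() and size() can then be passed as the source pointer and source size of the compression call.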

Note that zstd currently builds a single-threaded library by default; to call the multithreaded API, the ZSTD_MULTITHREAD build option must be specified when running make.

zstd also supports a thread pool. The function prototype is:

POOL_ctx* ZSTD_createThreadPool(size_t numThreads)

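A thread pool avoids the unnecessary overhead of repeatedly creating and destroying threads when compressing many times in succession, so that the computing power goes mainly into compressing the text.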

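4. Summary

This article covered the basic usage of zstd compression and decompression, took a first measurement of compression and decompression performance, and finally explored multithreaded compression.

The compression test shows that zstd's compression ratio is already quite good: the compressed text takes less than half the space of the original. The ratio does, of course, depend on the content being compressed.

The performance results for compression and decompression are only passable. In my view zstd leans more toward the bear's paw (compression ratio) than the fish (performance), but it suits scenarios that need a high compression ratio and are less demanding on performance.

For scenarios where large texts must be compressed many times in succession, multithreaded parallel compression combined with a thread pool can improve compression throughput considerably.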
