How to implement data compression in Java: all approaches, with performance tests
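1 BZip method
The ZIP file format is a data compression and archive format invented by Phil Katz, who published its specification in January 1989; its most common compression method is Deflate. ZIP files usually use the ".zip" extension, and the MIME type is application/zip. ZIP is currently one of the mainstream compression formats; its competitors include RAR and the open-source 7z format. RAR and 7z generally achieve better compression ratios than ZIP, and 7-Zip, which is distributed as a free compression tool, is gradually being adopted in more areas.
Microsoft has built ZIP support into Windows since Windows ME, so users can open and create zip archives even without separate archiving software installed; OS X and popular Linux distributions provide similar support. As a result, zip is often the most common choice for distributing files over the network.
Note that the code in this section does not produce ZIP archives: it uses the bzip2 classes bundled with Apache Ant, and bzip2 is a separate format. ZIP itself is covered in section 6.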
1.1 Add the dependency
<dependency>
    <groupId>org.apache.ant</groupId>
    <artifactId>ant</artifactId>
    <version>1.10.6</version>
</dependency>
1.2 BZip2 utility class code
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.tools.bzip2.CBZip2InputStream;
import org.apache.tools.bzip2.CBZip2OutputStream;

public class BZip2Util {
    private static final int BUFFER_SIZE = 8192;

    public static byte[] compress(byte[] bytes) {
        if (bytes == null) {
            throw new NullPointerException("bytes is null");
        }
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        // Ant's CBZip2OutputStream does not write the "BZ" magic bytes itself,
        // so this output is meant to be read back by decompress() below.
        try (CBZip2OutputStream bzip2 = new CBZip2OutputStream(bos)) {
            bzip2.write(bytes);
            // finish() flushes the last bzip2 block before the buffer is read
            bzip2.finish();
            return bos.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException("BZip2 compress error", e);
        }
    }

    public static byte[] decompress(byte[] bytes) {
        if (bytes == null) {
            throw new NullPointerException("bytes is null");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ByteArrayInputStream bis = new ByteArrayInputStream(bytes);
        try (CBZip2InputStream bzip2 = new CBZip2InputStream(bis)) {
            byte[] buffer = new byte[BUFFER_SIZE];
            int n;
            // read() returns -1 at end of stream
            while ((n = bzip2.read(buffer)) > -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException("BZip2 decompress error", e);
        }
    }
}
2 Deflater method
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class DeflaterUtil {
    private static final int BUFFER_SIZE = 8192;

    private DeflaterUtil() {
    }

    public static byte[] compress(byte[] bytes) {
        if (bytes == null) {
            throw new NullPointerException("bytes is null");
        }
        Deflater deflater = new Deflater();
        deflater.setInput(bytes);
        deflater.finish();
        byte[] outputBytes = new byte[BUFFER_SIZE];
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try {
            while (!deflater.finished()) {
                int length = deflater.deflate(outputBytes);
                bos.write(outputBytes, 0, length);
            }
            return bos.toByteArray();
        } finally {
            // Release the native zlib memory even if an exception is thrown
            deflater.end();
        }
    }

    public static byte[] decompress(byte[] bytes) {
        if (bytes == null) {
            throw new NullPointerException("bytes is null");
        }
        Inflater inflater = new Inflater();
        inflater.setInput(bytes);
        byte[] outputBytes = new byte[BUFFER_SIZE];
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try {
            while (!inflater.finished()) {
                int length = inflater.inflate(outputBytes);
                if (length == 0) {
                    // No progress (for example truncated input): stop instead of looping forever
                    break;
                }
                bos.write(outputBytes, 0, length);
            }
            return bos.toByteArray();
        } catch (DataFormatException e) {
            throw new RuntimeException("Deflater decompress error", e);
        } finally {
            inflater.end();
        }
    }
}
3 Gzip method
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipUtil {
    private static final int BUFFER_SIZE = 8192;

    private GzipUtil() {
    }

    public static byte[] compress(byte[] bytes) {
        if (bytes == null) {
            throw new NullPointerException("bytes is null");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(bytes);
            gzip.flush();
            // finish() writes the gzip trailer before the buffer is read
            gzip.finish();
            return out.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException("gzip compress error", e);
        }
    }

    public static byte[] decompress(byte[] bytes) {
        if (bytes == null) {
            throw new NullPointerException("bytes is null");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gunzip = new GZIPInputStream(new ByteArrayInputStream(bytes))) {
            byte[] buffer = new byte[BUFFER_SIZE];
            int n;
            while ((n = gunzip.read(buffer)) > -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException("gzip decompress error", e);
        }
    }
}
4 Lz4 method
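4.1 Introduction
The Lz4 compression algorithm was designed and implemented by Yann Collet in 2011, and it belongs to the lz77 family of compression algorithms. Strictly speaking, lz77 is not an algorithm but a coding theory: it defines only the principle, not how to implement it. Besides lz4, algorithms derived from lz77 include lzss, lzb, and lzh.
Taken overall, lz4 is currently one of the most efficient compression algorithms. It focuses on compression and decompression speed, and its compression ratio is not outstanding: in essence it trades space for time.
According to the performance description on GitHub, compression speed is above 500 MB/s per core and scales across multi-core CPUs; the decoder is also extremely fast, reaching GB/s per core.
4.2 Algorithm idea
- lz77 encoding idea: it is a dictionary-based algorithm that encodes long strings (also called matches or phrases) as short tokens, replacing phrases in the dictionary with small tokens. In other words, it compresses data by replacing long strings that occur repeatedly with small tokens. The symbols it processes do not have to be text characters; they can be symbols of any size.
- Phrase dictionary maintenance: lz77 uses a look-ahead buffer and a sliding window. Data is first loaded into the look-ahead buffer, forming a batch of phrases; as the sliding window moves over them, they become part of the dictionary.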
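4.3 Algorithm implementation
4.3.1 lz4 data format
- lz4 defines two formats: lz4_block_format and lz4_frame_format.
- lz4_frame_format is used for special scenarios such as file compression, pipe compression, and streaming compression; this section focuses on lz4_block_format (the format for general use).
A compressed block consists of multiple sequences. Each sequence is a run of literals (uncompressed bytes) followed by a match copy. Each sequence starts with a token: the token holds the literal length and the match length, and the offset says how far back the match copy starts.
- literals: bytes that appear for the first time without repetition, i.e. the incompressible part
- literals length: the length of the incompressible part
- match length: the length of the repeated (compressible) part
A single sequence is laid out as: token (1 byte), optional extra literal-length bytes, the literals, a 2-byte little-endian offset, and optional extra match-length bytes. A complete lz4 compressed block consists of multiple sequences.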
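The token byte described above can be sketched in a few lines of Java. This is a minimal illustration, not code from lz4 itself: the class name Lz4TokenSketch and its method names are hypothetical, and the sketch assumes both lengths fit in 4 bits (a nibble value of 15 would signal extra length bytes, which are left out here). It also ignores the special case of the last sequence, which carries literals only.

```java
final class Lz4TokenSketch {
    // Shortest match that LZ4 encodes; the token stores matchLength - 4
    private static final int MIN_MATCH = 4;

    /** Packs a literal length (high 4 bits) and a match length (low 4 bits) into one token byte. */
    static byte token(int literalLength, int matchLength) {
        int storedMatch = matchLength - MIN_MATCH;
        if (literalLength < 0 || literalLength > 14 || storedMatch < 0 || storedMatch > 14) {
            // Assumption of this sketch: no extension bytes, so each field must stay below 15
            throw new IllegalArgumentException("lengths must fit in 4 bits for this sketch");
        }
        return (byte) ((literalLength << 4) | storedMatch);
    }

    /** Reads the literal length back from the high 4 bits. */
    static int literalLength(byte token) {
        return (token & 0xFF) >>> 4;
    }

    /** Reads the match length back from the low 4 bits, adding the 4-byte minimum. */
    static int matchLength(byte token) {
        return (token & 0x0F) + MIN_MATCH;
    }

    public static void main(String[] args) {
        byte t = token(9, 6);
        // Prints: -110 -> literals=9, match=6
        System.out.println(t + " -> literals=" + literalLength(t) + ", match=" + matchLength(t));
    }
}
```

With 9 literals and a 6-byte match, the token is 0x92, which Java prints as the signed byte -110. This is the first byte of the compressed example discussed in the following sections.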

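4.3.2 lz4 compression process
lz4 follows the lz77 theory described above and compresses data using a sliding window, a hash table, and data encoding. During compression it looks for matches with a scan window of at least 4 bytes, advancing 1 byte at a time, and compresses whenever it finds a repeat.
For example, take the string abcde_fghabcde_ghxxahcde and walk through how it is compressed.
ps: here we use a 6-byte scan window and advance 1 byte per step.
- Assume the scan window is 6 bytes and advances 1 byte per step;
- lz4 starts scanning and first hashes positions 0-5; the hash table does not contain this value, so it is stored in the hash table;
- Scanning onward, it hashes positions 1-6; the hash table still does not contain this value, so it is stored as well;
- Scanning continues in the same way until positions 9-14: their hash is already in the hash table (it matches positions 0-5), so a match is emitted. The offset is set to 9 and the match length is 6; the length minus the 4-byte minimum (6 - 4 = 2) is stored in the low 4 bits of the token;
- After a match is found, lz4 tries to extend it by comparing the bytes that follow the two copies. If they were equal (position 15 equal to position 6), the match length would grow to 7 and the token would be adjusted accordingly. Here position 15 is "g" and position 6 is "f", so the match stays at 6 bytes;
- Once the window has scanned the whole string, the operation ends.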
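The scanning steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not how lz4 is implemented: the class name WindowMatchSketch is hypothetical, and a HashMap keyed on the window contents stands in for lz4's fixed-size hash table (real lz4 hashes 4 bytes into a table slot and then verifies the candidate).

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

final class WindowMatchSketch {
    /**
     * Slides a window of the given size over the data, one byte per step.
     * Returns {position, offset, matchLength} for the first repeated window, or null if none.
     */
    static int[] firstMatch(byte[] data, int window) {
        Map<String, Integer> seen = new HashMap<>(); // window contents -> first position
        for (int pos = 0; pos + window <= data.length; pos++) {
            // ISO_8859_1 maps each byte to exactly one char, so the key is lossless
            String key = new String(data, pos, window, StandardCharsets.ISO_8859_1);
            Integer earlier = seen.get(key);
            if (earlier != null) {
                int length = window;
                // Try to extend the match while the following bytes still agree
                while (pos + length < data.length && data[earlier + length] == data[pos + length]) {
                    length++;
                }
                return new int[] {pos, pos - earlier, length};
            }
            seen.put(key, pos);
        }
        return null;
    }

    public static void main(String[] args) {
        byte[] data = "abcde_fghabcde_ghxxahcde".getBytes(StandardCharsets.UTF_8);
        int[] m = firstMatch(data, 6);
        // Prints: position=9 offset=9 length=6
        System.out.println("position=" + m[0] + " offset=" + m[1] + " length=" + m[2]);
    }
}
```

For the example string, the first repeat is found at position 9 with offset 9, and the extension step stops immediately because "g" at position 15 differs from "f" at position 6.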
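In the end, compression produces this byte string: [-110, 97, 98, 99, 100, 101, 95, 102, 103, 104, 9, 0, -112, 103, 104, 120, 120, 97, 104, 99, 100, 101]. This may look confusing at first; the decompression walkthrough explains it.
4.3.3 lz4 decompression process
- lz4 compressed bytes: [-110, 97, 98, 99, 100, 101, 95, 102, 103, 104, 9, 0, -112, 103, 104, 120, 120, 97, 104, 99, 100, 101]
- The literal bytes are the UTF-8 encoding of the string's characters, shown as signed Java byte values
The compressed bytes break down as follows:
- -110: token of the first sequence
- 97, 98, 99, 100, 101, 95, 102, 103, 104: 9 literals (abcde_fgh)
- 9, 0: 2-byte little-endian offset (9)
- -112: token of the second sequence
- 103, 104, 120, 120, 97, 104, 99, 100, 101: 9 literals (ghxxahcde); the last sequence has no match part
A brief record of the decompression process:
- Decompression starts at position 0 and first reads the token (-110). As an unsigned byte, -110 is binary 10010010: the high 4 bits 1001 give a literal length of 9, and the low 4 bits 0010 give a match length of 2 + 4 (4 being the minimum match length)
- Reading the next 9 bytes gives the 9-character string (abcde_fgh). The offset is 9, so moving back 9 positions from the current output position gives the start of the repeat; the low 4 bits say the match is 6 bytes long, so the 6-character string (abcde_) is appended
- The output is now (abcde_fghabcde_). Decompression then reads the token of the next sequence (-112: 9 literals, no match) and finally produces abcde_fghabcde_ghxxahcde (the string before compression)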
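The decompression steps above can be checked with a small block decoder. This is a minimal sketch, assuming the input is a raw lz4 block (not the framed output of LZ4BlockOutputStream) and that the original length is known in advance; the class name Lz4BlockDecodeSketch is hypothetical and the code does no validation of malformed input.

```java
import java.nio.charset.StandardCharsets;

final class Lz4BlockDecodeSketch {
    /** Decodes a raw lz4 block into a buffer of the known original length. */
    static byte[] decode(byte[] block, int originalLength) {
        byte[] out = new byte[originalLength];
        int in = 0;
        int o = 0;
        while (in < block.length) {
            int token = block[in++] & 0xFF;

            // High 4 bits: literal length (15 means more length bytes follow)
            int literalLength = token >>> 4;
            if (literalLength == 15) {
                int b;
                do {
                    b = block[in++] & 0xFF;
                    literalLength += b;
                } while (b == 255);
            }
            System.arraycopy(block, in, out, o, literalLength);
            in += literalLength;
            o += literalLength;

            if (in >= block.length) {
                break; // the last sequence carries literals only
            }

            // 2-byte little-endian offset, counted back from the current output position
            int low = block[in++] & 0xFF;
            int high = block[in++] & 0xFF;
            int offset = low | (high << 8);

            // Low 4 bits: match length minus the 4-byte minimum
            int matchLength = token & 0x0F;
            if (matchLength == 15) {
                int b;
                do {
                    b = block[in++] & 0xFF;
                    matchLength += b;
                } while (b == 255);
            }
            matchLength += 4;

            // Byte-by-byte copy so that overlapping matches work
            for (int i = 0; i < matchLength; i++) {
                out[o] = out[o - offset];
                o++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] block = {-110, 97, 98, 99, 100, 101, 95, 102, 103, 104, 9, 0,
                -112, 103, 104, 120, 120, 97, 104, 99, 100, 101};
        // Prints: abcde_fghabcde_ghxxahcde
        System.out.println(new String(decode(block, 24), StandardCharsets.UTF_8));
    }
}
```

Needing the original length up front mirrors the requirement that lz4-java's LZ4FastDecompressor places on its caller.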
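4.4 Lz4-Java
lz4/lz4-java is a Java library for lz4 compression written by Rei Odaira and others.
4.4.1 Introduction
The library provides access to two compression methods, both of which generate valid lz4 streams:
Fast scan (lz4)
- Low memory footprint (16 KB)
- Very fast
- Reasonable compression ratio (depending on the redundancy of the input)
High compression (lz4hc)
- Medium memory footprint (256 KB)
- Rather slow (about 10 times slower than lz4)
- Good compression ratio (depending on the size and redundancy of the input)
The streams produced by these two algorithms use the same compressed format, are very fast to decompress, and can be decompressed by the same decompressor instance.
4.4.2 Library classes
The library provides several key classes, introduced briefly here.
LZ4Factory
The entry point of the Lz4 API. This class has 3 instances:
- a native instance, which is a JNI binding to the original LZ4 C implementation
- a safe Java instance, which is a pure Java port of the original C library (using only official Java APIs)
- an unsafe Java instance, which is a Java port that uses the unofficial sun.misc.Unsafe API (the Unsafe class can access and manage system memory directly, which helps Java run faster and perform low-level operations; it can be seen as a back door left in Java that offers low-level operations such as direct memory access and thread scheduling)
Only the safe Java instance is guaranteed to work on every JVM, so it is recommended to obtain the LZ4Factory instance through fastestInstance() or fastestJavaInstance().
LZ4Compressor
There are two compressors: fastCompressor, the fast-scan compressor from the introduction above, and highCompressor, which implements the high-compression-ratio compressor (lz4hc).
LZ4Decompressor
lz4-java provides two decompressors: LZ4FastDecompressor and LZ4SafeDecompressor.
The difference is that LZ4FastDecompressor needs to know the length of the original (uncompressed) data when decompressing, while LZ4SafeDecompressor needs to know the length of the compressed data.
Usage:
The two compressors and two decompressors above are interchangeable. For example, fastCompressor can be paired with LZ4SafeDecompressor, because both compression algorithms produce the same stream format and either decompressor can decode it.
Beyond these basic classes, lz4-java also provides streaming classes: LZ4BlockOutputStream (output stream, encoding) and LZ4BlockInputStream (input stream, decoding).
The following code is a usage example: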
package com.oldlu.compress.utils;

import net.jpountz.lz4.*;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class Lz4Utils {
    // Block size for the stream classes and size of the read buffer
    private static final int ARRAY_SIZE = 4096;
    private static final LZ4Factory factory = LZ4Factory.fastestInstance();
    private static final LZ4Compressor compressor = factory.fastCompressor();
    private static final LZ4FastDecompressor decompressor = factory.fastDecompressor();

    public static byte[] compress(byte[] bytes) {
        if (bytes == null || bytes.length == 0) {
            return null;
        }
        ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
        try (LZ4BlockOutputStream lz4BlockOutputStream =
                     new LZ4BlockOutputStream(outputStream, ARRAY_SIZE, compressor)) {
            lz4BlockOutputStream.write(bytes);
            // finish() flushes the last block so the buffer is complete before it is read
            lz4BlockOutputStream.finish();
            return outputStream.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException("Lz4 compress error", e);
        }
    }

    public static byte[] uncompress(byte[] bytes) {
        if (bytes == null || bytes.length == 0) {
            return null;
        }
        ByteArrayOutputStream outputStream = new ByteArrayOutputStream(ARRAY_SIZE);
        try (LZ4BlockInputStream decompressedInputStream =
                     new LZ4BlockInputStream(new ByteArrayInputStream(bytes), decompressor)) {
            int count;
            byte[] buffer = new byte[ARRAY_SIZE];
            while ((count = decompressedInputStream.read(buffer)) != -1) {
                outputStream.write(buffer, 0, count);
            }
            return outputStream.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException("Lz4 decompress error", e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = "abcde_fghabcde_ghxxahcde".getBytes(StandardCharsets.UTF_8);
        byte[] compress = compress(bytes);
        byte[] decompress = uncompress(compress);
        // Print the round-tripped string to confirm it matches the input
        System.out.println(new String(decompress, StandardCharsets.UTF_8));
    }
}
5 SevenZ method
5.1 Add the dependencies
<dependency>
    <groupId>org.tukaani</groupId>
    <artifactId>xz</artifactId>
    <version>1.8</version>
</dependency>
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-compress</artifactId>
    <version>1.19</version>
</dependency>
5.2 Utility class code
import org.apache.commons.compress.archivers.sevenz.SevenZArchiveEntry;
import org.apache.commons.compress.archivers.sevenz.SevenZFile;
import org.apache.commons.compress.archivers.sevenz.SevenZOutputFile;
import org.apache.commons.compress.utils.SeekableInMemoryByteChannel;

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;

public class SevenZUtil {
    private static final int BUFFER_SIZE = 8192;

    public static byte[] compress(byte[] bytes) {
        if (bytes == null) {
            throw new NullPointerException("bytes is null");
        }
        SeekableInMemoryByteChannel channel = new SeekableInMemoryByteChannel();
        try (SevenZOutputFile z7z = new SevenZOutputFile(channel)) {
            SevenZArchiveEntry entry = new SevenZArchiveEntry();
            entry.setName("sevenZip");
            entry.setSize(bytes.length);
            z7z.putArchiveEntry(entry);
            z7z.write(bytes);
            z7z.closeArchiveEntry();
            z7z.finish();
            // array() returns the whole backing buffer, which can be larger than the data
            // written, so copy only the first size() bytes
            return Arrays.copyOf(channel.array(), (int) channel.size());
        } catch (IOException e) {
            throw new RuntimeException("SevenZ compress error", e);
        }
    }

    public static byte[] decompress(byte[] bytes) {
        if (bytes == null) {
            throw new NullPointerException("bytes is null");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        SeekableInMemoryByteChannel channel = new SeekableInMemoryByteChannel(bytes);
        try (SevenZFile sevenZFile = new SevenZFile(channel)) {
            byte[] buffer = new byte[BUFFER_SIZE];
            // Concatenate the contents of every entry in the archive
            while (sevenZFile.getNextEntry() != null) {
                int n;
                while ((n = sevenZFile.read(buffer)) > -1) {
                    out.write(buffer, 0, n);
                }
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException("SevenZ decompress error", e);
        }
    }
}
6 Zip method
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipUtil {
    private static final int BUFFER_SIZE = 8192;

    public static byte[] compress(byte[] bytes) {
        if (bytes == null) {
            throw new NullPointerException("bytes is null");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            ZipEntry entry = new ZipEntry("zip");
            entry.setSize(bytes.length);
            zip.putNextEntry(entry);
            zip.write(bytes);
            zip.closeEntry();
            // finish() writes the central directory; without it the bytes returned here
            // would be read before close() completes the archive
            zip.finish();
            return out.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException("Zip compress error", e);
        }
    }

    public static byte[] decompress(byte[] bytes) {
        if (bytes == null) {
            throw new NullPointerException("bytes is null");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(bytes))) {
            byte[] buffer = new byte[BUFFER_SIZE];
            // Concatenate the contents of every entry in the archive
            while (zip.getNextEntry() != null) {
                int n;
                while ((n = zip.read(buffer)) > -1) {
                    out.write(buffer, 0, n);
                }
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException("Zip decompress error", e);
        }
    }
}
7 Performance comparison
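We can use lz4 to run a performance comparison against the other compression classes. Note that the benchmark calls utility classes from the author's com.oldlu.compress.utils package (GzipUtils, Bzip2Utils, DeflateUtils, SnappyUtils, ProtostuffUtils) and their uncompress methods; these names differ from the classes listed in the earlier sections, and SnappyUtils and ProtostuffUtils are not shown in this article.
Test source code: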
package com.oldlu.compress.test;

import com.oldlu.compress.domain.User;
import com.oldlu.compress.service.UserService;
import com.oldlu.compress.utils.*;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.results.format.ResultFormatType;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

import java.util.concurrent.TimeUnit;

@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
public class PerformanceTest {

    /** Shared state: the serialized user object and its compressed forms */
    @State(Scope.Benchmark)
    public static class CommonState {
        User user;
        byte[] originBytes;
        byte[] lz4CompressBytes;
        byte[] snappyCompressBytes;
        byte[] gzipCompressBytes;
        byte[] bzipCompressBytes;
        byte[] deflateCompressBytes;

        // Runs once per trial, so serialization and setup are not measured
        @Setup(Level.Trial)
        public void prepare() {
            UserService userService = new UserService();
            user = userService.get();
            originBytes = ProtostuffUtils.serialize(user);
            lz4CompressBytes = Lz4Utils.compress(originBytes);
            snappyCompressBytes = SnappyUtils.compress(originBytes);
            gzipCompressBytes = GzipUtils.compress(originBytes);
            bzipCompressBytes = Bzip2Utils.compress(originBytes);
            deflateCompressBytes = DeflateUtils.compress(originBytes);
        }
    }

    /** Lz4 compression */
    @Benchmark
    public byte[] lz4Compress(CommonState commonState) {
        return Lz4Utils.compress(commonState.originBytes);
    }

    /** Lz4 decompression */
    @Benchmark
    public byte[] lz4Uncompress(CommonState commonState) {
        return Lz4Utils.uncompress(commonState.lz4CompressBytes);
    }

    /** Snappy compression */
    @Benchmark
    public byte[] snappyCompress(CommonState commonState) {
        return SnappyUtils.compress(commonState.originBytes);
    }

    /** Snappy decompression */
    @Benchmark
    public byte[] snappyUncompress(CommonState commonState) {
        return SnappyUtils.uncompress(commonState.snappyCompressBytes);
    }

    /** Gzip compression */
    @Benchmark
    public byte[] gzipCompress(CommonState commonState) {
        return GzipUtils.compress(commonState.originBytes);
    }

    /** Gzip decompression */
    @Benchmark
    public byte[] gzipUncompress(CommonState commonState) {
        return GzipUtils.uncompress(commonState.gzipCompressBytes);
    }

    /** bzip2 compression */
    @Benchmark
    public byte[] bzip2Compress(CommonState commonState) {
        return Bzip2Utils.compress(commonState.originBytes);
    }

    /** bzip2 decompression */
    @Benchmark
    public byte[] bzip2Uncompress(CommonState commonState) {
        return Bzip2Utils.uncompress(commonState.bzipCompressBytes);
    }

    /** Deflate compression */
    @Benchmark
    public byte[] deflateCompress(CommonState commonState) {
        return DeflateUtils.compress(commonState.originBytes);
    }

    /** Deflate decompression */
    @Benchmark
    public byte[] deflateUncompress(CommonState commonState) {
        return DeflateUtils.uncompress(commonState.deflateCompressBytes);
    }

    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
                .include(PerformanceTest.class.getSimpleName())
                .forks(1)
                .threads(1)
                .warmupIterations(10)
                .measurementIterations(10)
                .result("PerformanceTest.json")
                .resultFormat(ResultFormatType.JSON).build();
        new Runner(opt).run();
    }
}
Performance charts:
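Attached are the benchmark chart from the lz4 website and the chart from my own test. There are some differences, which may be caused by the different data being compressed.
- From the official website:

- Measured by hand:

When compressing feature data at my company, I compared lz4 with snappy. Their compression and decompression performance looked about the same, but lz4 was more stable, with fewer spikes. Since this involves internal company content, the charts are not included here.
7.1 Compression ratio comparison
In terms of compression ratio, from best to worst: bzip2 > Deflate > Gzip > lz4 > snappy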
package com.oldlu.compress.demo;

import com.alibaba.fastjson.JSONObject;
import com.oldlu.compress.domain.User;
import com.oldlu.compress.service.UserService;
import com.oldlu.compress.utils.*;

import java.nio.charset.StandardCharsets;

public class CompressDemo {
    public static void main(String[] args) {
        User user = new UserService().get();
        // JSON serialization
        byte[] originJson = JSONObject.toJSONBytes(user);
        System.out.println("Original JSON byte count: " + originJson.length);
        // pb serialization
        byte[] origin = ProtostuffUtils.serialize(user);
        System.out.println("Original pb byte count: " + origin.length);
        testGzip(origin, user);
        testSnappy(origin, user);
        testLz4(origin, user);
        testBzip2(origin, user);
        testDeflate(origin, user);
    }

    // Helper for inspecting lz4 output by hand; not called from main
    private static void test() {
        System.out.println("--------------------");
        String str = getString();
        byte[] source = str.getBytes(StandardCharsets.UTF_8);
        byte[] compress = Lz4Utils.compress(source);
        // Convert the compressed bytes to a string
        System.out.println(translateString(compress));
        System.out.println();
        System.out.println("--------------------");
        String str2 = getString2();
        byte[] source2 = str2.getBytes(StandardCharsets.UTF_8);
        byte[] compress2 = Lz4Utils.compress(source2);
        byte[] uncompress = Lz4Utils.uncompress(compress2);
        System.out.println();
    }

    // Maps each byte to one char for display only; negative bytes do not map to readable text
    private static String translateString(byte[] bytes) {
        char[] chars = new char[bytes.length];
        for (int i = 0; i < chars.length; i++) {
            chars[i] = (char) bytes[i];
        }
        return new String(chars);
    }

    private static String getString() {
        return "fghabcde_bcdefgh_abcdefghxxxxxxx";
    }

    private static String getString2() {
        return "abcde_fghabcde_ghxxahcde";
    }

    private static void testGzip(byte[] origin, User user) {
        System.out.println("---------------GZIP compression---------------");
        byte[] gzipCompress = GzipUtils.compress(origin);
        System.out.println("Gzip compressed: " + gzipCompress.length);
        byte[] gzipUncompress = GzipUtils.uncompress(gzipCompress);
        System.out.println("Gzip decompressed: " + gzipUncompress.length);
        User deUser = ProtostuffUtils.deserialize(gzipUncompress, User.class);
        System.out.println("Objects equal: " + user.equals(deUser));
    }

    private static void testSnappy(byte[] origin, User user) {
        System.out.println("---------------Snappy compression---------------");
        byte[] snappyCompress = SnappyUtils.compress(origin);
        System.out.println("Snappy compressed: " + snappyCompress.length);
        byte[] snappyUncompress = SnappyUtils.uncompress(snappyCompress);
        System.out.println("Snappy decompressed: " + snappyUncompress.length);
        User deUser = ProtostuffUtils.deserialize(snappyUncompress, User.class);
        System.out.println("Objects equal: " + user.equals(deUser));
    }

    private static void testLz4(byte[] origin, User user) {
        System.out.println("---------------Lz4 compression---------------");
        byte[] lz4Compress = Lz4Utils.compress(origin);
        System.out.println("Lz4 compressed: " + lz4Compress.length);
        byte[] lz4Uncompress = Lz4Utils.uncompress(lz4Compress);
        System.out.println("Lz4 decompressed: " + lz4Uncompress.length);
        User deUser = ProtostuffUtils.deserialize(lz4Uncompress, User.class);
        System.out.println("Objects equal: " + user.equals(deUser));
    }

    private static void testBzip2(byte[] origin, User user) {
        System.out.println("---------------bzip2 compression---------------");
        byte[] bzip2Compress = Bzip2Utils.compress(origin);
        System.out.println("bzip2 compressed: " + bzip2Compress.length);
        byte[] bzip2Uncompress = Bzip2Utils.uncompress(bzip2Compress);
        System.out.println("bzip2 decompressed: " + bzip2Uncompress.length);
        User deUser = ProtostuffUtils.deserialize(bzip2Uncompress, User.class);
        System.out.println("Objects equal: " + user.equals(deUser));
    }

    private static void testDeflate(byte[] origin, User user) {
        System.out.println("---------------Deflate compression---------------");
        byte[] deflateCompress = DeflateUtils.compress(origin);
        System.out.println("Deflate compressed: " + deflateCompress.length);
        byte[] deflateUncompress = DeflateUtils.uncompress(deflateCompress);
        System.out.println("Deflate decompressed: " + deflateUncompress.length);
        User deUser = ProtostuffUtils.deserialize(deflateUncompress, User.class);
        System.out.println("Objects equal: " + user.equals(deUser));
    }
}
Original JSON byte count: 5351
Original pb byte count: 3850
---------------GZIP compression---------------
Gzip compressed: 2170
Gzip decompressed: 3850
Objects equal: true
---------------Snappy compression---------------
Snappy compressed: 3396
Snappy decompressed: 3850
Objects equal: true
---------------Lz4 compression---------------
Lz4 compressed: 3358
Lz4 decompressed: 3850
Objects equal: true
---------------bzip2 compression---------------
bzip2 compressed: 2119
bzip2 decompressed: 3850
Objects equal: true
---------------Deflate compression---------------
Deflate compressed: 2167
Deflate decompressed: 3850
Objects equal: true
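To make the ranking easier to read, the sizes reported in the output above can be turned into percentages of the original 3850 bytes. The sketch below only re-uses those reported numbers; the class name CompressionRatioSketch and its methods are hypothetical, and it does not run any compressor itself.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class CompressionRatioSketch {
    // Size of the protostuff-serialized object reported in the output
    static final int ORIGINAL_SIZE = 3850;

    /** Compressed sizes in bytes, copied from the reported output. */
    static Map<String, Integer> reportedSizes() {
        Map<String, Integer> sizes = new LinkedHashMap<>();
        sizes.put("Gzip", 2170);
        sizes.put("snappy", 3396);
        sizes.put("lz4", 3358);
        sizes.put("bzip2", 2119);
        sizes.put("Deflate", 2167);
        return sizes;
    }

    /** Names ordered from smallest compressed size (best ratio) to largest. */
    static List<String> ranking(Map<String, Integer> sizes) {
        List<String> names = new ArrayList<>(sizes.keySet());
        names.sort((a, b) -> Integer.compare(sizes.get(a), sizes.get(b)));
        return names;
    }

    public static void main(String[] args) {
        Map<String, Integer> sizes = reportedSizes();
        for (String name : ranking(sizes)) {
            double percent = 100.0 * sizes.get(name) / ORIGINAL_SIZE;
            System.out.println(String.format(Locale.ROOT,
                    "%-8s %4d bytes  %.1f%% of original", name, sizes.get(name), percent));
        }
    }
}
```

Sorting by compressed size reproduces the order bzip2, Deflate, Gzip, lz4, snappy. On this input the first three are within about 50 bytes of each other, while lz4 and snappy save far less space.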
8 Summary
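The sections above give a general picture of lz4: its compression and decompression speed is very good, while its compression ratio is not outstanding compared with other compression tools and depends on how repetitive the content is.
In any compression scenario, choose a tool that fits. Every compression tool has advantages and drawbacks; playing to its strengths and avoiding its weaknesses makes our work more effective.
The above is my personal experience; I hope it serves as a useful reference.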

