MySQL之批量插入的4種方案總結(jié)

更新時(shí)間：2023年05月19日 14:06:59 作者：莫輕言舞

這篇文章主要介紹了MySQL之批量插入的4種方案總結(jié)，具有很好的參考價(jià)值，希望對大家有所幫助。如有錯(cuò)誤或未考慮完全的地方，望不吝賜教

一、前言

最近趁空閑之余，在對MySQL數(shù)據(jù)庫進(jìn)行插入數(shù)據(jù)測試，對于如何快速插入數(shù)據(jù)的操作無從下手，在僅1W數(shù)據(jù)量的情況下，竟花費(fèi)接近47s，實(shí)在不忍直視！在不斷摸索之后，整理出一些較實(shí)用的方案。

二、準(zhǔn)備工作

測試環(huán)境：SpringBoot項(xiàng)目、MyBatis-Plus框架、MySQL8.0.24、JDK13

前提：SpringBoot項(xiàng)目集成MyBatis-Plus上述文章有配置過程，同時(shí)實(shí)現(xiàn)IService接口用于進(jìn)行批量插入數(shù)據(jù)操作saveBatch()方法

1、Maven項(xiàng)目中pom.xml文件引入的相關(guān)依賴如下

<dependencies>
  <!-- SpringBoot Web模塊依賴 -->
  <dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
  </dependency>
  <!-- MyBatis-Plus 依賴 -->
  <dependency>
    <groupId>com.baomidou</groupId>
    <artifactId>mybatis-plus-boot-starter</artifactId>
    <version>3.3.1</version>
  </dependency>
  <!-- 數(shù)據(jù)庫連接驅(qū)動(dòng) -->
  <dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
  </dependency>
  <!-- 使用注解，簡化代碼-->
  <dependency>
    <groupId>org.projectlombok</groupId>
    <artifactId>lombok</artifactId>
  </dependency>
</dependencies>

2、application.yml配置屬性文件內(nèi)容（重點(diǎn)：開啟批處理模式）

server:
    端口號(hào) 
    port: 8080
#  MySQL連接配置信息（以下僅簡單配置，更多設(shè)置可自行查看）
spring:
    datasource:
         連接地址（解決UTF-8中文亂碼問題 + 時(shí)區(qū)校正）
                （rewriteBatchedStatements=true 開啟批處理模式）
        url: jdbc:mysql://127.0.0.1:3306/bjpowernode?useUnicode=true&characterEncoding=UTF-8&serverTimezone=Asia/Shanghai&rewriteBatchedStatements=true
         用戶名
        username: root
         密碼
        password: xxx
         連接驅(qū)動(dòng)名稱
        driver-class-name: com.mysql.cj.jdbc.Driver

3、Entity實(shí)體類（測試）

/**
 *   Student 測試實(shí)體類
 *   
 *   @Data注解：引入Lombok依賴，可省略Setter、Getter方法
 */
@Data
@TableName(value = "student")
public class Student {
    /**  主鍵  type:自增 */
    @TableId(type = IdType.AUTO)
    private int id;
    /**  名字 */
    private String name;
    /**  年齡 */
    private int age;
    /**  地址 */
    private String addr;
    /**  地址號(hào)  @TableField：與表字段映射 */
    @TableField(value = "addr_num")
    private String addrNum;
    public Student(String name, int age, String addr, String addrNum) {
        this.name = name;
        this.age = age;
        this.addr = addr;
        this.addrNum = addrNum;
    }
}

4、數(shù)據(jù)庫student表結(jié)構(gòu)（注意：無索引）

三、測試工作

簡明：完成準(zhǔn)備工作后，即對for循環(huán)、拼接SQL語句、批量插入saveBatch()、循環(huán)插入+開啟批處理模式，該4種插入數(shù)據(jù)的方式進(jìn)行測試性能。

注意：測試數(shù)據(jù)量為5W、單次測試完清空數(shù)據(jù)表（確保不受舊數(shù)據(jù)影響）

（以下測試內(nèi)容可能受測試配置環(huán)境、測試規(guī)范和數(shù)據(jù)量等諸多因素影響，讀者可自行結(jié)合參考進(jìn)行測試）

1、for循環(huán)插入（單條）（總耗時(shí)：177秒）

總結(jié)：測試平均時(shí)間約是177秒，實(shí)在是不忍直視（捂臉），因?yàn)槔胒or循環(huán)進(jìn)行單條插入時(shí)，每次都是在獲取連接(Connection)、釋放連接和資源關(guān)閉等操作上，（如果數(shù)據(jù)量大的情況下）極其消耗資源，導(dǎo)致時(shí)間長。

@GetMapping("/for")
public void forSingle(){
    // 開始時(shí)間
    long startTime = System.currentTimeMillis();
    for (int i = 0; i < 50000; i++){
        Student student = new Student("李毅" + i,24,"張家界市" + i,i + "號(hào)");
        studentMapper.insert(student);
    }
    // 結(jié)束時(shí)間
    long endTime = System.currentTimeMillis();
    System.out.println("插入數(shù)據(jù)消耗時(shí)間：" + (endTime - startTime));
}

（1）第一次測試結(jié)果：190155 約等于 190秒

（2）第二次測試結(jié)果：175926 約等于 176秒（服務(wù)未重啟）

（3）第三次測試結(jié)果：174726 約等于 174秒（服務(wù)重啟）

2、拼接SQL語句（總耗時(shí)：2.9秒）

簡明：拼接格式：insert into student(xxxx) value(xxxx),(xxxx),(xxxxx).......

總結(jié)：拼接結(jié)果就是將所有的數(shù)據(jù)集成在一條SQL語句的value值上，其由于提交到服務(wù)器上的insert語句少了，網(wǎng)絡(luò)負(fù)載少了，性能也就提上去。但是當(dāng)數(shù)據(jù)量上去后，可能會(huì)出現(xiàn)內(nèi)存溢出、解析SQL語句耗時(shí)等情況，但與第一點(diǎn)相比，提高了極大的性能。

@GetMapping("/sql")
public void sql(){
    ArrayList<Student> arrayList = new ArrayList<>();
    long startTime = System.currentTimeMillis();
    for (int i = 0; i < 50000; i++){
        Student student = new Student("李毅" + i,24,"張家界市" + i,i + "號(hào)");
        arrayList.add(student);
    }
    studentMapper.insertSplice(arrayList);
    long endTime = System.currentTimeMillis();
    System.out.println("插入數(shù)據(jù)消耗時(shí)間：" + (endTime - startTime));
}

// 使用@Insert注解插入：此處為簡便，不寫Mapper.xml文件
@Insert("<script>" +
        "insert into student (name,age,addr,addr_num) values " +
        "<foreach collection='studentList' item='item' separator=','> " +
        "(#{item.name},{item.age},{item.addr},{item.addrNum}) " +
        "</foreach> " +
        "</script>")
int insertSplice(@Param("studentList") List<Student> studentList);

（1）第一次測試結(jié)果：3218 約等于 3.2秒

（2）第二次測試結(jié)果：2592 約等于 2.6秒（服務(wù)未重啟）

（3）第三次測試結(jié)果：3082 約等于 3.1秒（服務(wù)重啟）

3、批量插入saveBatch（總耗時(shí)：2.7秒）

簡明：使用MyBatis-Plus實(shí)現(xiàn)IService接口中批處理saveBatch()方法，對底層源碼進(jìn)行查看時(shí)，可發(fā)現(xiàn)其實(shí)是for循環(huán)插入，但是與第一點(diǎn)相比，為什么性能上提高了呢？因?yàn)槔梅制幚恚╞atchSize = 1000） + 分批提交事務(wù)的操作，從而提高性能，并非在Connection上消耗性能。

@GetMapping("/saveBatch1")
public void saveBatch1(){
    ArrayList<Student> arrayList = new ArrayList<>();
    long startTime = System.currentTimeMillis();
    // 模擬數(shù)據(jù)
    for (int i = 0; i < 50000; i++){
        Student student = new Student("李毅" + i,24,"張家界市" + i,i + "號(hào)");
        arrayList.add(student);
    }
    // 批量插入
    studentService.saveBatch(arrayList);
    long endTime = System.currentTimeMillis();
    System.out.println("插入數(shù)據(jù)消耗時(shí)間：" + (endTime - startTime));
}

（1）第一次測試結(jié)果：2864 約等于 2.9秒

（2）第二次測試結(jié)果：2302 約等于 2.3秒（服務(wù)未重啟）

（3）第三次測試結(jié)果：2893 約等于 2.9秒（服務(wù)重啟）

重點(diǎn)注意：MySQL JDBC驅(qū)動(dòng)默認(rèn)情況下忽略saveBatch()方法中的executeBatch()語句，將需要批量處理的一組SQL語句進(jìn)行拆散，執(zhí)行時(shí)一條一條給MySQL數(shù)據(jù)庫，造成實(shí)際上是分片插入，即與單條插入方式相比，有提高，但是性能未能得到實(shí)質(zhì)性的提高。

測試：數(shù)據(jù)庫連接URL地址缺少 rewriteBatchedStatements = true 參數(shù)情況

#  MySQL連接配置信息
spring:
    datasource:
         連接地址（未開啟批處理模式）
        url: jdbc:mysql://127.0.0.1:3306/bjpowernode?useUnicode=true&characterEncoding=UTF-8&serverTimezone=Asia/Shanghai
         用戶名
        username: root
         密碼
        password: xxx
         連接驅(qū)動(dòng)名稱
        driver-class-name: com.mysql.cj.jdbc.Driver

測試結(jié)果：10541 約等于 10.5秒（未開啟批處理模式）

4、循環(huán)插入 + 開啟批處理模式（總耗時(shí)：1.7秒）（重點(diǎn)：一次性提交）

簡明：開啟批處理，關(guān)閉自動(dòng)提交事務(wù)，共用同一個(gè)SqlSession之后，for循環(huán)單條插入的性能得到實(shí)質(zhì)性的提高；由于同一個(gè)SqlSession省去對資源相關(guān)操作的耗能、減少對事務(wù)處理的時(shí)間等，從而極大程度上提高執(zhí)行效率。（目前個(gè)人覺得最優(yōu)方案）

@GetMapping("/forSaveBatch")
public void forSaveBatch(){
    //  開啟批量處理模式 BATCH 、關(guān)閉自動(dòng)提交事務(wù) false
    SqlSession sqlSession = sqlSessionFactory.openSession(ExecutorType.BATCH,false);
    //  反射獲取，獲取Mapper
    StudentMapper studentMapper = sqlSession.getMapper(StudentMapper.class);
    long startTime = System.currentTimeMillis();
    for (int i = 0 ; i < 50000 ; i++){
        Student student = new Student("李毅" + i,24,"張家界市" + i,i + "號(hào)");
        studentMapper.insertStudent(student);
    }
    // 一次性提交事務(wù)
    sqlSession.commit();
    // 關(guān)閉資源
    sqlSession.close();
    long endTime = System.currentTimeMillis();
    System.out.println("總耗時(shí)： " + (endTime - startTime));
}

（1）第一次測試結(jié)果：1831 約等于 1.8秒