使用java實(shí)現(xiàn)百萬級別數(shù)據(jù)導(dǎo)出excel的三種方式
在業(yè)務(wù)系統(tǒng)中,導(dǎo)出報表的需求會很常見,而隨著時間推移業(yè)務(wù)量不斷增加,數(shù)據(jù)庫的數(shù)據(jù)可能達(dá)到百萬甚至千萬級別。對于導(dǎo)出報表功能,最簡單的做法就是從數(shù)據(jù)庫里面把需要的數(shù)據(jù)一次性加載到內(nèi)存,然后寫入excel文件,再把excel文件返回給用戶。這種做法在數(shù)據(jù)量不大的時候是可行的,但是一旦需要導(dǎo)出幾十萬甚至上百萬的數(shù)據(jù),很可能出現(xiàn)OOM導(dǎo)致服務(wù)崩潰的情況,而且導(dǎo)出所消耗的時間會大大增加。
這里提供一種支持百萬級別數(shù)據(jù)導(dǎo)出的方法,并且消耗很少的內(nèi)存,核心思想就是不要一次性把數(shù)據(jù)加載到內(nèi)存中。
主要是從兩個方面去解決:
1.從數(shù)據(jù)庫加載數(shù)據(jù)不要一次性加載,可以分頁的方式或者用游標(biāo)的方式分批加載數(shù)據(jù),加載一批數(shù)據(jù)處理一批并且釋放內(nèi)存,這樣內(nèi)存占用始終處于一個比較平穩(wěn)的狀態(tài)。分頁的方式加載編碼比較繁瑣,我一般是采用游標(biāo)方式逐行加載。目前常用的持久層框架有JPA,mybaits,hibernate,下面會分別列出JPA,hibernate及mybatis通過游標(biāo)方式加載數(shù)據(jù)。
2.寫入excel也是分批寫入,推薦阿里的EasyExcel,占用內(nèi)存極低。
EasyExcel的pom依賴:
<dependency> <groupId>com.alibaba</groupId> <artifactId>easyexcel</artifactId> <version>2.1.1</version> <optional>true</optional> </dependency>
運(yùn)行環(huán)境
jdk1.8,idea2019,堆內(nèi)存:-Xms256M -Xms256M(導(dǎo)出100萬數(shù)據(jù)毫無壓力),springboot,數(shù)據(jù)庫是mysql
先來張效果圖,這個是最大堆內(nèi)存設(shè)置為256M,兩張表聯(lián)合查詢的情況下導(dǎo)出100萬數(shù)據(jù)的效果,可以看到堆內(nèi)存變化比較平穩(wěn),導(dǎo)出100萬數(shù)據(jù)耗時143秒,這個速度還有優(yōu)化的空間,如果是單表導(dǎo)出的話速度會更快些:

JPA使用游標(biāo)方式導(dǎo)出百萬數(shù)據(jù)(兩種方式,推薦使用QueryDSL)
1.使用jpa原生方式,這種適合sql比較簡單的情況
pom.xml:
<!-- spring web依賴,搭建web項(xiàng)目需要這個依賴-->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<!-- jpa -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-jpa</artifactId>
</dependency>
<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
<optional>true</optional>
</dependency>
repository:
@Repository
public interface UserRepository extends JpaRepository<UserEntity,Integer> {
//@QueryHint(name = HINT_FETCH_SIZE,value = Integer.MIN_VALUE+"") 值設(shè)置為Integer.MIN_VALUE告訴mysql需要逐條返回數(shù)據(jù),并且返回值需要用stream來接收
@QueryHints(@QueryHint(name = HINT_FETCH_SIZE,value = Integer.MIN_VALUE+""))
@Query(value = "select * from user limit 500000",nativeQuery = true)
Stream<UserEntity> findAllList();
}
service:
注意:
需要加事務(wù)注解,并且是只讀事務(wù)
需要及時調(diào)用entityManager的detach方法釋放內(nèi)存,不然還是會出現(xiàn)OOM
@Autowired
private EntityManager entityManager;
@Autowired
private UserRepository userRepository;
Transactional(readOnly = true)
public void exportData3(ScrollResultsHandler<UserExportVO> scrollResultsHandler){
Stream<UserEntity> allList = userRepository.findAllList.forEach((o)->{
UserEntity userEntity = (UserEntity) o;
UserExportVO userExportVO = UserExportVO.builer()
.userName(userEntity.getUsername())
.mobile(userEntity.getMobile())
.build();
scrollResultsHandler.handle(userExportVO);
//對象被session持有,調(diào)用detach方法釋放內(nèi)存
entityManager.detach(userEntity);
});
}
controller:
@RequestMapping("export4")
public void export4(HttpServletResponse response) throws IOException {
response.setContentType("application/vnd.ms-excel");
response.setCharacterEncoding("utf-8");
String filenames="bigdata4";
response.addHeader("Content-Disposition", "filename=" + filenames + ".xlsx");
ExcelWriter excelWriter = EasyExcel.write(response.getOutputStream(), UserExportVO.class).build();
WriteSheet[] writeSheet = new WriteSheet[] {
EasyExcel.writerSheet(0, "sheet").build()
};
userService.exportData(s->{
UserExportVO resultObject = s;
ArrayList arrayList = new ArrayList<UserExportVO>();
arrayList.add(resultObject);
excelWriter.write(arrayList, writeSheet[0]);
});
excelWriter.finish();
}
使用到的相關(guān)的類:
/**
* @author 奔騰的野馬
* @date 2022/04/25 09:12
*/
public interface ScrollResultsHandler<T> {
void handle(T t);
}
import com.alibaba.excel.annotation.ExcelProperty;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;
import java.math.BigDecimal;
import java.time.LocalDateTime;
/**
* @Author: 奔騰的野馬
* @Date: 2021/10/16 16:19
*/
@Data
@Builder
@AllArgsConstructor
@NoArgsConstructor
public class UserExportVO {
@ExcelProperty(value = "用戶名")
private String userName;
@ExcelProperty(value = "手機(jī)號")
private String mobile;
}
2.使用querysql游標(biāo)方式導(dǎo)出,推薦這種方式,可以實(shí)現(xiàn)動態(tài)sql,多表關(guān)聯(lián)甚至是常見的組函數(shù)都可以支持
pom.xml:
<dependencies>
<!-- spring web依賴,搭建web項(xiàng)目需要這個依賴-->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-jpa</artifactId>
</dependency>
<!--QueryDSL支持-->
<dependency>
<groupId>com.querydsl</groupId>
<artifactId>querydsl-apt</artifactId>
<version>5.0.0</version>
<scope>provided</scope>
</dependency>
<!--QueryDSL支持-->
<dependency>
<groupId>com.querydsl</groupId>
<artifactId>querydsl-jpa</artifactId>
<version>5.0.0</version>
</dependency>
<dependency>
<groupId>com.querydsl</groupId>
<artifactId>querydsl-core</artifactId>
<version>5.0.0</version>
</dependency>
</dependencies>
<build>
<plugins>
<!-- QueryDSL 插件 -->
<plugin>
<groupId>com.mysema.maven</groupId>
<artifactId>apt-maven-plugin</artifactId>
<version>1.1.3</version>
<executions>
<execution>
<goals>
<goal>process</goal>
</goals>
<configuration>
<outputDirectory>target/generated-sources/java</outputDirectory>
<processor>com.querydsl.apt.jpa.JPAAnnotationProcessor</processor>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
service:
@Autowired
private JPAQueryFactory jpaQueryFactory;
private QUserEntity qUserEntity = QUserEntity.userEntity;
@Transactional(readOnly = true)
public void exportData2(ScrollResultsHandler<UserExportVO> scrollResultsHandler){
//需要用stream方式接收,這樣才能逐條處理
Stream<UserExportVO> userExportVOStream = jpaQueryFactory.select(Projections.bean(UserExportVO.class
, qUserEntity.userName, qUserEntity.mobile))
.from(qUserEntity)
//.join(xxxEntity)
//.on(xxxx)
//setHint(HINT_FETCH_SIZE,Integer.MIN_VALUE+"") 告訴mysql需要逐條返回數(shù)據(jù),注意值需要設(shè)置為Integer.MIN_VALUE才能生效
.setHint(HINT_FETCH_SIZE,Integer.MIN_VALUE+"")
.limit(1000000)
.stream();
userExportVOStream.forEach(dto->{
scrollResultsHandler.handle(dto);
});
}
controller:
@RequestMapping("export4")
public void export4(HttpServletResponse response) throws IOException {
response.setContentType("application/vnd.ms-excel");
response.setCharacterEncoding("utf-8");
String filenames="bigdata4";
response.addHeader("Content-Disposition", "filename=" + filenames + ".xlsx");
ExcelWriter excelWriter = EasyExcel.write(response.getOutputStream(), UserExportVO.class).build();
WriteSheet[] writeSheet = new WriteSheet[] {
EasyExcel.writerSheet(0, "sheet").build()
};
userService.exportData(s->{
UserExportVO resultObject = s;
ArrayList arrayList = new ArrayList<UserExportVO>();
arrayList.add(resultObject);
excelWriter.write(arrayList, writeSheet[0]);
});
excelWriter.finish();
}
mybatis使用游標(biāo)方式導(dǎo)出百萬數(shù)據(jù)
pom.xml:
<dependency> <groupId>org.mybatis</groupId> <artifactId>mybatis</artifactId> <version>3.5.9</version> </dependency>
dao:
/**
* @author 奔騰的野馬
* @date 2022/04/16 19:14
*/
@Mapper
public interface UserDao {
//ResultSetType.TYPE_FORWORD_ONLY 結(jié)果集的游標(biāo)只能向下滾動,fetchSize需要設(shè)置為Integer.MIN_VALUE游標(biāo)才能生效
@Options(resultSetType = ResultSetType.FORWARD_ONLY,fetchSize = Integer.MIN_VALUE)
@ResultType(UserExportVO.class)
@Select("select userName,mobile from user limit 500000")
void reportAll2(ResultHandler<UserExportVO> handler);
}
service:
@Transactional(readOnly = true)
public void export2(ResultHandler<UserExportVO> handler){
userDao.reportAll2(handler);
}
controller:
同上
hibernate使用游標(biāo)方式導(dǎo)出百萬數(shù)據(jù)
service:
@Autowired
private EntityManager entityManager;
public void exportData(ScrollResultsHandler<UserExportVO> scrollResultsHandler){
//當(dāng)不需要緩存時,最好使用StatelessSession
StatelessSession session = ((Session) entityManager.getDelegate()).getSessionFactory().openStatelessSession();
Query query = session.getNamedQuery("getAllList");
query.setCacheMode(CacheMode.IGNORE);
//setFetchSize(Integer.MIN_VALUE)告訴mysql逐條返回數(shù)據(jù)
query.setFetchSize(Integer.MIN_VALUE);
query.setFirstResult(0);
query.setMaxResults(1000000);
query.setReadOnly(true);
query.setLockMode("a", LockMode.NONE);
//ScrollMode.TYPE_FORWORD_ONLY 結(jié)果集的游標(biāo)只能向下滾動
ScrollableResults results = query.scroll(ScrollMode.FORWARD_ONLY);
while (results.next()) {
UserEntity userEntity = (UserEntity) results.get(0);
UserExportVO userExportVO = UserExportVO.builer()
.userName(userEntity.getUsername())
.mobile(userEntity.getMobile())
.build();
scrollResultsHandler.handle(userExportVO);
}
results.close();
session.close();
}
controller:
同上
流式導(dǎo)出遇到的問題(基于jpa和querydsl)及解決方法
1.導(dǎo)出過程中堆內(nèi)存占用急劇上升,發(fā)生OOM
1.1 項(xiàng)目中使用了log4jdbc-log4j2-jdbc4.1(版本是1.16),驅(qū)動為net.sf.log4jdbc.sql.jdbcapi.DriverSpy,改成mysql的原生驅(qū)動就好了。"log4jdbc-log4j2-jdbc4.1"本來是用來開發(fā)過程中方便打印sql的,結(jié)果卻帶來了OOM問題,看來使用第三方j(luò)ar包一定要慎重啊。
1.2 項(xiàng)目的存在多個版本的querydsl,jar包沖突,解決jar包就正常了
1.3 二次查詢時,hibernate的一級緩存沒有及時釋放,進(jìn)一步分析,發(fā)現(xiàn)大量的對象都被緩存在(org.hibernate.engine.StatefulPersistenceContext)中,導(dǎo)致一級緩存泄漏
解決方法:
由于 Hibernate 的一級緩存是其內(nèi)部使用的,無法關(guān)閉或停用(隨著Session 銷毀)。從Hibernate 的手冊或文檔中可知,Hibernate 的一級緩存的清除可通過以下方式:
1)對于單個對象的清除:
Session session=sessionFactory.getCurrentSession(); session.evict(entity);
2)對于實(shí)體集合的清除:
Session session=sessionFactory.getCurrentSession(); session.clear();建議在程序中加入對 Hibernate 一級緩存的清除工作,以便可以其內(nèi)存數(shù)據(jù)可以及時釋放。
2.導(dǎo)出過程中遍歷stream需要二次查詢數(shù)據(jù)庫
導(dǎo)出過程中遍歷stream需要二次查詢數(shù)據(jù)庫時提示"Streaming result set com.mysql.cj.protocol.a.result.ResultsetRowsStreaming@5800daf5 is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries"
錯誤詳細(xì)內(nèi)容:
java.sql.SQLException: Streaming result set com.mysql.cj.protocol.a.result.ResultsetRowsStreaming@3b8732ec is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries. at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:129) at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:97) at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122) at com.mysql.cj.jdbc.ClientPreparedStatement.executeInternal(ClientPreparedStatement.java:953) at com.mysql.cj.jdbc.ClientPreparedStatement.executeQuery(ClientPreparedStatement.java:1003) at net.sf.log4jdbc.sql.jdbcapi.PreparedStatementSpy.executeQuery(PreparedStatementSpy.java:780) at com.zaxxer.hikari.pool.ProxyPreparedStatement.executeQuery(ProxyPreparedStatement.java:52) at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.executeQuery(HikariProxyPreparedStatement.java) at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:57) at org.hibernate.loader.Loader.getResultSet(Loader.java:2292) at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:2050) at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:2012) at org.hibernate.loader.Loader.doQuery(Loader.java:953) at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:354) at org.hibernate.loader.Loader.doList(Loader.java:2838) at org.hibernate.loader.Loader.doList(Loader.java:2820) at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2652) at org.hibernate.loader.Loader.list(Loader.java:2647) at org.hibernate.loader.hql.QueryLoader.list(QueryLoader.java:506) at org.hibernate.hql.internal.ast.QueryTranslatorImpl.list(QueryTranslatorImpl.java:396) at org.hibernate.engine.query.spi.HQLQueryPlan.performList(HQLQueryPlan.java:219) at org.hibernate.internal.SessionImpl.list(SessionImpl.java:1404) at org.hibernate.query.internal.AbstractProducedQuery.doList(AbstractProducedQuery.java:1562) at org.hibernate.query.internal.AbstractProducedQuery.list(AbstractProducedQuery.java:1530) at org.hibernate.query.internal.AbstractProducedQuery.getSingleResult(AbstractProducedQuery.java:1578) at org.hibernate.query.criteria.internal.compile.CriteriaQueryTypeQueryAdapter.getSingleResult(CriteriaQueryTypeQueryAdapter.java:111) at org.springframework.data.jpa.repository.query.JpaQueryExecution$SingleEntityExecution.doExecute(JpaQueryExecution.java:196) at org.springframework.data.jpa.repository.query.JpaQueryExecution.execute(JpaQueryExecution.java:88) at org.springframework.data.jpa.repository.query.AbstractJpaQuery.doExecute(AbstractJpaQuery.java:154) at org.springframework.data.jpa.repository.query.AbstractJpaQuery.execute(AbstractJpaQuery.java:142) at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.doInvoke(RepositoryFactorySupport.java:618) at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.invoke(RepositoryFactorySupport.java:605) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) at org.springframework.data.projection.DefaultMethodInvokingMethodInterceptor.invoke(DefaultMethodInvokingMethodInterceptor.java:80) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:366) at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:99) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:139) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) at org.springframework.data.jpa.repository.support.CrudMethodMetadataPostProcessor$CrudMethodMetadataPopulatingMethodInterceptor.invoke(CrudMethodMetadataPostProcessor.java:149) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:95) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212) at com.sun.proxy.$Proxy223.countAllByDeliveryNo(Unknown Source)
查閱資料后發(fā)現(xiàn),是mysql不支持在流式查詢過程中使用同一連接再次查詢數(shù)據(jù)庫

解決方法:
方法1.使用異步方法查詢,這樣就可以規(guī)避同一個連接二次查詢的問題
方法2.需要二次查詢時開啟一個新的事務(wù)去查詢就可以,spring中可以使用事務(wù)注解開啟新的事務(wù)就搞定了,注解如下:
@Transactional(propagation = Propagation.REQUIRES_NEW,readOnly = true)
到此這篇關(guān)于使用java實(shí)現(xiàn)百萬級別數(shù)據(jù)導(dǎo)出excel的三種方式的文章就介紹到這了,更多相關(guān)java百萬級別數(shù)據(jù)導(dǎo)出excel內(nèi)容請搜索腳本之家以前的文章或繼續(xù)瀏覽下面的相關(guān)文章希望大家以后多多支持腳本之家!
相關(guān)文章
Java數(shù)據(jù)結(jié)構(gòu)之LinkedList的用法詳解
鏈表(Linked?list)是一種常見的基礎(chǔ)數(shù)據(jù)結(jié)構(gòu),是一種線性表。Java的LinkedList(鏈表)?類似于?ArrayList,是一種常用的數(shù)據(jù)容器,本文就來簡單講講它的使用吧2023-05-05
解決mybatisplus插入報錯argument type mismatch的問題
這篇文章主要介紹了解決mybatisplus插入報錯argument type mismatch的問題,具有很好的參考價值,希望對大家有所幫助。一起跟隨小編過來看看吧2020-11-11
struts2.3.24+spring4.1.6+hibernate4.3.11+mysql5.5.25開發(fā)環(huán)境搭建圖文
這篇文章主要介紹了struts2.3.24+spring4.1.6+hibernate4.3.11+mysql5.5.25開發(fā)環(huán)境搭建圖文教程,感興趣的小伙伴們可以參考一下2016-06-06
spring?boot?executable?jar/war?原理解析
spring boot里其實(shí)不僅可以直接以 java -jar demo.jar的方式啟動,還可以把jar/war變?yōu)橐粋€可以執(zhí)行的腳本來啟動,比如./demo.jar,這篇文章主要介紹了spring?boot?executable?jar/war?原理,需要的朋友可以參考下2023-02-02
SpringSecurityOAuth2實(shí)現(xiàn)微信授權(quán)登錄
微信的登錄功能是用戶注冊和使用微信的必經(jīng)之路之一,而微信授權(quán)登錄更是方便了用戶的登錄操作,本文主要介紹了SpringSecurityOAuth2實(shí)現(xiàn)微信授權(quán)登錄,感興趣的可以了解一下2023-09-09
MyBatis標(biāo)簽之Select?resultType和resultMap詳解
這篇文章主要介紹了MyBatis標(biāo)簽之Select?resultType和resultMap,在MyBatis中有一個ResultMap標(biāo)簽,它是為了映射select標(biāo)簽查詢出來的結(jié)果集,下面使用一個簡單的例子,來介紹 resultMap 的使用方法,需要的朋友可以參考下2022-09-09
springboot配置templates直接訪問的實(shí)現(xiàn)
這篇文章主要介紹了springboot配置templates直接訪問的實(shí)現(xiàn)方式,具有很好的參考價值,希望對大家有所幫助。如有錯誤或未考慮完全的地方,望不吝賜教2021-12-12
教你如何使用google.zxing結(jié)合springboot生成二維碼功能
這篇文章主要介紹了使用google.zxing結(jié)合springboot生成二維碼功能,我們使用兩種方式,去生成二維碼,但是其實(shí),二維碼的生成基礎(chǔ),都是zxing包,這是Google開源的一個包,本文通過實(shí)例代碼給大家介紹的非常詳細(xì),需要的朋友可以參考下2022-05-05

