使用java實(shí)現(xiàn)百萬(wàn)級(jí)別數(shù)據(jù)導(dǎo)出excel的三種方式
在業(yè)務(wù)系統(tǒng)中,導(dǎo)出報(bào)表的需求會(huì)很常見(jiàn),而隨著時(shí)間推移業(yè)務(wù)量不斷增加,數(shù)據(jù)庫(kù)的數(shù)據(jù)可能達(dá)到百萬(wàn)甚至千萬(wàn)級(jí)別。對(duì)于導(dǎo)出報(bào)表功能,最簡(jiǎn)單的做法就是從數(shù)據(jù)庫(kù)里面把需要的數(shù)據(jù)一次性加載到內(nèi)存,然后寫(xiě)入excel文件,再把excel文件返回給用戶(hù)。這種做法在數(shù)據(jù)量不大的時(shí)候是可行的,但是一旦需要導(dǎo)出幾十萬(wàn)甚至上百萬(wàn)的數(shù)據(jù),很可能出現(xiàn)OOM導(dǎo)致服務(wù)崩潰的情況,而且導(dǎo)出所消耗的時(shí)間會(huì)大大增加。
這里提供一種支持百萬(wàn)級(jí)別數(shù)據(jù)導(dǎo)出的方法,并且消耗很少的內(nèi)存,核心思想就是不要一次性把數(shù)據(jù)加載到內(nèi)存中。
主要是從兩個(gè)方面去解決:
1.從數(shù)據(jù)庫(kù)加載數(shù)據(jù)不要一次性加載,可以分頁(yè)的方式或者用游標(biāo)的方式分批加載數(shù)據(jù),加載一批數(shù)據(jù)處理一批并且釋放內(nèi)存,這樣內(nèi)存占用始終處于一個(gè)比較平穩(wěn)的狀態(tài)。分頁(yè)的方式加載編碼比較繁瑣,我一般是采用游標(biāo)方式逐行加載。目前常用的持久層框架有JPA,mybaits,hibernate,下面會(huì)分別列出JPA,hibernate及mybatis通過(guò)游標(biāo)方式加載數(shù)據(jù)。
2.寫(xiě)入excel也是分批寫(xiě)入,推薦阿里的EasyExcel,占用內(nèi)存極低。
EasyExcel的pom依賴(lài):
<dependency> <groupId>com.alibaba</groupId> <artifactId>easyexcel</artifactId> <version>2.1.1</version> <optional>true</optional> </dependency>
運(yùn)行環(huán)境
jdk1.8,idea2019,堆內(nèi)存:-Xms256M -Xms256M(導(dǎo)出100萬(wàn)數(shù)據(jù)毫無(wú)壓力),springboot,數(shù)據(jù)庫(kù)是mysql
先來(lái)張效果圖,這個(gè)是最大堆內(nèi)存設(shè)置為256M,兩張表聯(lián)合查詢(xún)的情況下導(dǎo)出100萬(wàn)數(shù)據(jù)的效果,可以看到堆內(nèi)存變化比較平穩(wěn),導(dǎo)出100萬(wàn)數(shù)據(jù)耗時(shí)143秒,這個(gè)速度還有優(yōu)化的空間,如果是單表導(dǎo)出的話(huà)速度會(huì)更快些:
JPA使用游標(biāo)方式導(dǎo)出百萬(wàn)數(shù)據(jù)(兩種方式,推薦使用QueryDSL)
1.使用jpa原生方式,這種適合sql比較簡(jiǎn)單的情況
pom.xml:
<!-- spring web依賴(lài),搭建web項(xiàng)目需要這個(gè)依賴(lài)--> <dependency> <groupId>org.springframework.boot</groupId> <artifactId>spring-boot-starter-web</artifactId> </dependency> <!-- jpa --> <dependency> <groupId>org.springframework.boot</groupId> <artifactId>spring-boot-starter-data-jpa</artifactId> </dependency> <dependency> <groupId>org.projectlombok</groupId> <artifactId>lombok</artifactId> <optional>true</optional> </dependency>
repository:
@Repository public interface UserRepository extends JpaRepository<UserEntity,Integer> { //@QueryHint(name = HINT_FETCH_SIZE,value = Integer.MIN_VALUE+"") 值設(shè)置為Integer.MIN_VALUE告訴mysql需要逐條返回?cái)?shù)據(jù),并且返回值需要用stream來(lái)接收 @QueryHints(@QueryHint(name = HINT_FETCH_SIZE,value = Integer.MIN_VALUE+"")) @Query(value = "select * from user limit 500000",nativeQuery = true) Stream<UserEntity> findAllList(); }
service:
注意:
需要加事務(wù)注解,并且是只讀事務(wù)
需要及時(shí)調(diào)用entityManager的detach方法釋放內(nèi)存,不然還是會(huì)出現(xiàn)OOM
@Autowired private EntityManager entityManager; @Autowired private UserRepository userRepository; Transactional(readOnly = true) public void exportData3(ScrollResultsHandler<UserExportVO> scrollResultsHandler){ Stream<UserEntity> allList = userRepository.findAllList.forEach((o)->{ UserEntity userEntity = (UserEntity) o; UserExportVO userExportVO = UserExportVO.builer() .userName(userEntity.getUsername()) .mobile(userEntity.getMobile()) .build(); scrollResultsHandler.handle(userExportVO); //對(duì)象被session持有,調(diào)用detach方法釋放內(nèi)存 entityManager.detach(userEntity); }); }
controller:
@RequestMapping("export4") public void export4(HttpServletResponse response) throws IOException { response.setContentType("application/vnd.ms-excel"); response.setCharacterEncoding("utf-8"); String filenames="bigdata4"; response.addHeader("Content-Disposition", "filename=" + filenames + ".xlsx"); ExcelWriter excelWriter = EasyExcel.write(response.getOutputStream(), UserExportVO.class).build(); WriteSheet[] writeSheet = new WriteSheet[] { EasyExcel.writerSheet(0, "sheet").build() }; userService.exportData(s->{ UserExportVO resultObject = s; ArrayList arrayList = new ArrayList<UserExportVO>(); arrayList.add(resultObject); excelWriter.write(arrayList, writeSheet[0]); }); excelWriter.finish(); }
使用到的相關(guān)的類(lèi):
/** * @author 奔騰的野馬 * @date 2022/04/25 09:12 */ public interface ScrollResultsHandler<T> { void handle(T t); }
import com.alibaba.excel.annotation.ExcelProperty; import lombok.AllArgsConstructor; import lombok.Builder; import lombok.Data; import lombok.NoArgsConstructor; import java.math.BigDecimal; import java.time.LocalDateTime; /** * @Author: 奔騰的野馬 * @Date: 2021/10/16 16:19 */ @Data @Builder @AllArgsConstructor @NoArgsConstructor public class UserExportVO { @ExcelProperty(value = "用戶(hù)名") private String userName; @ExcelProperty(value = "手機(jī)號(hào)") private String mobile; }
2.使用querysql游標(biāo)方式導(dǎo)出,推薦這種方式,可以實(shí)現(xiàn)動(dòng)態(tài)sql,多表關(guān)聯(lián)甚至是常見(jiàn)的組函數(shù)都可以支持
pom.xml:
<dependencies> <!-- spring web依賴(lài),搭建web項(xiàng)目需要這個(gè)依賴(lài)--> <dependency> <groupId>org.springframework.boot</groupId> <artifactId>spring-boot-starter-web</artifactId> </dependency> <dependency> <groupId>org.springframework.boot</groupId> <artifactId>spring-boot-starter-data-jpa</artifactId> </dependency> <!--QueryDSL支持--> <dependency> <groupId>com.querydsl</groupId> <artifactId>querydsl-apt</artifactId> <version>5.0.0</version> <scope>provided</scope> </dependency> <!--QueryDSL支持--> <dependency> <groupId>com.querydsl</groupId> <artifactId>querydsl-jpa</artifactId> <version>5.0.0</version> </dependency> <dependency> <groupId>com.querydsl</groupId> <artifactId>querydsl-core</artifactId> <version>5.0.0</version> </dependency> </dependencies> <build> <plugins> <!-- QueryDSL 插件 --> <plugin> <groupId>com.mysema.maven</groupId> <artifactId>apt-maven-plugin</artifactId> <version>1.1.3</version> <executions> <execution> <goals> <goal>process</goal> </goals> <configuration> <outputDirectory>target/generated-sources/java</outputDirectory> <processor>com.querydsl.apt.jpa.JPAAnnotationProcessor</processor> </configuration> </execution> </executions> </plugin> </plugins> </build>
service:
@Autowired private JPAQueryFactory jpaQueryFactory; private QUserEntity qUserEntity = QUserEntity.userEntity; @Transactional(readOnly = true) public void exportData2(ScrollResultsHandler<UserExportVO> scrollResultsHandler){ //需要用stream方式接收,這樣才能逐條處理 Stream<UserExportVO> userExportVOStream = jpaQueryFactory.select(Projections.bean(UserExportVO.class , qUserEntity.userName, qUserEntity.mobile)) .from(qUserEntity) //.join(xxxEntity) //.on(xxxx) //setHint(HINT_FETCH_SIZE,Integer.MIN_VALUE+"") 告訴mysql需要逐條返回?cái)?shù)據(jù),注意值需要設(shè)置為Integer.MIN_VALUE才能生效 .setHint(HINT_FETCH_SIZE,Integer.MIN_VALUE+"") .limit(1000000) .stream(); userExportVOStream.forEach(dto->{ scrollResultsHandler.handle(dto); }); }
controller:
@RequestMapping("export4") public void export4(HttpServletResponse response) throws IOException { response.setContentType("application/vnd.ms-excel"); response.setCharacterEncoding("utf-8"); String filenames="bigdata4"; response.addHeader("Content-Disposition", "filename=" + filenames + ".xlsx"); ExcelWriter excelWriter = EasyExcel.write(response.getOutputStream(), UserExportVO.class).build(); WriteSheet[] writeSheet = new WriteSheet[] { EasyExcel.writerSheet(0, "sheet").build() }; userService.exportData(s->{ UserExportVO resultObject = s; ArrayList arrayList = new ArrayList<UserExportVO>(); arrayList.add(resultObject); excelWriter.write(arrayList, writeSheet[0]); }); excelWriter.finish(); }
mybatis使用游標(biāo)方式導(dǎo)出百萬(wàn)數(shù)據(jù)
pom.xml:
<dependency> <groupId>org.mybatis</groupId> <artifactId>mybatis</artifactId> <version>3.5.9</version> </dependency>
dao:
/** * @author 奔騰的野馬 * @date 2022/04/16 19:14 */ @Mapper public interface UserDao { //ResultSetType.TYPE_FORWORD_ONLY 結(jié)果集的游標(biāo)只能向下滾動(dòng),fetchSize需要設(shè)置為Integer.MIN_VALUE游標(biāo)才能生效 @Options(resultSetType = ResultSetType.FORWARD_ONLY,fetchSize = Integer.MIN_VALUE) @ResultType(UserExportVO.class) @Select("select userName,mobile from user limit 500000") void reportAll2(ResultHandler<UserExportVO> handler); }
service:
@Transactional(readOnly = true) public void export2(ResultHandler<UserExportVO> handler){ userDao.reportAll2(handler); }
controller:
同上
hibernate使用游標(biāo)方式導(dǎo)出百萬(wàn)數(shù)據(jù)
service:
@Autowired private EntityManager entityManager; public void exportData(ScrollResultsHandler<UserExportVO> scrollResultsHandler){ //當(dāng)不需要緩存時(shí),最好使用StatelessSession StatelessSession session = ((Session) entityManager.getDelegate()).getSessionFactory().openStatelessSession(); Query query = session.getNamedQuery("getAllList"); query.setCacheMode(CacheMode.IGNORE); //setFetchSize(Integer.MIN_VALUE)告訴mysql逐條返回?cái)?shù)據(jù) query.setFetchSize(Integer.MIN_VALUE); query.setFirstResult(0); query.setMaxResults(1000000); query.setReadOnly(true); query.setLockMode("a", LockMode.NONE); //ScrollMode.TYPE_FORWORD_ONLY 結(jié)果集的游標(biāo)只能向下滾動(dòng) ScrollableResults results = query.scroll(ScrollMode.FORWARD_ONLY); while (results.next()) { UserEntity userEntity = (UserEntity) results.get(0); UserExportVO userExportVO = UserExportVO.builer() .userName(userEntity.getUsername()) .mobile(userEntity.getMobile()) .build(); scrollResultsHandler.handle(userExportVO); } results.close(); session.close(); }
controller:
同上
流式導(dǎo)出遇到的問(wèn)題(基于jpa和querydsl)及解決方法
1.導(dǎo)出過(guò)程中堆內(nèi)存占用急劇上升,發(fā)生OOM
1.1 項(xiàng)目中使用了log4jdbc-log4j2-jdbc4.1(版本是1.16),驅(qū)動(dòng)為net.sf.log4jdbc.sql.jdbcapi.DriverSpy,改成mysql的原生驅(qū)動(dòng)就好了。"log4jdbc-log4j2-jdbc4.1"本來(lái)是用來(lái)開(kāi)發(fā)過(guò)程中方便打印sql的,結(jié)果卻帶來(lái)了OOM問(wèn)題,看來(lái)使用第三方j(luò)ar包一定要慎重啊。
1.2 項(xiàng)目的存在多個(gè)版本的querydsl,jar包沖突,解決jar包就正常了
1.3 二次查詢(xún)時(shí),hibernate的一級(jí)緩存沒(méi)有及時(shí)釋放,進(jìn)一步分析,發(fā)現(xiàn)大量的對(duì)象都被緩存在(org.hibernate.engine.StatefulPersistenceContext)中,導(dǎo)致一級(jí)緩存泄漏
解決方法:
由于 Hibernate 的一級(jí)緩存是其內(nèi)部使用的,無(wú)法關(guān)閉或停用(隨著Session 銷(xiāo)毀)。從Hibernate 的手冊(cè)或文檔中可知,Hibernate 的一級(jí)緩存的清除可通過(guò)以下方式:
1)對(duì)于單個(gè)對(duì)象的清除:
Session session=sessionFactory.getCurrentSession(); session.evict(entity);
2)對(duì)于實(shí)體集合的清除:
Session session=sessionFactory.getCurrentSession(); session.clear();建議在程序中加入對(duì) Hibernate 一級(jí)緩存的清除工作,以便可以其內(nèi)存數(shù)據(jù)可以及時(shí)釋放。
2.導(dǎo)出過(guò)程中遍歷stream需要二次查詢(xún)數(shù)據(jù)庫(kù)
導(dǎo)出過(guò)程中遍歷stream需要二次查詢(xún)數(shù)據(jù)庫(kù)時(shí)提示"Streaming result set com.mysql.cj.protocol.a.result.ResultsetRowsStreaming@5800daf5 is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries"
錯(cuò)誤詳細(xì)內(nèi)容:
java.sql.SQLException: Streaming result set com.mysql.cj.protocol.a.result.ResultsetRowsStreaming@3b8732ec is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries. at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:129) at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:97) at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122) at com.mysql.cj.jdbc.ClientPreparedStatement.executeInternal(ClientPreparedStatement.java:953) at com.mysql.cj.jdbc.ClientPreparedStatement.executeQuery(ClientPreparedStatement.java:1003) at net.sf.log4jdbc.sql.jdbcapi.PreparedStatementSpy.executeQuery(PreparedStatementSpy.java:780) at com.zaxxer.hikari.pool.ProxyPreparedStatement.executeQuery(ProxyPreparedStatement.java:52) at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.executeQuery(HikariProxyPreparedStatement.java) at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:57) at org.hibernate.loader.Loader.getResultSet(Loader.java:2292) at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:2050) at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:2012) at org.hibernate.loader.Loader.doQuery(Loader.java:953) at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:354) at org.hibernate.loader.Loader.doList(Loader.java:2838) at org.hibernate.loader.Loader.doList(Loader.java:2820) at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2652) at org.hibernate.loader.Loader.list(Loader.java:2647) at org.hibernate.loader.hql.QueryLoader.list(QueryLoader.java:506) at org.hibernate.hql.internal.ast.QueryTranslatorImpl.list(QueryTranslatorImpl.java:396) at org.hibernate.engine.query.spi.HQLQueryPlan.performList(HQLQueryPlan.java:219) at org.hibernate.internal.SessionImpl.list(SessionImpl.java:1404) at org.hibernate.query.internal.AbstractProducedQuery.doList(AbstractProducedQuery.java:1562) at org.hibernate.query.internal.AbstractProducedQuery.list(AbstractProducedQuery.java:1530) at org.hibernate.query.internal.AbstractProducedQuery.getSingleResult(AbstractProducedQuery.java:1578) at org.hibernate.query.criteria.internal.compile.CriteriaQueryTypeQueryAdapter.getSingleResult(CriteriaQueryTypeQueryAdapter.java:111) at org.springframework.data.jpa.repository.query.JpaQueryExecution$SingleEntityExecution.doExecute(JpaQueryExecution.java:196) at org.springframework.data.jpa.repository.query.JpaQueryExecution.execute(JpaQueryExecution.java:88) at org.springframework.data.jpa.repository.query.AbstractJpaQuery.doExecute(AbstractJpaQuery.java:154) at org.springframework.data.jpa.repository.query.AbstractJpaQuery.execute(AbstractJpaQuery.java:142) at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.doInvoke(RepositoryFactorySupport.java:618) at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.invoke(RepositoryFactorySupport.java:605) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) at org.springframework.data.projection.DefaultMethodInvokingMethodInterceptor.invoke(DefaultMethodInvokingMethodInterceptor.java:80) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:366) at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:99) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:139) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) at org.springframework.data.jpa.repository.support.CrudMethodMetadataPostProcessor$CrudMethodMetadataPopulatingMethodInterceptor.invoke(CrudMethodMetadataPostProcessor.java:149) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:95) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212) at com.sun.proxy.$Proxy223.countAllByDeliveryNo(Unknown Source)
查閱資料后發(fā)現(xiàn),是mysql不支持在流式查詢(xún)過(guò)程中使用同一連接再次查詢(xún)數(shù)據(jù)庫(kù)
解決方法:
方法1.使用異步方法查詢(xún),這樣就可以規(guī)避同一個(gè)連接二次查詢(xún)的問(wèn)題
方法2.需要二次查詢(xún)時(shí)開(kāi)啟一個(gè)新的事務(wù)去查詢(xún)就可以,spring中可以使用事務(wù)注解開(kāi)啟新的事務(wù)就搞定了,注解如下:
@Transactional(propagation = Propagation.REQUIRES_NEW,readOnly = true)
到此這篇關(guān)于使用java實(shí)現(xiàn)百萬(wàn)級(jí)別數(shù)據(jù)導(dǎo)出excel的三種方式的文章就介紹到這了,更多相關(guān)java百萬(wàn)級(jí)別數(shù)據(jù)導(dǎo)出excel內(nèi)容請(qǐng)搜索腳本之家以前的文章或繼續(xù)瀏覽下面的相關(guān)文章希望大家以后多多支持腳本之家!
相關(guān)文章
Java數(shù)據(jù)結(jié)構(gòu)之LinkedList的用法詳解
鏈表(Linked?list)是一種常見(jiàn)的基礎(chǔ)數(shù)據(jù)結(jié)構(gòu),是一種線(xiàn)性表。Java的LinkedList(鏈表)?類(lèi)似于?ArrayList,是一種常用的數(shù)據(jù)容器,本文就來(lái)簡(jiǎn)單講講它的使用吧2023-05-05解決mybatisplus插入報(bào)錯(cuò)argument type mismatch的問(wèn)題
這篇文章主要介紹了解決mybatisplus插入報(bào)錯(cuò)argument type mismatch的問(wèn)題,具有很好的參考價(jià)值,希望對(duì)大家有所幫助。一起跟隨小編過(guò)來(lái)看看吧2020-11-11struts2.3.24+spring4.1.6+hibernate4.3.11+mysql5.5.25開(kāi)發(fā)環(huán)境搭建圖文
這篇文章主要介紹了struts2.3.24+spring4.1.6+hibernate4.3.11+mysql5.5.25開(kāi)發(fā)環(huán)境搭建圖文教程,感興趣的小伙伴們可以參考一下2016-06-06spring?boot?executable?jar/war?原理解析
spring boot里其實(shí)不僅可以直接以 java -jar demo.jar的方式啟動(dòng),還可以把jar/war變?yōu)橐粋€(gè)可以執(zhí)行的腳本來(lái)啟動(dòng),比如./demo.jar,這篇文章主要介紹了spring?boot?executable?jar/war?原理,需要的朋友可以參考下2023-02-02SpringSecurityOAuth2實(shí)現(xiàn)微信授權(quán)登錄
微信的登錄功能是用戶(hù)注冊(cè)和使用微信的必經(jīng)之路之一,而微信授權(quán)登錄更是方便了用戶(hù)的登錄操作,本文主要介紹了SpringSecurityOAuth2實(shí)現(xiàn)微信授權(quán)登錄,感興趣的可以了解一下2023-09-09MyBatis標(biāo)簽之Select?resultType和resultMap詳解
這篇文章主要介紹了MyBatis標(biāo)簽之Select?resultType和resultMap,在MyBatis中有一個(gè)ResultMap標(biāo)簽,它是為了映射select標(biāo)簽查詢(xún)出來(lái)的結(jié)果集,下面使用一個(gè)簡(jiǎn)單的例子,來(lái)介紹 resultMap 的使用方法,需要的朋友可以參考下2022-09-09springboot配置templates直接訪問(wèn)的實(shí)現(xiàn)
這篇文章主要介紹了springboot配置templates直接訪問(wèn)的實(shí)現(xiàn)方式,具有很好的參考價(jià)值,希望對(duì)大家有所幫助。如有錯(cuò)誤或未考慮完全的地方,望不吝賜教2021-12-12教你如何使用google.zxing結(jié)合springboot生成二維碼功能
這篇文章主要介紹了使用google.zxing結(jié)合springboot生成二維碼功能,我們使用兩種方式,去生成二維碼,但是其實(shí),二維碼的生成基礎(chǔ),都是zxing包,這是Google開(kāi)源的一個(gè)包,本文通過(guò)實(shí)例代碼給大家介紹的非常詳細(xì),需要的朋友可以參考下2022-05-05