
Java Multithreaded Concurrent Programming to Improve Data Processing Efficiency: A Detailed Walkthrough

Updated: 2023-04-04 15:02:50  Author: ReadThroughLife

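This article describes how multithreaded concurrent programming in Java can improve data processing efficiency. It walks through the approach in detail and may be a useful reference for study or work.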

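At work I ran into the following requirement: given a host's IP address, update the related information in the other models linked to that host. The requirement is simple and involves only ordinary linked database queries and updates. During implementation, however, the large number of hosts made the query-and-update loop very slow: a single call to the interface took roughly 30-40 min to complete.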

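To shorten the execution time of the interface method, I turned to multithreaded concurrent programming. Processing the data asynchronously on several threads takes advantage of the parallelism of a multi-core processor, which can greatly shorten the run time and improve efficiency.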

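The approach uses FixedThreadPool, a reusable thread pool with a fixed number of threads, together with CountDownLatch, a concurrency utility for flow control, so that the main thread and the worker threads stay coordinated: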

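First, call Runtime.getRuntime().availableProcessors() to get the number of logical CPUs (hardware threads) on the machine; this value is used later to size the fixed thread pool.
Next, consider the nature of the task: for a compute-bound task, set the thread count to the number of logical CPUs + 1; for an I/O-bound task, set it to 2 * the number of logical CPUs. This method interacts with the database frequently, so it is an I/O-bound task.
Then, split the data into groups so that each thread handles one group, with the number of groups equal to the number of threads. Also create a CountDownLatch counter, passing the number of threads (that is, the number of groups) to its constructor, so that the main thread waits for all worker threads to finish before continuing.
After that, call executorService.execute() and put the business logic and data processing code in the run method (or an equivalent lambda); when a task finishes, remember to decrement the counter by 1.
Finally, once all worker threads have finished, shut down the thread pool.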
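The sizing rule in the second step can be written as a small helper. This is a minimal sketch of that rule of thumb only; the class name ThreadCountSketch and the method threadCount are hypothetical and do not appear in the original code.

```java
final class ThreadCountSketch {

    /**
     * Rule of thumb from the steps above:
     * logical CPUs + 1 for compute-bound work, 2 * logical CPUs for I/O-bound work.
     */
    static int threadCount(int logicalCpus, boolean ioBound) {
        if (logicalCpus < 1) {
            throw new IllegalArgumentException("logicalCpus must be >= 1");
        }
        return ioBound ? logicalCpus * 2 : logicalCpus + 1;
    }

    public static void main(String[] args) {
        // number of logical CPUs visible to the JVM
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println("logical CPUs: " + cpus);
        System.out.println("I/O-bound thread count: " + threadCount(cpus, true));
        System.out.println("compute-bound thread count: " + threadCount(cpus, false));
    }
}
```

These values are starting points rather than guarantees; the best thread count for a given workload is usually found by measuring.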

With the business logic of the actual work scenario omitted, a generic version of the method looks like this:

public ResponseData updateHostDept() {
        // ...
        List<Map> hostMapList = mongoTemplate.find(query, Map.class, "host");
        // split the hostMapList for the following multi-threads task
        // return the number of logical CPUs
        int processorsNum = Runtime.getRuntime().availableProcessors();
        // set the threadNum as 2*(the number of logical CPUs) for handling IO Tasks,
        // if Computing Tasks set the threadNum as (the number of logical  CPUs) + 1
        int threadNum = processorsNum * 2;  
        // number of items in each group (the last group also takes the remainder)
        int eachGroupNum = hostMapList.size() / threadNum;
        List<List<Map>> groupList = new ArrayList<>();
        for (int i = 0; i < threadNum; i++) {
            int start = i * eachGroupNum;
            if (i == threadNum - 1) {
                // the last group runs to the end of the list (was the undefined name mapList)
                int end = hostMapList.size();
                groupList.add(hostMapList.subList(start, end));
            } else {
                int end = (i+1) * eachGroupNum;
                groupList.add(hostMapList.subList(start, end));
            }
        }
        // update data by using multi-threads asynchronously
        // pool size equals the number of groups, so every group can run concurrently
        ExecutorService executorService = Executors.newFixedThreadPool(threadNum);
        CountDownLatch countDownLatch = new CountDownLatch(threadNum);
        for (List<Map> group : groupList) {
            executorService.execute(()->{
                try {
                    for (Map map : group) {
                    	// update the data in mongodb
                    }
                } catch (Exception e) {
                    e.printStackTrace();
                } finally {
                    // decrement the latch, even if this group failed
                    countDownLatch.countDown();
                }
            });
        }
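        try {
            // the main thread blocks here until all worker threads have finished
            countDownLatch.await();
        } catch (InterruptedException e) {
            // restore the interrupt flag so callers can observe the interruption
            Thread.currentThread().interrupt();
        }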
        // remember to shutdown the threadPool
        executorService.shutdown();  
        return ResponseData.success();
}

After switching to multithreaded asynchronous updates, the time for one call to the interface dropped from roughly 30-40 min to 8-10 min, a substantial improvement.
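One detail of the grouping loop is worth a closer look: because the group size comes from integer division, the last group absorbs the whole remainder. With 15 hosts and 8 threads, for example, the loop produces seven groups of one item and a final group of eight. The following is a minimal sketch of a more even split, not the author's method; the class name PartitionSketch and the method partition are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class PartitionSketch {

    /**
     * Splits items into at most `groups` contiguous sublists whose sizes differ by at most one.
     * Fewer than `groups` sublists are returned when there are fewer items than groups.
     */
    static <T> List<List<T>> partition(List<T> items, int groups) {
        if (groups < 1) {
            throw new IllegalArgumentException("groups must be >= 1");
        }
        List<List<T>> result = new ArrayList<>();
        int base = items.size() / groups;   // minimum size of every group
        int extra = items.size() % groups;  // the first `extra` groups get one more item
        int start = 0;
        for (int i = 0; i < groups && start < items.size(); i++) {
            int size = base + (i < extra ? 1 : 0);
            result.add(items.subList(start, start + size));
            start += size;
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> hosts = new ArrayList<>();
        for (int i = 0; i < 15; i++) {
            hosts.add(i);
        }
        // print the size of each group for 15 items split across 8 groups
        for (List<Integer> group : partition(hosts, 8)) {
            System.out.println(group.size());
        }
    }
}
```

If a helper like this is used, the CountDownLatch should be initialized with the number of groups actually returned rather than with the thread count, since the two can differ when the list is short.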

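Note that the pool here is created with newFixedThreadPool, which has one drawback: its blocking queue is unbounded by default (capacity Integer.MAX_VALUE), so a large backlog of queued tasks can cause an OOM error. For that reason it is generally better to create the pool with ThreadPoolExecutor directly, which lets you specify how many tasks may wait in the queue and so avoid the OOM problem.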
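The difference in queue capacity can be checked directly. The sketch below compares the work queue behind newFixedThreadPool with an explicitly bounded one. The class and method names are hypothetical, and the cast to ThreadPoolExecutor is an assumption about the JDK implementation (it holds for OpenJDK) rather than part of the documented contract.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class QueueCapacitySketch {

    /** Remaining capacity of the work queue behind Executors.newFixedThreadPool. */
    static int fixedPoolQueueCapacity() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // assumption: the factory returns a ThreadPoolExecutor, as it does in OpenJDK
            return ((ThreadPoolExecutor) pool).getQueue().remainingCapacity();
        } finally {
            pool.shutdown();
        }
    }

    /** Remaining capacity of an explicitly bounded work queue. */
    static int boundedPoolQueueCapacity(int capacity) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 30L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(capacity));
        try {
            return pool.getQueue().remainingCapacity();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("fixed pool queue capacity: " + fixedPoolQueueCapacity());
        System.out.println("bounded pool queue capacity: " + boundedPoolQueueCapacity(100));
    }
}
```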

public ResponseData updateHostDept() {
        // ...
        List<Map> hostMapList = mongoTemplate.find(query, Map.class, "host");
        // split the hostMapList for the following multi-threads task
        // return the number of logical CPUs
        int processorsNum = Runtime.getRuntime().availableProcessors();
        // set the threadNum as 2*(the number of logical CPUs) for handling IO Tasks,
        // if Computing Tasks set the threadNum as (the number of logical  CPUs) + 1
        int threadNum = processorsNum * 2;  
        // number of items in each group (the last group also takes the remainder)
        int eachGroupNum = hostMapList.size() / threadNum;
        List<List<Map>> groupList = new ArrayList<>();
        for (int i = 0; i < threadNum; i++) {
            int start = i * eachGroupNum;
            if (i == threadNum - 1) {
                // the last group runs to the end of the list (was the undefined name mapList)
                int end = hostMapList.size();
                groupList.add(hostMapList.subList(start, end));
            } else {
                int end = (i+1) * eachGroupNum;
                groupList.add(hostMapList.subList(start, end));
            }
        }
        // update data by using multi-threads asynchronously
        ThreadPoolExecutor executor = new ThreadPoolExecutor(5, 8, 30L, TimeUnit.SECONDS, 
                new ArrayBlockingQueue<>(100));
        CountDownLatch countDownLatch = new CountDownLatch(threadNum);
        for (List<Map> group : groupList) {
            executor.execute(()->{
                try {
                    for (Map map : group) {
                    	// update the data in mongodb
                    }
                } catch (Exception e) {
                    e.printStackTrace();
                } finally {
                    // decrement the latch, even if this group failed
                    countDownLatch.countDown();
                }
            });
        }
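        try {
            // the main thread blocks here until all worker threads have finished
            countDownLatch.await();
        } catch (InterruptedException e) {
            // restore the interrupt flag so callers can observe the interruption
            Thread.currentThread().interrupt();
        }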
        // remember to shutdown the threadPool
        executor.shutdown();  
        return ResponseData.success();
}

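In the code above, the core pool size and maximum pool size are 5 and 8. They are deliberately not set to large values: with too many threads, frequent context switching adds overhead and reduces the benefit of multithreading. Choosing suitable parameters requires weighing the machine's hardware against the type of task.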
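A bounded queue introduces one more parameter to think about: what happens when the pool and the queue are both full. By default ThreadPoolExecutor uses AbortPolicy, which throws RejectedExecutionException from execute(). With a core size of 5, a maximum of 8 and a queue of 100, the pool can hold at most 108 tasks at once, so this is unlikely to matter unless 2 * the number of logical CPUs exceeds that. The sketch below is one possible way to handle it, not the author's code: it uses deliberately small, hypothetical pool sizes and CallerRunsPolicy so that overflow tasks run on the submitting thread instead of being rejected.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class BoundedPoolSketch {

    /** Runs `tasks` trivial jobs on a small bounded pool and returns how many completed. */
    static int runAll(int tasks) throws InterruptedException {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                2, 4, 30L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(8),
                // when pool and queue are full, run the task on the submitting thread
                new ThreadPoolExecutor.CallerRunsPolicy());
        // the latch counts tasks, so it is sized by the number of submissions
        CountDownLatch latch = new CountDownLatch(tasks);
        AtomicInteger completed = new AtomicInteger();
        try {
            for (int i = 0; i < tasks; i++) {
                executor.execute(() -> {
                    try {
                        completed.incrementAndGet(); // stand-in for the real update work
                    } finally {
                        latch.countDown();
                    }
                });
            }
            // wait until every submitted task has finished
            latch.await();
        } finally {
            executor.shutdown();
        }
        return completed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("completed tasks: " + runAll(200));
    }
}
```

CallerRunsPolicy also slows the submitting thread down while it runs a task, which acts as simple back-pressure; whether that trade-off is acceptable depends on the workload.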

One last point: the number of logical CPUs can also be found without writing any code. On Windows, open Task Manager and select "Performance" to see it, as shown in the figure below: