Python Multiprocessing vs. Multithreading: An Efficiency Comparison

Updated: 2020-11-19 10:32:14  Author: massquantity

This article compares the efficiency of multiprocessing and multithreading in Python to help readers choose the right technique for a given workload.

There is an unwritten rule in the Python community: CPU-bound tasks suit multiprocessing, and I/O-bound tasks suit multithreading. This article puts that rule to the test.

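In general, threads are cheaper than processes, because creating a process carries more overhead. In Python (CPython), however, the GIL (global interpreter lock) allows only one thread to execute Python bytecode at a time, so for CPU-bound work a multithreaded program effectively runs as a single thread. Worse, the cost of switching between threads often makes it slower than genuinely single-threaded code. CPU-bound tasks in Python are therefore usually run with multiple processes: each process has its own GIL, so they do not interfere with one another.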
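The "one GIL per process" point can be made visible with a small sketch. The names `burn` and `worker_pids` are hypothetical helpers, not part of the article's benchmark; they simply report which process each task ran in.

```python
import math
import os
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def burn(n):
    """CPU-bound toy workload (hypothetical): n trig evaluations.

    Returns the PID of the process that did the work.
    """
    total = 0.0
    for i in range(n):
        total += math.sin(i) + math.cos(i)
    return os.getpid()


def worker_pids(executor_cls, n_tasks=4, n=200_000):
    """Run burn() n_tasks times in the given executor; return the set of PIDs used."""
    with executor_cls(max_workers=n_tasks) as ex:
        return set(ex.map(burn, [n] * n_tasks))
```

Calling `worker_pids(ThreadPoolExecutor)` returns a single PID, since all threads share one interpreter and one GIL. Calling `worker_pids(ProcessPoolExecutor)` from under an `if __name__ == "__main__":` guard typically returns several PIDs, each a separate interpreter with its own GIL.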

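In I/O-bound tasks, by contrast, the CPU spends much of its time waiting while the program interacts with the outside world, for example reading and writing files or communicating over a network. The GIL is released during these waits, so other threads can run in the meantime and multithreading delivers real concurrency.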
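A minimal sketch of that effect, assuming `time.sleep` as a stand-in for a blocking I/O call (the helper names `fake_io` and `timed_threads` are made up for illustration): several waits running in threads should overlap, so the total wall-clock time stays close to a single wait.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(delay):
    """Stand-in for a blocking I/O call; time.sleep releases the GIL while waiting."""
    time.sleep(delay)
    return delay


def timed_threads(n_tasks=4, delay=0.2):
    """Run n_tasks blocking waits in threads; return the wall-clock time in seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_tasks) as ex:
        list(ex.map(fake_io, [delay] * n_tasks))
    return time.perf_counter() - start
```

With four tasks of 0.2 s each, a serial loop would need about 0.8 s, whereas `timed_threads(4, 0.2)` should come in near 0.2 s because the waits overlap.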

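That is the theory. Below is a simple simulated test: heavy computation is represented by repeated math.sin() + math.cos() calls, and I/O-bound work is simulated with time.sleep(). Python offers several ways to implement multiprocessing and multithreading, so all of them are included here to see whether they differ in efficiency: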

Multiprocessing: joblib (multiprocessing backend), multiprocessing.Pool.map, multiprocessing.Pool.apply_async, concurrent.futures.ProcessPoolExecutor
Multithreading: joblib (threading backend), threading.Thread, concurrent.futures.ThreadPoolExecutor

from multiprocessing import Pool
from threading import Thread
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
import math
import time
from joblib import Parallel, delayed, parallel_backend


def f_IO(a):  # I/O-bound: a blocking wait stands in for real I/O
    time.sleep(5)

def f_compute(a):  # CPU-bound: pure computation in a tight loop
    for _ in range(int(1e7)):
        math.sin(40) + math.cos(40)
    return

def normal(sub_f):  # baseline: run the 6 tasks one after another
    for i in range(6):
        sub_f(i)
    return

def joblib_process(sub_f):  # joblib with the multiprocessing backend
    with parallel_backend("multiprocessing", n_jobs=6):
        res = Parallel()(delayed(sub_f)(j) for j in range(6))
    return


def joblib_thread(sub_f):  # joblib with the threading backend
    with parallel_backend('threading', n_jobs=6):
        res = Parallel()(delayed(sub_f)(j) for j in range(6))
    return

def mp(sub_f):  # multiprocessing.Pool.map
    with Pool(processes=6) as p:
        res = p.map(sub_f, list(range(6)))
    return

def asy(sub_f):  # multiprocessing.Pool.apply_async
    with Pool(processes=6) as p:
        result = []
        for j in range(6):
            a = p.apply_async(sub_f, args=(j,))
            result.append(a)
        res = [j.get() for j in result]  # block until every task finishes

def thread(sub_f):  # plain threading.Thread
    threads = []
    for j in range(6):
        t = Thread(target=sub_f, args=(j,))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()

def thread_pool(sub_f):  # leaving the with block waits for all submitted tasks
    with ThreadPoolExecutor(max_workers=6) as executor:
        res = [executor.submit(sub_f, j) for j in range(6)]

def process_pool(sub_f):  # leaving the with block waits for all mapped tasks
    with ProcessPoolExecutor(max_workers=6) as executor:
        res = executor.map(sub_f, list(range(6)))

def showtime(f, sub_f, name):  # time one strategy and print the result
    start_time = time.time()
    f(sub_f)
    print("{} time: {:.4f}s".format(name, time.time() - start_time))

def main(sub_f):
    showtime(normal, sub_f, "normal")
    print()
    print("------ multiprocessing ------")
    showtime(joblib_process, sub_f, "joblib multiprocess")
    showtime(mp, sub_f, "pool")
    showtime(asy, sub_f, "async")
    showtime(process_pool, sub_f, "process_pool")
    print()
    print("----- multithreading -----")
    showtime(joblib_thread, sub_f, "joblib thread")
    showtime(thread, sub_f, "thread")
    showtime(thread_pool, sub_f, "thread_pool")


if __name__ == "__main__":
    print("----- CPU-bound -----")
    sub_f = f_compute
    main(sub_f)
    print()
    print("----- I/O-bound -----")
    sub_f = f_IO
    main(sub_f)
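The benchmark times each strategy once with time.time(). As a hedged alternative, not the author's method, the sketch below uses time.perf_counter() and keeps the best of several runs, which tends to reduce noise from other activity on the machine. The name `best_of` is a hypothetical helper.

```python
import time


def best_of(func, *args, repeats=3):
    """Call func(*args) `repeats` times; return the shortest wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()  # monotonic, high-resolution clock
        func(*args)
        durations.append(time.perf_counter() - start)
    return min(durations)
```

For example, `best_of(mp, f_compute)` would time the Pool.map strategy from the script above three times and report the fastest run.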

Results:

----- CPU-bound -----
normal time: 15.1212s

------ multiprocessing ------
joblib multiprocess time: 8.2421s
pool time: 8.5439s
async time: 8.3229s
process_pool time: 8.1722s

----- multithreading -----
joblib thread time: 21.5191s
thread time: 21.3865s
thread_pool time: 22.5104s

----- I/O-bound -----
normal time: 30.0305s

------ multiprocessing ------
joblib multiprocess time: 5.0345s
pool time: 5.0188s
async time: 5.0256s
process_pool time: 5.0263s

----- multithreading -----
joblib thread time: 5.0142s
thread time: 5.0055s
thread_pool time: 5.0064s

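Every method above creates 6 processes or threads. For the CPU-bound task the speed ranking is: multiprocessing (about 8.2 to 8.5 s) > single process/thread (15.1 s) > multithreading (21.4 to 22.5 s). For the I/O-bound task it is: multithreading > multiprocessing > single process/thread, although multithreading (about 5.01 s) and multiprocessing (about 5.02 to 5.03 s) are nearly tied, and both are roughly six times faster than the 30 s serial run.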
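To put numbers on the ranking, the reported timings can be turned into speedup and parallel-efficiency figures. This is a small sketch; `speedup` and `efficiency` are hypothetical helper names, and the inputs are the times printed in the results.

```python
def speedup(baseline_s, parallel_s):
    """How many times faster the parallel run is than the serial baseline."""
    return baseline_s / parallel_s


def efficiency(baseline_s, parallel_s, workers):
    """Speedup divided by the number of workers (1.0 means perfect scaling)."""
    return speedup(baseline_s, parallel_s) / workers


# Timings taken from the results above, 6 workers each.
cpu_processes = speedup(15.1212, 8.1722)  # CPU-bound, process_pool
cpu_threads = speedup(15.1212, 21.3865)   # CPU-bound, thread
io_threads = speedup(30.0305, 5.0055)     # I/O-bound, thread

print(f"CPU-bound, processes: {cpu_processes:.2f}x")
print(f"CPU-bound, threads:   {cpu_threads:.2f}x")
print(f"I/O-bound, threads:   {io_threads:.2f}x")
```

The CPU-bound process pool comes out at about 1.85x, the CPU-bound threads at below 1x (a slowdown), and the I/O-bound threads at about 6x. The modest CPU-bound figure may simply reflect fewer than six available cores on the test machine, which the article does not state.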
