
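Python list vs. NumPy array: an efficiency comparison

Updated: 2023-02-01 08:30:26 Author: 強殖裝甲凱普

This article compares the efficiency of Python lists and NumPy arrays. I hope it serves as a useful reference. If you spot mistakes or cases I have not considered, corrections are welcome.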

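Preface

Training usually runs for many iterations, so choosing the more efficient operations can cut running time considerably. I could not find much material on this, so I am recording my own findings here and will add to them as needed.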

Indexing speed and memory usage

Sometimes I need an array that I will index into frequently. Should I use a list or a NumPy array? Here is a simple experiment comparing them, with deque included as well. Environment: Python 3.6.

import random
import numpy as np
import time
import sys
# import matplotlib
# matplotlib.use('agg')
import matplotlib.pyplot as plt
from collections import deque

start = time.time()
length = []

list_size = []
array_size = []
deque_size = []

list_time = []
array_time = []
deque_time = []

for l in range(5, 15000, 5):
    print(l)
    length.append(l)
    a = [1] * l
    b = np.array(a)
    c = deque(maxlen=l)
    for i in range(l):
        c.append(1)

    # print('list size: {}'.format(sys.getsizeof(a)))
    # print('array size: {}'.format(sys.getsizeof(b)))
    # print('deque size: {}'.format(sys.getsizeof(c)))
    list_size.append(sys.getsizeof(a))
    array_size.append(sys.getsizeof(b))
    deque_size.append(sys.getsizeof(c))

    for i in range(3):
        if i == 0:
            tmp = a
            name = 'list'
        elif i == 1:
            tmp = b
            name = 'array'
        else:
            tmp = c
            name = 'deque'

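        # Time one million random index operations.
        # Note: the cost of random.randint is included in the measurement.
        s = time.perf_counter()
        for j in range(1000000):
            x = tmp[random.randint(0, len(a)-1)]
        duration = time.perf_counter() - s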

        if name == 'list':
            list_time.append(duration)
        elif name == 'array':
            array_time.append(duration)
        else:
            deque_time.append(duration)

# Report the total running time as hours, minutes, and seconds.
duration = time.time() - start
hours, rem = divmod(duration, 3600)
minutes, seconds = divmod(rem, 60)
print('Elapsed: {} h {} min {:.2f} s'.format(int(hours), int(minutes), seconds))

fig = plt.figure()

ax1 = fig.add_subplot(211)
ax1.plot(length, list_size, label='list')
ax1.plot(length, array_size, label='array')
ax1.plot(length, deque_size, label='deque')
plt.xlabel('length')
plt.ylabel('size')
plt.legend()

ax2 = fig.add_subplot(212)
ax2.plot(length, list_time, label='list')
ax2.plot(length, array_time, label='array')
ax2.plot(length, deque_time, label='deque')
plt.xlabel('length')
plt.ylabel('time')
plt.legend()

plt.show()

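The script performs one million index operations on a list, a NumPy array, and a deque at each length. The resulting plots show the following.

The NumPy array is far more memory-efficient: the longer the container, the less memory it uses relative to the list and the deque.

The list does slightly better than the deque. So if memory usage matters, the order of preference is: numpy array >> list > deque.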
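When reading the memory plot, it may help to recall what `sys.getsizeof` actually counts. For a list or deque it reports only the container and its references, not the objects they point to, while a NumPy array that owns its data stores the elements inline in a buffer whose size depends on the dtype. The sketch below is an illustration under those assumptions, not part of the original experiment; the length `n` and the `int8` conversion are arbitrary choices. NumPy's default integer dtype is platform dependent (for example, 32-bit on Windows before NumPy 2.0), so the size of the gap can vary between machines.

```python
import sys
from collections import deque

import numpy as np

n = 10_000  # arbitrary length chosen for illustration
as_list = [1] * n
as_array = np.array(as_list)
as_deque = deque(as_list, maxlen=n)

# sys.getsizeof covers the container and its references only,
# not the int objects those references point to.
list_bytes = sys.getsizeof(as_list)
deque_bytes = sys.getsizeof(as_deque)

# nbytes is the raw element buffer: element count times itemsize.
array_bytes = as_array.nbytes

print(list_bytes, deque_bytes, array_bytes, as_array.dtype)

# A narrower dtype shrinks the buffer further (assumes the values fit in 8 bits).
small = as_array.astype(np.int8)
print(small.nbytes)
```

If the stored values fit in a small integer type, choosing that dtype explicitly is one way to widen the array's memory advantage.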

In terms of time, the list is fastest at almost every length below 15000. In more detail:

Length below about 1000: deque is about the same as list, so the order of preference is list ≈ deque > numpy array;
Length below about 9000: list > deque > numpy array;
Length above about 9000: list > numpy array > deque;

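That said, the timing differences are small, almost negligible. The main difference is in memory usage. So if memory is not a concern, list is the best choice.

The whole experiment took 2 h 36 min 20.27 s on an i7-9700. If anyone is willing to try larger sizes or a finer step, please let me know the results.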
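One possible reason the timings look so close is that the benchmark loop calls `random.randint` inside the timed region, and that call may cost more than the index operation itself. The sketch below is one way to separate the two, assuming the indices are generated ahead of time and `timeit` is used for timing. The helper `index_all` and the sizes are hypothetical choices for illustration, and the numbers it prints will differ between machines.

```python
import random
import timeit
from collections import deque

import numpy as np

n = 10_000  # container length, arbitrary
# Pre-generate the indices so random number generation is not timed.
indices = [random.randrange(n) for _ in range(20_000)]

containers = {
    'list': [1] * n,
    'array': np.ones(n, dtype=np.int64),
    'deque': deque([1] * n, maxlen=n),
}

def index_all(container, idx):
    """Read container[i] once for every i in idx."""
    for i in idx:
        _ = container[i]

timings = {}
for name, container in containers.items():
    # Take the best of three repeats to reduce noise from other processes.
    runs = timeit.repeat(lambda: index_all(container, indices), number=5, repeat=3)
    timings[name] = min(runs)

for name, seconds in timings.items():
    print('{}: {:.4f} s'.format(name, seconds))
```

The loop overhead is still part of each measurement, but it is the same for all three containers, so the remaining differences come from the index operation.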

Append efficiency comparison

NumPy arrays cannot change size dynamically, so for NumPy this test only assigns values into a preallocated array.
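To illustrate the fixed-size point, here is a minimal sketch (not from the original experiment; the variable names and sizes are made up). `np.append` does not grow an array in place; it allocates and returns a new one. The usual alternative is to preallocate and track how many slots have been filled.

```python
import numpy as np

a = np.zeros(3)
b = np.append(a, 1.0)  # allocates and returns a new, longer array

print(a.shape, b.shape)
print(b is a)

# Preallocate once, then fill slot by slot while counting the filled slots.
capacity = 5
buf = np.zeros(capacity)
filled = 0
for value in (1.0, 2.0, 3.0):
    buf[filled] = value
    filled += 1

print(buf[:filled])

# If every slot gets the same value, one vectorized assignment replaces the loop.
buf[:] = 1
print(buf)
```

The benchmark that follows uses the element-by-element form so that the array does the same per-item work as the `append` calls on the list and the deque.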

import numpy as np
import time
from collections import deque

l = 10000000
a = []
b = np.zeros(l)
c = deque(maxlen=l)
for i in range(3):
    if i == 0:
        tmp = a
        name = 'list'
    elif i == 1:
        tmp = b
        name = 'array'
    else:
        tmp = c
        name = 'deque'

    start = time.time()
    if name == 'array':
        # The array cannot grow, so assign into the preallocated slots.
        for j in range(l):
            tmp[j] = 1
    else:
        # list and deque both support append.
        for j in range(l):
            tmp.append(1)
    duration = time.time() - start
    # Report the running time as hours, minutes, and seconds.
    hours, rem = divmod(duration, 3600)
    minutes, seconds = divmod(rem, 60)
    print('{} elapsed: {} h {} min {:.2f} s'.format(name, int(hours), int(minutes), seconds))

Results: