
Python Iterator Toolkit (Recommended)

Updated: 2016-05-06 11:17:38  Contributor: wulei

Iterator tools make generating data convenient and efficient. Once you know these basic building blocks, simple combinations of them give you many more iterator tools.

Original: https://git.io/pytips

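Part 0x01 introduced the concept of an iterator: an object that defines the __iter__() and __next__() methods, or a generator, which defines one more concisely with yield. In some functional programming languages (see 0x02, Functional Programming in Python), similar iterators are often used to produce lists (or sequences) of a particular form, and there the iterator behaves more like a data structure than a function (although in some functional languages the two are not fundamentally different). Python borrows several iterator constructs from APL, Haskell, and SML and implements them in itertools (the module is written in C; source: /Modules/itertoolsmodule.c).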
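To make the two definitions above concrete, here is a minimal sketch. The names Countdown and countdown are made up for this illustration and do not come from the original series. It compares a class that implements __iter__() and __next__() with an equivalent generator function that uses yield:

```python
class Countdown:
    """Hypothetical iterator: counts down from n to 1."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # An iterator returns itself from __iter__().
        return self

    def __next__(self):
        # Signal exhaustion by raising StopIteration.
        if self.n <= 0:
            raise StopIteration
        value = self.n
        self.n -= 1
        return value


def countdown(n):
    """The same behaviour written as a generator function."""
    while n > 0:
        yield n
        n -= 1


class_result = list(Countdown(3))
generator_result = list(countdown(3))
print(class_result)      # [3, 2, 1]
print(generator_result)  # [3, 2, 1]
```

Both objects can be passed to any itertools function, since those functions only rely on the iterator protocol.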

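The itertools module provides three categories of iterator-building tools:

Infinite iterators

Iterators that terminate on the shortest input sequence

Combinatoric generators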

1. Infinite iterators

"Infinite" means that iterating over one of these with for ... in ... never terminates. They are:

count(start, [step])

cycle(p)

repeat(elem [,n])

The names largely give away what they do. Because they are infinite, you never want to pull out every element one by one. Instead they are usually combined with map/zip and similar functions, serving as an inexhaustible supply of data that is paired with an iterable of finite length:

from itertools import cycle, count, repeat
print(count.__doc__)

Output:

count(start=0, step=1) --> count object

Return a count object whose .__next__() method returns consecutive values.
Equivalent to:

    def count(firstval=0, step=1):
        x = firstval
        while 1:
            yield x
            x += step

counter = count()
print(next(counter))
print(next(counter))
print(list(map(lambda x, y: x+y, range(10), counter)))
odd_counter = map(lambda x: 'Odd#{}'.format(x), count(1, 2))
print(next(odd_counter))
print(next(odd_counter))

Output:

0
1
[2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
Odd#1
Odd#3

print(cycle.__doc__)

Output:

cycle(iterable) --> cycle object

Return elements from the iterable until it is exhausted.
Then repeat the sequence indefinitely.

cyc = cycle(range(5))
print(list(zip(range(6), cyc)))
print(next(cyc))
print(next(cyc))

Output:

[(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 0)]
1
2
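One common use of cycle() is to hand out a small set of labels in round-robin order. The sketch below is purely illustrative. The task and worker names are hypothetical.

```python
from itertools import cycle

tasks = ['t1', 't2', 't3', 't4', 't5']   # hypothetical work items
workers = cycle(['alice', 'bob'])        # hypothetical worker names

# zip() stops when the finite list of tasks is exhausted,
# so the infinite cycle is consumed only as far as needed.
assignment = list(zip(tasks, workers))
print(assignment)
# [('t1', 'alice'), ('t2', 'bob'), ('t3', 'alice'), ('t4', 'bob'), ('t5', 'alice')]
```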

print(repeat.__doc__)

Output:

repeat(object [,times]) -> create an iterator which returns the object
for the specified number of times. If not specified, returns the object
endlessly.

print(list(repeat('Py', 3)))
rep = repeat('p')
print(list(zip(rep, 'y'*3)))

Output:

['Py', 'Py', 'Py']
[('p', 'y'), ('p', 'y'), ('p', 'y')]
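Two further combinations of infinite iterators with finite input are sketched below: repeat() supplying a constant argument to map(), and count() acting like enumerate() with a custom start. This is an illustration, not part of the original article.

```python
from itertools import count, repeat

# repeat(2) supplies the exponent for every element of range(5);
# map() stops when the shortest input, range(5), is exhausted.
squares = list(map(pow, range(5), repeat(2)))
print(squares)   # [0, 1, 4, 9, 16]

# count(1) numbers the items starting from 1, much like enumerate(..., 1).
numbered = list(zip(count(1), 'abc'))
print(numbered)  # [(1, 'a'), (2, 'b'), (3, 'c')]
```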

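2. Iterators that terminate on the shortest input sequence

The functions in this category take one or more sequences as input, combine or transform them, and return the result as an iterator. The familiar zip function belongs to this family, except that zip is a built-in. The functions in this category are: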

　accumulate()

　　chain()/chain.from_iterable()

　　compress()

　　dropwhile()/filterfalse()/takewhile()

　　groupby()

　　islice()

　　starmap()

　　tee()

　　zip_longest()

Not every function is demonstrated here. To learn how a particular one works, print(method.__doc__) is usually enough; after all, the itertools module only provides shortcuts and hides no deep algorithms. Only the few that I find most interesting are illustrated below.

from itertools import cycle, compress, islice, takewhile, count

# Used appropriately, these three functions can bound an infinite iterator
# print(compress.__doc__)
print(list(compress(cycle('PY'), [1, 0, 1, 0])))
# Slice any iterable the way l[start:stop:step] slices a list
# print(islice.__doc__)
print(list(islice(cycle('PY'), 0, 2)))
# A restricted filter: stops at the first element that fails the predicate
# print(takewhile.__doc__)
print(list(takewhile(lambda x: x < 5, count())))

Output:

['P', 'P']
['P', 'Y']
[0, 1, 2, 3, 4]
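The difference between takewhile() and its relatives dropwhile() and filterfalse() is easiest to see on the same input. The following sketch uses an arbitrary sample list chosen for illustration and compares them with the built-in filter():

```python
from itertools import dropwhile, filterfalse, takewhile

data = [1, 3, 6, 2, 8]          # arbitrary sample values


def small(x):
    return x < 5


taken = list(takewhile(small, data))      # stops at the first failure (6)
dropped = list(dropwhile(small, data))    # skips the leading run, keeps the rest
kept = list(filter(small, data))          # tests every element
rejected = list(filterfalse(small, data)) # the complement of filter()

print(taken)     # [1, 3]
print(dropped)   # [6, 2, 8]
print(kept)      # [1, 3, 2]
print(rejected)  # [6, 8]
```

Note that only takewhile() can terminate on an infinite input such as count(), because the others keep consuming elements after the first failure.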

from itertools import groupby
from operator import itemgetter

print(groupby.__doc__)
for k, g in groupby('AABBC'):
    print(k, list(g))

db = [dict(name='python', script=True),
      dict(name='c', script=False),
      dict(name='c++', script=False),
      dict(name='ruby', script=True)]
keyfunc = itemgetter('script')
db2 = sorted(db, key=keyfunc)  # sorted by `script'
for isScript, langs in groupby(db2, keyfunc):
    print(', '.join(map(itemgetter('name'), langs)))

Output:

groupby(iterable[, keyfunc]) -> create an iterator which returns
(key, sub-iterator) grouped by each value of key(value).

A ['A', 'A']
B ['B', 'B']
C ['C']
c, c++
python, ruby
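The sorted() call in the example above matters because groupby() only merges consecutive elements that share a key. A small sketch, using made-up (name, is_script) tuples instead of the dictionaries above, shows what happens without it:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample data: (name, is_script)
langs = [('python', True), ('c', False), ('ruby', True)]
key = itemgetter(1)

# Unsorted input: the two True entries are not adjacent, so they form two groups.
unsorted_keys = [k for k, _ in groupby(langs, key)]
print(unsorted_keys)  # [True, False, True]

# Sorted input: each key value appears exactly once.
sorted_keys = [k for k, _ in groupby(sorted(langs, key=key), key)]
print(sorted_keys)    # [False, True]
```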

from itertools import zip_longest

# The built-in zip stops at the shortest sequence;
# zip_longest runs to the longest one and pads with the fillvalue argument
# (named izip_longest in Python 2.7)
print(list(zip_longest('ABCD', '123', fillvalue=0)))

Output:

[('A', '1'), ('B', '2'), ('C', '3'), ('D', 0)]
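For the functions in the list above that the article does not demonstrate, the following sketch gives one small call each for accumulate(), chain(), chain.from_iterable(), starmap() and tee(). The inputs are arbitrary values chosen for illustration.

```python
import operator
from itertools import accumulate, chain, starmap, tee

# accumulate(): running totals by default, or any binary function.
running_sum = list(accumulate([1, 2, 3, 4]))
running_product = list(accumulate([1, 2, 3, 4], operator.mul))
print(running_sum)      # [1, 3, 6, 10]
print(running_product)  # [1, 2, 6, 24]

# chain(): concatenate iterables; from_iterable() takes one iterable of iterables.
joined = list(chain('AB', 'CD'))
flattened = list(chain.from_iterable(['AB', 'CD']))
print(joined)           # ['A', 'B', 'C', 'D']
print(flattened)        # ['A', 'B', 'C', 'D']

# starmap(): like map(), but each element is unpacked into arguments.
powers = list(starmap(pow, [(2, 3), (3, 2)]))
print(powers)           # [8, 9]

# tee(): split one iterator into several independent ones.
first, second = tee(iter([1, 2, 3]))
first_items = list(first)
second_items = list(second)
print(first_items, second_items)  # [1, 2, 3] [1, 2, 3]
```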

3. Combinatoric generators

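These generate permutations and combinations of the elements of an iterable:

product(*iterables, repeat=1): the Cartesian product of the input iterables

permutations(iterable, r=None): every ordering of r elements from the input (full-length permutations when r is None)

combinations(iterable, r): selections of r distinct elements, emitted in sorted order, so order does not matter

combinations_with_replacement(iterable, r): the sorted-order subset of the Cartesian product, where an element may be chosen more than once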

from itertools import product, permutations, combinations, combinations_with_replacement

print(list(product(range(2), range(2))))
print(list(product('AB', repeat=2)))

Output:

[(0, 0), (0, 1), (1, 0), (1, 1)]
[('A', 'A'), ('A', 'B'), ('B', 'A'), ('B', 'B')]

print(list(combinations_with_replacement('AB', 2)))

Output:

[('A', 'A'), ('A', 'B'), ('B', 'B')]
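The relationship between these four functions can be checked directly. The sketch below assumes the input is already sorted and has no duplicate elements; under that assumption, combinations_with_replacement() yields exactly the sorted tuples of product(), and combinations() yields exactly the sorted tuples of permutations().

```python
from itertools import (combinations, combinations_with_replacement,
                       permutations, product)

# Keep only the tuples of the Cartesian product that are in sorted order.
sorted_products = [p for p in product('AB', repeat=2) if list(p) == sorted(p)]
with_replacement = list(combinations_with_replacement('AB', 2))
print(sorted_products)   # [('A', 'A'), ('A', 'B'), ('B', 'B')]
print(sorted_products == with_replacement)

# The same filter applied to permutations gives the combinations.
sorted_perms = [p for p in permutations('ABC', 2) if list(p) == sorted(p)]
plain_combinations = list(combinations('ABC', 2))
print(sorted_perms)      # [('A', 'B'), ('A', 'C'), ('B', 'C')]
print(sorted_perms == plain_combinations)
```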

# Horse racing: ordered first- and second-place finishes among 5 horses (A^5_2 = 20)
print(list(permutations('ABCDE', 2)))

Output:

[('A', 'B'), ('A', 'C'), ('A', 'D'), ('A', 'E'),
 ('B', 'A'), ('B', 'C'), ('B', 'D'), ('B', 'E'),
 ('C', 'A'), ('C', 'B'), ('C', 'D'), ('C', 'E'),
 ('D', 'A'), ('D', 'B'), ('D', 'C'), ('D', 'E'),
 ('E', 'A'), ('E', 'B'), ('E', 'C'), ('E', 'D')]

# Coloured balls: colour combinations when drawing 2 balls from 4 colours (C^4_2 = 6)
print(list(combinations('ABCD', 2)))

Output:

[('A', 'B'), ('A', 'C'), ('A', 'D'), ('B', 'C'), ('B', 'D'), ('C', 'D')]
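The sizes of the two results above can be cross-checked against the usual counting formulas, n!/(n-r)! for permutations and n!/(r!(n-r)!) for combinations. The helper names n_permutations and n_combinations are introduced here only for illustration.

```python
from itertools import combinations, permutations
from math import factorial


def n_permutations(n, r):
    """Number of ordered selections of r items from n: n! / (n-r)!"""
    return factorial(n) // factorial(n - r)


def n_combinations(n, r):
    """Number of unordered selections of r items from n: n! / (r! (n-r)!)"""
    return factorial(n) // (factorial(r) * factorial(n - r))


perm_count = len(list(permutations('ABCDE', 2)))
comb_count = len(list(combinations('ABCD', 2)))
print(perm_count, n_permutations(5, 2))  # 20 20
print(comb_count, n_combinations(4, 2))  # 6 6
```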
