A Detailed Guide to Python's Async Libraries asyncio and aiohttp
asyncio
Version support
- The asyncio module was released with Python 3.4.
- The async and await keywords were first introduced in Python 3.5.
- Versions before Python 3.3 are not supported.
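Key concepts
- event_loop (event loop): the program starts an endless loop, and the programmer registers functions (coroutines) with it. When the event a coroutine is waiting on occurs, the loop calls the corresponding coroutine.
- coroutine: a coroutine object. Calling a function defined with the async keyword does not run the function body immediately; it returns a coroutine object. The coroutine object must be registered with the event loop, which then runs it.
- future: an object that represents the result of a task that has run or has yet to run. It is closely related to task, since Task is a subclass of Future.
- task: a coroutine object is a natively suspendable function, and a task wraps a coroutine further and tracks its state. A Task object is a subclass of Future; it ties a coroutine to a Future by wrapping the coroutine in a Future object.
- async/await: the keywords added in Python 3.5 for working with coroutines. async defines a coroutine, and await suspends the current coroutine while a blocking asynchronous call completes. Their role is similar, to some extent, to yield.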
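The concepts above can be checked with a minimal standard-library sketch. The names greet and main are made up for this illustration and do not come from the article.

```python
import asyncio
import inspect


async def greet(name):
    # Hypothetical coroutine function used only for this illustration.
    await asyncio.sleep(0)  # hand control back to the event loop once
    return "Hello, " + name


async def main():
    coro = greet("World")  # calling it does not run the body yet
    is_coro = inspect.iscoroutine(coro)
    task = asyncio.ensure_future(coro)  # wrap the coroutine in a Task
    is_future = isinstance(task, asyncio.Future)  # Task is a Future subclass
    result = await task  # suspend main() until the task finishes
    return is_coro, is_future, result


concepts = asyncio.run(main())
print(concepts)
```

Calling greet("World") only builds the coroutine object; the body runs once the event loop drives the task.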
Workflow
- Define/create the coroutine object
- Wrap the coroutine in a task
- Create the event loop object
- Hand the task to the event loop to run it
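    import asyncio

    async def hello(name):
        print('Hello,', name)

    # Create the coroutine object
    coroutine = hello("World")

    # Create the event loop object
    # (relying on asyncio.get_event_loop() to create a loop implicitly is deprecated in recent Python versions)
    loop = asyncio.new_event_loop()

    # Wrap the coroutine in a task
    # task = asyncio.ensure_future(coroutine, loop=loop)
    task = loop.create_task(coroutine)

    # Hand the task to the event loop and run it to completion
    loop.run_until_complete(task)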
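On Python 3.7 and later, asyncio.run() performs the same steps (create a loop, run the task, close the loop) in one call. A minimal sketch follows; here hello returns its greeting instead of printing it, which is a change made only so the result can be inspected.

```python
import asyncio


async def hello(name):
    # Return the greeting so the caller can inspect it.
    return "Hello, " + name


async def main():
    # create_task() schedules the coroutine on the running loop.
    task = asyncio.create_task(hello("World"))
    return await task


# asyncio.run() creates a new event loop, runs main() to completion,
# and closes the loop afterwards.
greeting = asyncio.run(main())
print(greeting)
```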
Concurrency
1. Create a list of several coroutines, tasks:
    import asyncio

    async def do_some_work(x):
        print('Waiting: ', x)
        await asyncio.sleep(x)
        return 'Done after {}s'.format(x)

    tasks = [do_some_work(1), do_some_work(2), do_some_work(4)]
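2. Register the coroutines with the event loop:
- Method 1: use asyncio.wait()
    loop = asyncio.new_event_loop()
    # asyncio.wait() rejects bare coroutines on Python 3.11+, so wrap them in tasks first
    tasks = [loop.create_task(coro) for coro in tasks]
    loop.run_until_complete(asyncio.wait(tasks))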
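- Method 2: use asyncio.gather()
    loop = asyncio.new_event_loop()
    # Wrapping the coroutines in tasks ties them to this loop and keeps task.result() available afterwards
    tasks = [loop.create_task(coro) for coro in tasks]
    loop.run_until_complete(asyncio.gather(*tasks))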
3. Inspect the return values:
    # tasks must hold Task objects here; bare coroutines have no result() method
    for task in tasks:
        print('Task ret: ', task.result())
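To see that the coroutines really overlap, the total run time can be measured. This is a minimal sketch that reuses do_some_work with shorter delays (0.1s, 0.2s, 0.4s, chosen here only to keep the run quick); the elapsed time should be close to the longest delay rather than the sum of all three.

```python
import asyncio
import time


async def do_some_work(x):
    await asyncio.sleep(x)
    return "Done after {}s".format(x)


async def main():
    start = time.perf_counter()
    # All three coroutines are scheduled together and wait concurrently.
    results = await asyncio.gather(
        do_some_work(0.1), do_some_work(0.2), do_some_work(0.4)
    )
    elapsed = time.perf_counter() - start
    return results, elapsed


results, elapsed = asyncio.run(main())
print(results, round(elapsed, 2))
```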
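4. Differences between asyncio.wait() and asyncio.gather():
They accept different arguments:
asyncio.wait(): takes a single iterable, typically a list, that holds the tasks.
    loop = asyncio.new_event_loop()

    # Convert each coroutine to a task with asyncio.ensure_future
    # (factorial is a coroutine function assumed to be defined elsewhere)
    tasks = [
        asyncio.ensure_future(factorial("A", 2), loop=loop),
        asyncio.ensure_future(factorial("B", 3), loop=loop),
        asyncio.ensure_future(factorial("C", 4), loop=loop)
    ]

    # Passing bare coroutines instead of tasks worked on older Python versions,
    # but asyncio.wait() rejects them on Python 3.11 and later
    # tasks = [
    #     factorial("A", 2),
    #     factorial("B", 3),
    #     factorial("C", 4)
    # ]

    loop.run_until_complete(asyncio.wait(tasks))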
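asyncio.gather(): is more flexible. It takes the awaitables as separate positional arguments, so when they are held in a list the * must not be omitted.
    loop = asyncio.new_event_loop()

    tasks = [
        asyncio.ensure_future(factorial("A", 2), loop=loop),
        asyncio.ensure_future(factorial("B", 3), loop=loop),
        asyncio.ensure_future(factorial("C", 4), loop=loop)
    ]

    # Bare coroutines are also accepted when gather() is called inside a running coroutine
    # tasks = [
    #     factorial("A", 2),
    #     factorial("B", 3),
    #     factorial("C", 4)
    # ]

    loop.run_until_complete(asyncio.gather(*tasks))
gather() calls can also be nested to group tasks:
    async def run_groups():
        # Calling gather() inside a coroutine attaches the groups to the running loop
        group1 = asyncio.gather(*[factorial("A", i) for i in range(1, 3)])
        group2 = asyncio.gather(*[factorial("B", i) for i in range(1, 5)])
        group3 = asyncio.gather(*[factorial("C", i) for i in range(1, 7)])
        await asyncio.gather(group1, group2, group3)

    loop = asyncio.new_event_loop()
    loop.run_until_complete(run_groups())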
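The snippets above call a factorial coroutine that the article never defines. The sketch below assumes an implementation along these lines (the body is an assumption for illustration, not the author's code) and shows that awaiting nested gather() calls yields one list per group, in argument order.

```python
import asyncio


async def factorial(name, number):
    # Assumed implementation: the article does not show factorial().
    result = 1
    for i in range(2, number + 1):
        await asyncio.sleep(0)  # give other tasks a chance to run
        result *= i
    return "{}: factorial({}) = {}".format(name, number, result)


async def main():
    group1 = asyncio.gather(*[factorial("A", i) for i in range(1, 3)])
    group2 = asyncio.gather(*[factorial("B", i) for i in range(1, 4)])
    # Awaiting a gather of gathers returns a list of per-group lists.
    return await asyncio.gather(group1, group2)


grouped = asyncio.run(main())
print(grouped)
```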
They return different results:
asyncio.wait(): returns dones (the completed tasks) and pendings (the unfinished tasks)
    # Inside a coroutine:
    dones, pendings = await asyncio.wait(tasks)
    for task in dones:
        print('Task ret: ', task.result())
asyncio.gather(): returns the results directly, as a list in the same order as the arguments
    # Inside a coroutine:
    results = await asyncio.gather(*tasks)
    for result in results:
        print('Task ret: ', result)
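The pending set returned by asyncio.wait() is only non-empty when waiting stops early. A minimal sketch, assuming a 0.4s timeout and delays picked purely for illustration, shows two finished tasks and one task still pending:

```python
import asyncio


async def do_some_work(x):
    await asyncio.sleep(x)
    return "Done after {}s".format(x)


async def main():
    tasks = [asyncio.create_task(do_some_work(x)) for x in (0.05, 0.1, 1.0)]
    # Stop waiting after 0.4s; tasks still running come back in `pending`.
    done, pending = await asyncio.wait(tasks, timeout=0.4)
    finished = sorted(task.result() for task in done)
    for task in pending:
        task.cancel()  # wait() does not cancel unfinished tasks itself
    return finished, len(pending)


finished, pending_count = asyncio.run(main())
print(finished, pending_count)
```

The done set is unordered, which is why the results are sorted here; gather() is the simpler choice when every result is needed in a fixed order.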
aiohttp
ClientSession: session management
    import aiohttp
    import asyncio

    async def main():
        async with aiohttp.ClientSession() as session:
            async with session.get('http://httpbin.org/get') as resp:
                print(resp.status)
                print(await resp.text())

    asyncio.run(main())
Other request methods:
    # Each call is used the same way as session.get() above
    session.post('http://httpbin.org/post', data=b'data')
    session.put('http://httpbin.org/put', data=b'data')
    session.delete('http://httpbin.org/delete')
    session.head('http://httpbin.org/get')
    session.options('http://httpbin.org/get')
    session.patch('http://httpbin.org/patch', data=b'data')
Passing URL parameters
    # Parameters as a dict
    async def main():
        async with aiohttp.ClientSession() as session:
            params = {'key1': 'value1', 'key2': 'value2'}
            async with session.get('http://httpbin.org/get', params=params) as r:
                expect = 'http://httpbin.org/get?key1=value1&key2=value2'
                assert str(r.url) == expect

    # Parameters as a list of pairs, which allows a repeated key
    async def main():
        async with aiohttp.ClientSession() as session:
            params = [('key', 'value1'), ('key', 'value2')]
            async with session.get('http://httpbin.org/get', params=params) as r:
                # The pairs appear in the query string in the order given
                expect = 'http://httpbin.org/get?key=value1&key=value2'
                assert str(r.url) == expect
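aiohttp builds the final URL itself, so the following is not how the library does it internally. It is only a standard-library sketch, using urllib.parse.urlencode, of how the two parameter shapes map onto a query string.

```python
from urllib.parse import urlencode

base = "http://httpbin.org/get"

# A dict produces one key=value pair per entry.
dict_url = base + "?" + urlencode({"key1": "value1", "key2": "value2"})

# A list of pairs allows the same key to appear more than once.
pairs_url = base + "?" + urlencode([("key", "value1"), ("key", "value2")])

print(dict_url)
print(pairs_url)
```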
Reading the response content
    async def main():
        async with aiohttp.ClientSession() as session:
            async with session.get('http://httpbin.org/get') as r:
                # Status code
                print(r.status)
                # Response body as text; the encoding can be specified
                print(await r.text(encoding='utf-8'))
                # Non-text (binary) content
                print(await r.read())
                # JSON content
                print(await r.json())
Custom request headers
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.106 Safari/537.36"
    }

    async def main():
        async with aiohttp.ClientSession() as session:
            async with session.get('http://httpbin.org/get', headers=headers) as r:
                print(r.status)
Set headers for every request made through the session:
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.106 Safari/537.36"
    }

    async def main():
        async with aiohttp.ClientSession(headers=headers) as session:
            async with session.get('http://httpbin.org/get') as r:
                print(r.status)
Custom cookies
    async def main():
        cookies = {'cookies_are': 'working'}
        async with aiohttp.ClientSession() as session:
            async with session.get('http://httpbin.org/cookies', cookies=cookies) as resp:
                assert await resp.json() == {"cookies": {"cookies_are": "working"}}
Set cookies for every request made through the session:
    async def main():
        cookies = {'cookies_are': 'working'}
        async with aiohttp.ClientSession(cookies=cookies) as session:
            async with session.get('http://httpbin.org/cookies') as resp:
                assert await resp.json() == {"cookies": {"cookies_are": "working"}}
Setting a proxy
Note: only HTTP proxies are supported (SOCKS proxies are not supported natively).
    async def main():
        async with aiohttp.ClientSession() as session:
            proxy = "http://127.0.0.1:1080"
            async with session.get("http://python.org", proxy=proxy) as r:
                print(r.status)
Proxies that require a username and password:
    async def main():
        async with aiohttp.ClientSession() as session:
            proxy = "http://127.0.0.1:1080"
            proxy_auth = aiohttp.BasicAuth('username', 'password')
            async with session.get("http://python.org", proxy=proxy, proxy_auth=proxy_auth) as r:
                print(r.status)
The credentials can also be passed directly in the proxy URL:
    async def main():
        async with aiohttp.ClientSession() as session:
            proxy = "http://username:password@127.0.0.1:1080"
            async with session.get("http://python.org", proxy=proxy) as r:
                print(r.status)
Async crawler example
    import asyncio
    import aiohttp
    from lxml import etree
    from datetime import datetime

    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.106 Safari/537.36"}

    async def get_movie_url():
        # Fetch the chart page and extract the link to each movie
        req_url = "https://movie.douban.com/chart"
        async with aiohttp.ClientSession() as session:
            async with session.get(url=req_url, headers=headers) as response:
                result = await response.text()
                result = etree.HTML(result)
        return result.xpath("//*[@id='content']/div/div[1]/div/div/table/tr/td/a/@href")

    async def get_movie_content(movie_url):
        # Fetch one movie page and extract its title and the first credit listed in the info block
        async with aiohttp.ClientSession() as session:
            async with session.get(url=movie_url, headers=headers) as response:
                result = await response.text()
                result = etree.HTML(result)
        movie = dict()
        name = result.xpath('//*[@id="content"]/h1/span[1]//text()')
        author = result.xpath('//*[@id="info"]/span[1]/span[2]//text()')
        movie["name"] = name
        movie["author"] = author
        return movie

    async def main():
        movie_url_list = await get_movie_url()
        # Fetch all movie pages concurrently
        tasks = [get_movie_content(url) for url in movie_url_list]
        return await asyncio.gather(*tasks)

    def run():
        start = datetime.now()
        movies = asyncio.run(main())
        print(movies)
        print("Async run took: {}".format(datetime.now() - start))

    if __name__ == '__main__':
        run()
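The crawler above starts every request at once. An optional refinement is to cap the number of requests in flight with asyncio.Semaphore. The sketch below is an assumption-laden illustration: fetch is a hypothetical stand-in that only sleeps instead of calling aiohttp, the example.com URLs are placeholders, and the limit of 3 is arbitrary.

```python
import asyncio


async def fetch(url, limiter):
    # Hypothetical stand-in for an aiohttp request; it only sleeps.
    async with limiter:  # wait here while 3 other fetches are in flight
        await asyncio.sleep(0.01)
        return "fetched " + url


async def main():
    limiter = asyncio.Semaphore(3)  # at most 3 requests at a time
    urls = ["https://example.com/movie/{}".format(i) for i in range(10)]
    return await asyncio.gather(*(fetch(url, limiter) for url in urls))


pages = asyncio.run(main())
print(len(pages))
```

In a real crawler the sleep would be replaced by the session.get() call, with the semaphore acquired around it in the same way.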
Summary
The above is based on personal experience; I hope it serves as a useful reference.