Implementing a Translator in Python by Scraping Youdao
Preparation
First, make sure the urllib library is available. Note that urllib is part of the Python standard library, so there is nothing to install with pip; just import it directly.
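Since urllib ships with Python, a quick sanity check is to import its submodules and URL-encode a sample payload (a minimal sketch; the parameter names here are placeholders, not the full Youdao form data):

```python
import urllib.request
import urllib.parse

# urllib is part of the standard library, so these imports succeed
# without any pip install.
payload = urllib.parse.urlencode({'i': 'hello', 'from': 'AUTO'})
print(payload)  # i=hello&from=AUTO
```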
Get the request URL for Youdao Translate.
The parameters to send can be found in the request's Form Data (visible in the browser's developer tools).
Example
```python
import urllib.request
import urllib.parse

url = 'http://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule'

data = {}
data['i'] = 'i love python'
data['from'] = 'AUTO'
data['to'] = 'AUTO'
data['smartresult'] = 'dict'
data['client'] = 'fanyideskweb'
data['salt'] = '16057996372935'
data['sign'] = '0965172abb459f8c7a791df4184bf51c'
data['lts'] = '1605799637293'
data['bv'] = 'f7d97c24a497388db1420108e6c3537b'
data['doctype'] = 'json'
data['version'] = '2.1'
data['keyfrom'] = 'fanyi.web'
data['action'] = 'FY_BY_REALTlME'

data = urllib.parse.urlencode(data).encode('utf-8')
response = urllib.request.urlopen(url, data)
html = response.read().decode('utf-8')
print(html)
```
Running this returns error code 50. To fix it, delete the `_o` from the URL.
After removing it, the request succeeds.
But the raw result is still too verbose and needs further processing.
Import the json module and convert the response into a dictionary so we can filter out just the translation.
```python
import urllib.request
import urllib.parse
import json

url = 'http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule'

data = {}
data['i'] = 'i love python'
data['from'] = 'AUTO'
data['to'] = 'AUTO'
data['smartresult'] = 'dict'
data['client'] = 'fanyideskweb'
data['salt'] = '16057996372935'
data['sign'] = '0965172abb459f8c7a791df4184bf51c'
data['lts'] = '1605799637293'
data['bv'] = 'f7d97c24a497388db1420108e6c3537b'
data['doctype'] = 'json'
data['version'] = '2.1'
data['keyfrom'] = 'fanyi.web'
data['action'] = 'FY_BY_REALTlME'

data = urllib.parse.urlencode(data).encode('utf-8')
response = urllib.request.urlopen(url, data)
html = response.read().decode('utf-8')

req = json.loads(html)
result = req['translateResult'][0][0]['tgt']
print(result)
```
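To see how the nested indexing `req['translateResult'][0][0]['tgt']` works without hitting the network, here is a sketch using a hand-written sample string in the same shape as Youdao's JSON (the text is illustrative, not a real server reply):

```python
import json

# A sample response shaped like Youdao's JSON (illustrative only).
html = '{"errorCode": 0, "translateResult": [[{"src": "i love python", "tgt": "我爱python"}]]}'

req = json.loads(html)  # str -> dict
# translateResult is a list of sentences, each a list of segments;
# each segment dict holds the source ('src') and target ('tgt') text.
result = req['translateResult'][0][0]['tgt']
print(result)  # 我爱python
```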
However, this program can only translate the one hard-coded phrase, so it is single-use. I optimized it further.
```python
import urllib.request
import urllib.parse
import json

def translate():
    centens = input('Enter the sentence to translate: ')
    url = 'http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule'

    head = {}  # add a request header to avoid anti-scraping blocks
    head['User-Agent'] = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36'

    data = {}  # send the Form Data parameters with the request
    data['i'] = centens
    data['from'] = 'AUTO'
    data['to'] = 'AUTO'
    data['smartresult'] = 'dict'
    data['client'] = 'fanyideskweb'
    data['salt'] = '16057996372935'
    data['sign'] = '0965172abb459f8c7a791df4184bf51c'
    data['lts'] = '1605799637293'
    data['bv'] = 'f7d97c24a497388db1420108e6c3537b'
    data['doctype'] = 'json'
    data['version'] = '2.1'
    data['keyfrom'] = 'fanyi.web'
    data['action'] = 'FY_BY_REALTlME'

    data = urllib.parse.urlencode(data).encode('utf-8')
    req = urllib.request.Request(url, data, head)
    response = urllib.request.urlopen(req)
    html = response.read().decode('utf-8')

    req = json.loads(html)
    result = req['translateResult'][0][0]['tgt']
    return result

t = translate()
print(f'Translation result: {t}')
```
Optimization complete; it works reasonably well.
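One further hardening step, my own suggestion rather than part of the original article: factor the JSON handling into a helper that tolerates error replies, so a failed request does not crash on a missing key. The response strings below are hypothetical:

```python
import json

def parse_result(html):
    """Extract the translation from a Youdao-style JSON reply,
    returning None if the reply carries no translateResult."""
    try:
        req = json.loads(html)
        return req['translateResult'][0][0]['tgt']
    except (json.JSONDecodeError, KeyError, IndexError):
        return None

# Hypothetical replies: one success, one error.
ok = '{"translateResult": [[{"src": "hi", "tgt": "你好"}]]}'
bad = '{"errorCode": 50}'
print(parse_result(ok))   # 你好
print(parse_result(bad))  # None
```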
That's all for this article. I hope it helps with your learning, and please continue to support Jiaoben Zhijia (jb51).