
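python3 googletrans timeout error and an optimized translation tool (with source code)

Updated: 2020-12-23 11:47:39  Author: 懷淰メ

This article covers the timeout error raised by googletrans under Python 3 and presents an optimized translation tool with source code. The fix is explained step by step with example code.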

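I. Problem:

While writing a script that calls the Google Translate API, I kept getting errors. I was using the translate method of Translator from the googletrans module, and running the program raised a connection timeout error: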

Traceback (most recent call last):
  File "E:/PycharmProjects/MyProject/Translate/translate_test.py", line 3, in <module>
    result=translator.translate('안녕하세요.')
  File "D:\python3\lib\site-packages\googletrans\client.py", line 182, in translate
    data = self._translate(text, dest, src, kwargs)
  File "D:\python3\lib\site-packages\googletrans\client.py", line 78, in _translate
    token = self.token_acquirer.do(text)
  File "D:\python3\lib\site-packages\googletrans\gtoken.py", line 194, in do
    self._update()
  File "D:\python3\lib\site-packages\googletrans\gtoken.py", line 54, in _update
    r = self.client.get(self.host)
  File "D:\python3\lib\site-packages\httpx\_client.py", line 763, in get
    timeout=timeout,
  File "D:\python3\lib\site-packages\httpx\_client.py", line 601, in request
    request, auth=auth, allow_redirects=allow_redirects, timeout=timeout,
  File "D:\python3\lib\site-packages\httpx\_client.py", line 621, in send
    request, auth=auth, timeout=timeout, allow_redirects=allow_redirects,
  File "D:\python3\lib\site-packages\httpx\_client.py", line 648, in send_handling_redirects
    request, auth=auth, timeout=timeout, history=history
  File "D:\python3\lib\site-packages\httpx\_client.py", line 684, in send_handling_auth
    response = self.send_single_request(request, timeout)
  File "D:\python3\lib\site-packages\httpx\_client.py", line 719, in send_single_request
    timeout=timeout.as_dict(),
  File "D:\python3\lib\site-packages\httpcore\_sync\connection_pool.py", line 153, in request
    method, url, headers=headers, stream=stream, timeout=timeout
  File "D:\python3\lib\site-packages\httpcore\_sync\connection.py", line 65, in request
    self.socket = self._open_socket(timeout)
  File "D:\python3\lib\site-packages\httpcore\_sync\connection.py", line 86, in _open_socket
    hostname, port, ssl_context, timeout
  File "D:\python3\lib\site-packages\httpcore\_backends\sync.py", line 139, in open_tcp_stream
    return SyncSocketStream(sock=sock)
  File "D:\python3\lib\contextlib.py", line 130, in __exit__
    self.gen.throw(type, value, traceback)
  File "D:\python3\lib\site-packages\httpcore\_exceptions.py", line 12, in map_exceptions
    raise to_exc(exc) from None
httpcore._exceptions.ConnectTimeout: timed out

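II. Solution:

1. Finding a solution

After a lot of searching, I finally learned that Google Translate had updated its API, so the googletrans version I had been using no longer worked. Fortunately, someone in the community has already developed a new approach: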

https://github.com/lushan88a/google_trans_new

My thanks to the author!

2. Applying the solution

Run the following command in cmd:

pip install google_trans_new
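
Once the package is installed, a single call can serve as a quick check before running the full tool. The following is a minimal sketch, assuming that the google_translator class and its translate(text, target_language) call behave the way they are used in the code of section III. The helper name translate_text and its translator parameter are illustrative and not part of the library.

```python
def translate_text(text, target="zh-CN", translator=None):
    """Translate `text` into `target` and return the result.

    `translate_text` and the `translator` argument are illustrative names.
    Any object with a `translate(text, lang)` method can be passed in;
    by default a google_trans_new translator is created.
    """
    if translator is None:
        # Imported here so the helper can be defined (and exercised with a
        # stand-in translator) even where google_trans_new is not installed.
        from google_trans_new import google_translator
        translator = google_translator(timeout=10)
    return translator.translate(text, target)
```

Calling print(translate_text('안녕하세요.')) needs network access to Google Translate. If it prints a translation instead of raising a timeout error, the new package is working.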

III. Code (optimized)

"""
This version uses the latest google_trans_new.
It calls the Google Translate API from multiple threads
and can translate text with len(text) > 5000.
"""
from google_trans_new import google_translator
from multiprocessing.dummy import Pool as ThreadPool
import time
import re
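class Translate(object):
    def __init__(self):
        # Path of the text file to translate and the target language
        self.txt_file = './test.txt'
        self.aim_language = 'zh-CN'

    # Read the text file to translate, one list entry per line
    def read_txt(self):
        with open(self.txt_file, 'r', encoding='utf-8') as f:
            txt = f.readlines()
        return txt

    # Pre-process the text (this is the optimization): split it into
    # pieces of at most 5000 characters each
    def cut_text(self, text):
        # A single line: translate it in pieces of up to 5000 characters
        if len(text) == 1:
            str_text = ''.join(text).strip()
            # A single line longer than 5000 characters
            if len(str_text) > 5000:
                # Split with a regular expression; {1,5000} also keeps the
                # final piece, which is shorter than 5000 characters
                result = re.findall('.{1,5000}', str_text)
                return result
            else:
                # A single line of at most 5000 characters is returned as is
                return text
        # More than one line. Each line falls into one of two cases:
        #  (1) the line is shorter than 5000 characters
        #  (2) the line has 5000 characters or more
        else:
            result = []
            for line in text:
                # Case (1)
                if len(line) < 5000:
                    result.append(line)
                else:
                    # Case (2): split the line and add the pieces to the list
                    cut_str = re.findall('.{1,5000}', line)
                    result.extend(cut_str)
            return result

    def translate(self, text):
        if text:
            aim_lang = self.aim_language
            try:
                t = google_translator(timeout=10)
                translate_text = t.translate(text, aim_lang)
                print(translate_text)
                return translate_text
            except Exception as e:
                # Report the failure and carry on with the other pieces
                print(e)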

def main():
    time1 = time.time()
    # Start eight threads
    pool = ThreadPool(8)
    trans = Translate()
    txt = trans.read_txt()
    texts = trans.cut_text(txt)
    try:
        pool.map(trans.translate, texts)
    finally:
        # Always release the worker threads, even if map() raises
        pool.close()
        pool.join()
    time2 = time.time()
    print("Translated {} sentences in total, took {:.2f} s".format(len(texts), time2 - time1))

if __name__ == "__main__":
    main()
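
The splitting rule in cut_text can also be written without regular expressions. The sketch below is an alternative and not the article's code: it slices a string into consecutive pieces of at most 5000 characters, so the last, shorter piece is kept as well. The name split_text and the limit parameter are made up for illustration.

```python
def split_text(text, limit=5000):
    """Split `text` into consecutive pieces of at most `limit` characters."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    # range() steps through the start offset of each piece; the final slice
    # simply stops at the end of the string, so no characters are dropped.
    return [text[i:i + limit] for i in range(0, len(text), limit)]


pieces = split_text("a" * 12000)
print([len(p) for p in pieces])  # [5000, 5000, 2000]
```

Unlike a pattern built on `.`, slicing also keeps newline characters, which may or may not be wanted depending on how the pieces are joined afterwards.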

The test text is available at: http://xiazai.jb51.net/202012/yuanma/test.rar

Feel free to download it.
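
Since the original failure was a connect timeout, individual requests made from the thread pool may still fail now and then. The following is a minimal sketch of one way to harden the tool. It is an addition and not part of the code above: each piece is retried a few times with a growing pause, and the results of pool.map, which come back in input order, are collected instead of only printed. The names with_retry and translate_all and their parameters are hypothetical; translate_fn stands for any function that takes a string, returns its translation, and raises an exception on failure.

```python
import time
from multiprocessing.dummy import Pool as ThreadPool


def with_retry(translate_fn, attempts=3, delay=1.0):
    """Wrap `translate_fn` so that a failed call is retried.

    All names here are illustrative. `translate_fn` takes a string and
    returns its translation, raising an exception on failure.
    """
    def wrapper(text):
        for attempt in range(1, attempts + 1):
            try:
                return translate_fn(text)
            except Exception:
                if attempt == attempts:
                    raise  # out of attempts: let the caller see the error
                # Wait a little longer after each failed attempt.
                time.sleep(delay * attempt)
    return wrapper


def translate_all(texts, translate_fn, threads=8):
    """Translate `texts` in a thread pool; results keep the input order."""
    with ThreadPool(threads) as pool:
        return pool.map(with_retry(translate_fn), texts)
```

Because the translate method of the Translate class catches exceptions itself, a function passed as translate_fn would need to let errors propagate for the retry to take effect.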