Example code: recognizing a CAPTCHA and logging in with Python + Selenium

Updated: 2017-12-21 13:51:17  Author: ck3207

This article presents example code for recognizing a CAPTCHA and logging in with Python and Selenium. It is shared here as a reference.

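For work, I needed to log in to a website that requires a CAPTCHA. I had looked into CAPTCHA recognition early on, but I could never get hold of the specific CAPTCHA image I needed. I only came back to the problem this Friday, and yesterday I finally solved it.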

On to the main topic:

Python version: 3.4.3

Required libraries: PIL, selenium, tesseract
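Of the three, tesseract is a standalone OCR program rather than a Python package, and the script runs it through subprocess, so the binary has to be on the PATH. A minimal sketch, assuming the binary is named tesseract (the helper name find_tesseract is hypothetical and not part of the original script), checks for it before any browser is started:

```python
import shutil


def find_tesseract(executable="tesseract"):
    """Return the full path of the OCR binary, or None if it is not on PATH.

    The default name "tesseract" is an assumption; pass another name if
    your installation uses one.
    """
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("tesseract was not found on PATH; install it before running the login script")
    else:
        print("using tesseract at " + path)
```

Running this check first gives a clear message instead of an error raised from inside the login loop.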

The code first:

#coding:utf-8
import subprocess
from PIL import Image
from PIL import ImageOps
from selenium import webdriver
import time,os,sys
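def cleanImage(imagePath):
  # open the image as grayscale; the screenshot may carry an alpha channel, which cannot be saved as JPEG
  image = Image.open(imagePath).convert("L")
  image = image.point(lambda x: 0 if x<143 else 255) # threshold every pixel so the image is pure black and white
  borderImage = ImageOps.expand(image,border=20,fill='white') # add a white margin around the image
  borderImage.save(imagePath)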

def getAuthCode(driver, url="http://localhost/"):
  captchaUrl = url + "common/random"
  driver.get(captchaUrl) 
  time.sleep(0.5)
  driver.save_screenshot("captcha.jpg")  # take a screenshot of the page and save it
  #urlretrieve(captchaUrl, "captcha.jpg")
  time.sleep(0.5)
  cleanImage("captcha.jpg")
  p = subprocess.Popen(["tesseract", "captcha.jpg", "captcha"], stdout=\
             subprocess.PIPE,stderr=subprocess.PIPE)
  p.communicate()  # wait for tesseract to finish; it writes its result to captcha.txt
  with open("captcha.txt", "r") as f:
    # Remove all whitespace characters (spaces, newlines, form feeds)
    captchaResponse = "".join(f.read().split())
  print("Captcha solution attempt: " + captchaResponse)
  if len(captchaResponse) == 4:
    return captchaResponse
  else:
    return False

def withoutCookieLogin(url="http://org.cfu666.com/"):
  driver = webdriver.Chrome()
  driver.maximize_window()
  driver.get(url)
  while True:   
    authCode = getAuthCode(driver, url)
    if authCode:
      driver.back()
      # fill in the login form; "orgCode", "username" and "password" are placeholders for real credentials
      driver.find_element_by_xpath("//input[@id='orgCode' and @name='orgCode']").clear()
      driver.find_element_by_xpath("//input[@id='orgCode' and @name='orgCode']").send_keys("orgCode")
      driver.find_element_by_xpath("//input[@id='account' and @name='username']").clear()
      driver.find_element_by_xpath("//input[@id='account' and @name='username']").send_keys("username")
      driver.find_element_by_xpath("//input[@type='password' and @name='password']").clear()
      driver.find_element_by_xpath("//input[@type='password' and @name='password']").send_keys("password")
      driver.find_element_by_xpath("//input[@type='text' and @name='authCode']").send_keys(authCode)
      driver.find_element_by_xpath("//button[@type='submit']").click()
      try:
        time.sleep(3)
        # if the side menu can be clicked, the login succeeded
        driver.find_element_by_xpath("//*[@id='side-menu']/li[2]/ul/li/a").click()
        return driver
      except Exception:
        print("authCode Error:", authCode)
        driver.refresh()
  return driver
driver = withoutCookieLogin("http://localhost/")
driver.get("http://localhost/enterprise/add/")
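The two pure-logic steps of the script, thresholding pixels and checking the OCR result, can be tried without a browser or tesseract. The sketch below is an illustration only: the names binarize and normalize_captcha are hypothetical, and the rule that a valid code consists of ASCII letters and digits is an assumption (the script itself only checks for a length of 4).

```python
import re

THRESHOLD = 143      # the same cut-off that cleanImage passes to image.point
EXPECTED_LENGTH = 4  # getAuthCode only accepts 4-character results


def binarize(value, threshold=THRESHOLD):
    """Map one 0-255 channel value to pure black (0) or pure white (255)."""
    return 0 if value < threshold else 255


def normalize_captcha(raw_text, expected_length=EXPECTED_LENGTH):
    """Strip all whitespace from OCR output and validate what remains.

    Returns the cleaned string, or None unless it is exactly
    expected_length ASCII letters or digits (an assumed character set).
    """
    cleaned = re.sub(r"\s+", "", raw_text)
    if len(cleaned) == expected_length and re.fullmatch(r"[A-Za-z0-9]+", cleaned):
        return cleaned
    return None


if __name__ == "__main__":
    print([binarize(v) for v in (0, 142, 143, 255)])
    print(normalize_captcha(" a B3\n9\t\n"))
    print(normalize_captcha("a-39"))
```

Rejecting results that contain punctuation saves a login attempt when tesseract misreads noise as a character, at the cost of assuming the site never uses such characters in its codes.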

How to get the CAPTCHA we actually need

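On the way to getting the CAPTCHA I fell into many pits and read many articles. Most of them teach how to recognize a CAPTCHA, but none explain how to obtain the CAPTCHA image that your current login session actually needs.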

My approach is:

1. First, use selenium to open the login page you need, at address url1