Cracking Website CAPTCHAs in Python with a Keras CNN Model

Updated: 2020-04-07 10:31:28   Author: 不脱发的程序猿
This article shows how to build a Keras CNN model in Python that cracks a website's CAPTCHAs. It walks through the complete example code for both training and prediction.

In this project we use Keras to build a slightly more complex CNN model to crack a website's CAPTCHAs. The CAPTCHAs look like this:

Keras makes it quick and easy to build CNN models. The CNN model built in this project is as follows:
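
To make the layer stack concrete, here is a small sketch that traces the tensor shapes through the three convolution/pooling stages defined in the training script below. It assumes the 20x16 single-channel input and the padding='same' settings used there; the helper name trace_shapes is hypothetical and is not part of the original project.

```python
import math

def trace_shapes(height=20, width=16, filters=(32, 64, 128)):
    """Return (height, width, channels) after the input and each conv/pool stage."""
    shapes = [(height, width, 1)]
    for n_filters in filters:
        # 3x3 convolutions with padding='same' keep height and width unchanged.
        # 2x2 max pooling with padding='same' halves each side, rounding up.
        height = math.ceil(height / 2)
        width = math.ceil(width / 2)
        shapes.append((height, width, n_filters))
    return shapes

shapes = trace_shapes()
# Number of values fed into the first Dense layer after Flatten()
flattened = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]
print(shapes)
print(flattened)
```

Under these assumptions the feature map shrinks from 20x16 to 3x2 while the channel count grows to 128, so the first fully connected layer receives 768 inputs.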

The dataset is split into a training set and a test set at a ratio of 7:3 (test_size=0.3 in the code). The training code is as follows:

# -*- coding: utf-8 -*-
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from matplotlib import pyplot as plt
 
from keras.utils import np_utils, plot_model
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.callbacks import EarlyStopping
from keras.layers import Conv2D, MaxPooling2D
 
# Load the data
df = pd.read_csv('./data.csv')
 
# Label values: map each character to a class index
vals = range(31)
keys = ['1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','J','K','L','N','P','Q','R','S','T','U','V','X','Y','Z']
label_dict = dict(zip(keys, vals))
 
x_data = df[['v'+str(i+1) for i in range(320)]]
y_data = pd.DataFrame({'label':df['label']})
y_data['class'] = y_data['label'].apply(lambda x: label_dict[x])
 
# Split the data into training and test sets
X_train, X_test, Y_train, Y_test = train_test_split(x_data, y_data['class'], test_size=0.3, random_state=42)
# -1 lets NumPy infer the number of samples instead of hard-coding 1167 and 501
x_train = np.array(X_train).reshape((-1, 20, 16, 1))
x_test = np.array(X_test).reshape((-1, 20, 16, 1))
 
# One-hot encode the labels
n_classes = 31
y_train = np_utils.to_categorical(Y_train, n_classes)
y_val = np_utils.to_categorical(Y_test, n_classes)
 
input_shape = x_train[0].shape
 
# CNN model
model = Sequential()
 
# Convolution and pooling layers
model.add(Conv2D(32, kernel_size=(3, 3), input_shape=input_shape, padding='same'))
model.add(Activation('relu'))
model.add(Conv2D(32, kernel_size=(3, 3), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
 
# Dropout layer
model.add(Dropout(0.25))
 
model.add(Conv2D(64, kernel_size=(3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Conv2D(64, kernel_size=(3, 3), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
 
model.add(Dropout(0.25))
 
model.add(Conv2D(128, kernel_size=(3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Conv2D(128, kernel_size=(3, 3), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
 
model.add(Dropout(0.25))
 
model.add(Flatten())
 
# Fully connected layers
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(128, activation='relu'))
model.add(Dense(n_classes, activation='softmax'))
 
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
 
# plot model
##plot_model(model, to_file=r'./model.png', show_shapes=True)
 
# Train the model
callbacks = [EarlyStopping(monitor='val_acc', patience=5, verbose=1)]
batch_size = 64
n_epochs = 100
history = model.fit(x_train, y_train, batch_size=batch_size, epochs=n_epochs, \
          verbose=1, validation_data=(x_test, y_val), callbacks=callbacks)
 
mp = './verifycode_Keras.h5'
model.save(mp)
 
# Plot the accuracy curve on the validation set
val_acc = history.history['val_acc']
plt.plot(range(len(val_acc)), val_acc, label='CNN model')
plt.title('Validation accuracy on verifycode dataset')
plt.xlabel('epochs')
plt.ylabel('accuracy')
plt.legend()
plt.show()
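
The label handling in the script above can be illustrated without Keras. The following is a minimal pure-Python sketch of the same idea: the 31-character alphabet is mapped to class indices, each index becomes a one-hot vector, and the position of the largest value maps back to a character. The helper names one_hot, encode and decode are hypothetical stand-ins for np_utils.to_categorical and np.argmax, not the functions the script calls.

```python
# The 31 characters used by this CAPTCHA: digits 1-9 and the letters
# except I, M, O and W (the same order as `keys` in the training script).
KEYS = list('123456789ABCDEFGHJKLNPQRSTUVXYZ')
label_dict = {ch: i for i, ch in enumerate(KEYS)}

def one_hot(index, n_classes):
    """Return a list with 1.0 at `index` and 0.0 elsewhere."""
    return [1.0 if i == index else 0.0 for i in range(n_classes)]

def encode(text):
    """Encode each character of `text` as a one-hot vector."""
    return [one_hot(label_dict[ch], len(KEYS)) for ch in text]

def decode(vectors):
    """Map each vector back to the character with the highest score."""
    return ''.join(KEYS[max(range(len(v)), key=v.__getitem__)] for v in vectors)

encoded = encode('ZK6N')
print(len(encoded), len(encoded[0]))
print(decode(encoded))
```

The decode step is the same operation the prediction script performs later on the softmax output of the model.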

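The training code above uses early stopping, a Keras callback that ends training before the maximum number of epochs is reached. Here the callback monitors the accuracy on the validation set (val_acc) with patience=5, so training stops once that accuracy has failed to improve for 5 consecutive epochs.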
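
The patience rule can be sketched in a few lines of plain Python. This is a simplified illustration of the idea, not the Keras implementation, and the accuracy values in the example are made up for demonstration; the function name stopping_epoch is hypothetical.

```python
def stopping_epoch(val_acc_history, patience=5):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float('-inf')
    wait = 0  # epochs since the monitored value last improved
    for epoch, acc in enumerate(val_acc_history, start=1):
        if acc > best:
            best = acc
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Illustrative values only: the best accuracy is first reached at epoch 4
history = [0.80, 0.90, 0.95, 0.97, 0.96, 0.97, 0.965, 0.97, 0.96, 0.95]
stopped_at = stopping_epoch(history)
print(stopped_at)
```

In this sketch a value that merely equals the best one does not reset the counter, so five epochs without a strict improvement are enough to stop.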

Running the training code above produces the following output:

...... (earlier output omitted)
Epoch 22/100
 
 64/1167 [>.............................] - ETA: 3s - loss: 0.0399 - acc: 1.0000
 128/1167 [==>...........................] - ETA: 3s - loss: 0.1195 - acc: 0.9844
 192/1167 [===>..........................] - ETA: 2s - loss: 0.1085 - acc: 0.9792
 256/1167 [=====>........................] - ETA: 2s - loss: 0.1132 - acc: 0.9727
 320/1167 [=======>......................] - ETA: 2s - loss: 0.1045 - acc: 0.9750
 384/1167 [========>.....................] - ETA: 2s - loss: 0.1006 - acc: 0.9740
 448/1167 [==========>...................] - ETA: 2s - loss: 0.1522 - acc: 0.9643
 512/1167 [============>.................] - ETA: 1s - loss: 0.1450 - acc: 0.9648
 576/1167 [=============>................] - ETA: 1s - loss: 0.1368 - acc: 0.9653
 640/1167 [===============>..............] - ETA: 1s - loss: 0.1353 - acc: 0.9641
 704/1167 [=================>............] - ETA: 1s - loss: 0.1280 - acc: 0.9659
 768/1167 [==================>...........] - ETA: 1s - loss: 0.1243 - acc: 0.9674
 832/1167 [====================>.........] - ETA: 0s - loss: 0.1577 - acc: 0.9639
 896/1167 [======================>.......] - ETA: 0s - loss: 0.1488 - acc: 0.9665
 960/1167 [=======================>......] - ETA: 0s - loss: 0.1488 - acc: 0.9656
1024/1167 [=========================>....] - ETA: 0s - loss: 0.1427 - acc: 0.9668
1088/1167 [==========================>...] - ETA: 0s - loss: 0.1435 - acc: 0.9669
1152/1167 [============================>.] - ETA: 0s - loss: 0.1383 - acc: 0.9688
1167/1167 [==============================] - 4s 3ms/step - loss: 0.1380 - acc: 0.9683 - val_loss: 0.0835 - val_acc: 0.9760
Epoch 00022: early stopping

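As the log shows, training took only a few minutes and was stopped early after 22 epochs. In the final epoch the accuracy was 96.83% on the training set and 97.60% on the test set. The accuracy curve on the test set is shown below: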

With the model trained, we use it to predict new CAPTCHAs. The 100 new CAPTCHAs are shown below:

The Python code that runs the trained CNN model on these new CAPTCHAs is as follows:

# -*- coding: utf-8 -*-
 
import os
import cv2
import numpy as np
 
def split_picture(imagepath):
 
  # Read the image in grayscale mode
  gray = cv2.imread(imagepath, 0)
 
  # Set the border pixels of the image to white
  height, width = gray.shape
  for i in range(width):
    gray[0, i] = 255
    gray[height-1, i] = 255
  for j in range(height):
    gray[j, 0] = 255
    gray[j, width-1] = 255
 
  # Median filter
  blur = cv2.medianBlur(gray, 3) # 3x3 kernel
 
  # Binarize
  ret,thresh1 = cv2.threshold(blur, 200, 255, cv2.THRESH_BINARY)
 
  # Extract the individual characters
  chars_list = []
  # findContours returns 3 values in OpenCV 3 and 2 in OpenCV 4; [-2] is the contour list in both
  contours = cv2.findContours(thresh1, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)[-2]
  for cnt in contours:
    # Upright bounding rectangle of the contour
    x, y, w, h = cv2.boundingRect(cnt)
    if x != 0 and y != 0 and w*h >= 100:
      chars_list.append((x,y,w,h))
 
  # Sort the characters left to right and save each one as an image
  sorted_chars_list = sorted(chars_list, key=lambda x:x[0])
  for i,item in enumerate(sorted_chars_list):
    x, y, w, h = item
    cv2.imwrite('test_verifycode/%d.jpg'%(i+1), thresh1[y:y+h, x:x+w])
 
def remove_edge_picture(imagepath):
 
  image = cv2.imread(imagepath, 0)
  height, width = image.shape
  corner_list = [image[0,0] < 127,
          image[height-1, 0] < 127,
          image[0, width-1]<127,
          image[ height-1, width-1] < 127
          ]
  if sum(corner_list) >= 3:
    os.remove(imagepath)
 
def resplit_with_parts(imagepath, parts):
  image = cv2.imread(imagepath, 0)
  os.remove(imagepath)
  height, width = image.shape
 
  file_name = imagepath.split('/')[-1].split(r'.')[0]
  # Re-split the image into `parts` pieces of equal width
  step = width//parts   # step size
  start = 0       # starting position
  for i in range(parts):
    cv2.imwrite('./test_verifycode/%s.jpg'%(file_name+'-'+str(i)), \
          image[:, start:start+step])
    start += step
 
def resplit(imagepath):
 
  image = cv2.imread(imagepath, 0)
  height, width = image.shape
 
  if width >= 64:
    resplit_with_parts(imagepath, 4)
  elif width >= 48:
    resplit_with_parts(imagepath, 3)
  elif width >= 26:
    resplit_with_parts(imagepath, 2)
 
# rename and convert to 16*20 size
def convert(dir, file):
 
  imagepath = dir+'/'+file
  # Read the image
  image = cv2.imread(imagepath, 0)
  # Binarize
  ret, thresh = cv2.threshold(image, 127, 255, cv2.THRESH_BINARY)
  img = cv2.resize(thresh, (16, 20), interpolation=cv2.INTER_AREA)
  # Save the image
  cv2.imwrite('%s/%s' % (dir, file), img)
 
# Read the image data and convert it to 0/1 values
def Read_Data(dir, file):
 
  imagepath = dir+'/'+file
  # Read the image
  image = cv2.imread(imagepath, 0)
  # Binarize
  ret, thresh = cv2.threshold(image, 127, 255, cv2.THRESH_BINARY)
  # Map white pixels (255) to 1 and all other pixels to 0
  bin_values = [1 if pixel==255 else 0 for pixel in thresh.ravel()]
 
  return bin_values
 
def predict(VerifyCodePath):
 
  dir = './test_verifycode'
  files = os.listdir(dir)
 
  # Remove any files left over from a previous run
  if files:
    for file in files:
      os.remove(dir + '/' + file)
 
  split_picture(VerifyCodePath)
 
  files = os.listdir(dir)
  if not files:
    print('The folder is empty!')
  else:
 
    # Remove noise images
    for file in files:
      remove_edge_picture(dir + '/' + file)
 
    # Re-split images in which characters are stuck together
    for file in os.listdir(dir):
      resplit(dir + '/' + file)
 
    # Resize every image to 16*20
    for file in os.listdir(dir):
      convert(dir, file)
 
    # Vectors representing the characters in the images
    # (sort by full file name so re-split pieces such as 2-0.jpg and 2-1.jpg stay in order)
    files = sorted(os.listdir(dir))
    table = np.array([Read_Data(dir, file) for file in files]).reshape(-1,20,16,1)
 
    # Path of the saved model
    mp = './verifycode_Keras.h5'
    # Load the model
    from keras.models import load_model
    cnn = load_model(mp)
    # Predict
    y_pred = cnn.predict(table)
    predictions = np.argmax(y_pred, axis=1)
 
    # Label dictionary
    keys = range(31)
    vals = ['1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 'L', 'N',
        'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'X', 'Y', 'Z']
    label_dict = dict(zip(keys, vals))
 
    return ''.join([label_dict[pred] for pred in predictions])
 
def main():
 
  dir = './VerifyCode/'
  correct = 0
  for i, file in enumerate(os.listdir(dir)):
    true_label = file.split('.')[0]
    VerifyCodePath = dir+file
    pred = predict(VerifyCodePath)
 
    if true_label == pred:
      correct += 1
    print(i+1, (true_label, pred), true_label == pred, correct)
 
  total = len(os.listdir(dir))
  print('\nTotal images: %d\nCorrectly recognized: %d\nAccuracy: %.2f%%.'\
     %(total, correct, correct*100/total))
 
if __name__ == '__main__':
  main()
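
The re-splitting rule in resplit and resplit_with_parts is easy to check in isolation. The sketch below reproduces only the arithmetic, using the same width thresholds (64, 48, 26) and the same integer step, and returns the column ranges that would be cut out of an image of a given width. The function names parts_for_width and column_ranges are hypothetical and no image I/O is performed.

```python
def parts_for_width(width):
    """Number of characters assumed to be stuck together in a crop of this width."""
    if width >= 64:
        return 4
    if width >= 48:
        return 3
    if width >= 26:
        return 2
    return 1  # narrow crops are left as a single character

def column_ranges(width):
    """Return the (start, end) column ranges of the equal-width pieces."""
    parts = parts_for_width(width)
    step = width // parts
    return [(i * step, (i + 1) * step) for i in range(parts)]

for w in (20, 30, 50, 66):
    print(w, column_ranges(w))
```

Because the step uses integer division, any leftover columns on the right edge are dropped; a 50-pixel crop, for example, yields three 16-pixel pieces and discards the last 2 columns.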

The prediction results of the CNN model are as follows:

Using TensorFlow backend.
2018-10-25 15:13:50.390130: I C:\tf_jenkins\workspace\rel-win\M\windows\PY\35\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
1 ('ZK6N', 'ZK6N') True 1
2 ('4JPX', '4JPX') True 2
3 ('5GP5', '5GP5') True 3
4 ('5RQ8', '5RQ8') True 4
5 ('5TQP', '5TQP') True 5
6 ('7S62', '7S62') True 6
7 ('8R2Z', '8R2Z') True 7
8 ('8RFV', '8RFV') True 8
9 ('9BBT', '9BBT') True 9
10 ('9LNE', '9LNE') True 10
11 ('67UH', '67UH') True 11
12 ('74UK', '74UK') True 12
13 ('A5T2', 'A5T2') True 13
14 ('AHYV', 'AHYV') True 14
15 ('ASEY', 'ASEY') True 15
16 ('B371', 'B371') True 16
17 ('CCQL', 'CCQL') True 17
18 ('CFD5', 'GFD5') False 17
19 ('CJLJ', 'CJLJ') True 18
20 ('D4QV', 'D4QV') True 19
21 ('DFQ8', 'DFQ8') True 20
22 ('DP18', 'DP18') True 21
23 ('E3HC', 'E3HC') True 22
24 ('E8VB', 'E8VB') True 23
25 ('DE1U', 'DE1U') True 24
26 ('FK1R', 'FK1R') True 25
27 ('FK91', 'FK91') True 26
28 ('FSKP', 'FSKP') True 27
29 ('FVZP', 'FVZP') True 28
30 ('GC6H', 'GC6H') True 29
31 ('GH62', 'GH62') True 30
32 ('H9FQ', 'H9FQ') True 31
33 ('H67Q', 'H67Q') True 32
34 ('HEKC', 'HEKC') True 33
35 ('HV2B', 'HV2B') True 34
36 ('J65Z', 'J65Z') True 35
37 ('JZCX', 'JZCX') True 36
38 ('KH5D', 'KH5D') True 37
39 ('KXD2', 'KXD2') True 38
40 ('1GDH', '1GDH') True 39
41 ('LCL3', 'LCL3') True 40
42 ('LNZR', 'LNZR') True 41
43 ('LZU5', 'LZU5') True 42
44 ('N5AK', 'N5AK') True 43
45 ('N5Q3', 'N5Q3') True 44
46 ('N96Z', 'N96Z') True 45
47 ('NCDG', 'NCDG') True 46
48 ('NELS', 'NELS') True 47
49 ('P96U', 'P96U') True 48
50 ('PD42', 'PD42') True 49
51 ('PECG', 'PEQG') False 49
52 ('PPZF', 'PPZF') True 50
53 ('PUUL', 'PUUL') True 51
54 ('Q2DN', 'D2DN') False 51
55 ('QCQ9', 'QCQ9') True 52
56 ('QDB1', 'QDBJ') False 52
57 ('QZUD', 'QZUD') True 53
58 ('R3T5', 'R3T5') True 54
59 ('S1YT', 'S1YT') True 55
60 ('SP7L', 'SP7L') True 56
61 ('SR2K', 'SR2K') True 57
62 ('SUP5', 'SVP5') False 57
63 ('T2SP', 'T2SP') True 58
64 ('U6V9', 'U6V9') True 59
65 ('UC9P', 'UC9P') True 60
66 ('UFYD', 'UFYD') True 61
67 ('V9NJ', 'V9NH') False 61
68 ('V35X', 'V35X') True 62
69 ('V98F', 'V98F') True 63
70 ('VD28', 'VD28') True 64
71 ('YGHE', 'YGHE') True 65
72 ('YNKD', 'YNKD') True 66
73 ('YVXV', 'YVXV') True 67
74 ('ZFBS', 'ZFBS') True 68
75 ('ET6X', 'ET6X') True 69
76 ('TKVC', 'TKVC') True 70
77 ('2UCU', '2UCU') True 71
78 ('HNBK', 'HNBK') True 72
79 ('X8FD', 'X8FD') True 73
80 ('ZGNX', 'ZGNX') True 74
81 ('LQCU', 'LQCU') True 75
82 ('JNZY', 'JNZVY') False 75
83 ('RX34', 'RX34') True 76
84 ('811E', '811E') True 77
85 ('ETDX', 'ETDX') True 78
86 ('4CPR', '4CPR') True 79
87 ('FE91', 'FE91') True 80
88 ('B7XH', 'B7XH') True 81
89 ('1RUA', '1RUA') True 82
90 ('UBCX', 'UBCX') True 83
91 ('KVT5', 'KVT5') True 84
92 ('HZ3A', 'HZ3A') True 85
93 ('3XLR', '3XLR') True 86
94 ('VC7T', 'VC7T') True 87
95 ('7PG1', '7PQ1') False 87
96 ('4F21', '4F21') True 88
97 ('3HLJ', '3HLJ') True 89
98 ('1KT7', '1KT7') True 90
99 ('1RHE', '1RHE') True 91
100 ('1TTA', '1TTA') True 92

Total images: 100
Correctly recognized: 92
Accuracy: 92.00%.

As shown above, the trained CNN model recognized 92 of the 100 new CAPTCHAs correctly, an accuracy of 92%.
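
A rough back-of-the-envelope check relates this figure to the per-character accuracy seen during training. A CAPTCHA counts as correct only when all four of its characters are right, so if character errors were independent and segmentation never failed (both are simplifying assumptions, and row 82 above shows a case with five predicted characters where segmentation did fail), the whole-CAPTCHA accuracy would be roughly the per-character accuracy raised to the fourth power.

```python
per_char_accuracy = 0.9760   # val_acc from the final training epoch in the log
chars_per_captcha = 4        # every CAPTCHA in the test run has four characters

# Probability that all four characters are right, assuming independent errors
expected = per_char_accuracy ** chars_per_captcha
observed = 92 / 100          # whole-CAPTCHA accuracy reported above

print('expected: %.4f' % expected)
print('observed: %.4f' % observed)
```

The estimate comes out at about 0.91, which is in line with the observed 92%.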

Demo and dataset download: CNN_4_Verifycode_jb51.rar
