
Building Autoencoders with Keras

Updated: 2020-07-03 09:11:21  Author: 經年不往

This article walks through building autoencoders with Keras and is intended as a practical reference.

Introduction:

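Traditional machine learning tasks depend heavily on good feature engineering, but feature engineering is time-consuming and labor-intensive. Extracting effective features from images, speech, and video is harder still: engineers need a deep understanding of these domains and must use specialized algorithms to extract features from the data. Deep learning addresses the problem of features that are hard to extract by hand and greatly reduces a machine learning model's dependence on feature engineering.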

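In its early days, deep learning was regarded as an unsupervised feature learning process that mimics the way the human brain abstracts features layer by layer. Two points matter here: unsupervised learning and layer-wise training. Take image recognition as an example: suppose we have many pictures of cars; how can a computer recognize them? If we train a classifier directly on pixels, most algorithms struggle. If we instead extract higher-order features, such as the wheels, windows, and body of a car, we can classify the images very accurately. Higher-order features are in turn composed of lower-level ones, and learning this hierarchy is the feature learning that takes place during deep learning training.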

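Researchers found early on that a small number of basic features can be combined into more abstract, higher-level features; this is what is usually called a sparse representation of features. For image tasks, an original image can be assembled from a relatively small number of image patches. For speech recognition, most sounds can be obtained as linear combinations of a few basic structures. For face recognition, parts such as the nose, mouth, eyebrows, and eyes can be combined into different styles of faces, and the model then recognizes a face by matching these styles in the picture. In a deep neural network, each layer treats the previous layer's output as its raw input (pixels, in the case of the first layer) and organizes it into higher-order features, which is the process of linearly combining image patches described above.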
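
As a rough illustration of the sparse representation idea above, the sketch below rebuilds a short signal from a tiny dictionary of basic patterns using only two nonzero coefficients. The dictionary `ATOMS`, the coefficients, and the helper `combine` are made up for illustration and do not come from the original article.

```python
# Hypothetical dictionary of four basic patterns ("atoms"), each of length 4.
ATOMS = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
]


def combine(coefficients, atoms):
    """Return the linear combination sum_k coefficients[k] * atoms[k]."""
    length = len(atoms[0])
    signal = [0.0] * length
    for weight, atom in zip(coefficients, atoms):
        if weight == 0.0:
            continue  # inactive atoms contribute nothing
        for j in range(length):
            signal[j] += weight * atom[j]
    return signal


# Sparse code: only two of the four coefficients are nonzero.
sparse_code = [2.0, 0.0, 0.5, 0.0]
signal = combine(sparse_code, ATOMS)
active = sum(1 for w in sparse_code if w != 0.0)
print(signal, "built from", active, "of", len(ATOMS), "atoms")
```

An autoencoder with a sparsity constraint learns both the dictionary and the sparse codes from data, rather than having them fixed by hand as in this toy example.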

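Given that features can be abstracted again and again into higher-level features, how do we find the basic structures, and how do we abstract them? This is where the unsupervised autoencoder comes in. As the name suggests, an autoencoder encodes its input using the input's own higher-order features, and its target output is the same as its input. The basic idea is therefore to reconstruct the input by recombining a small number of sparse higher-order features. Autoencoders came to prominence when Hinton published a paper in Science that used them for dimensionality reduction. Hinton also proposed a greedy, unsupervised, layer-wise training algorithm based on deep belief networks, which offered a workable way to train very deep networks. A deep belief network extracts features through layer-wise training so that the network weights are initialized to a good starting point before the supervised task begins, an idea very close to that of the autoencoder. Building on this, researchers have proposed many autoencoder variants, such as sparse autoencoders and denoising autoencoders.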
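
The reconstruction idea described above is commonly written as an optimization problem. The symbols used here ($f$ for the encoder, $g$ for the decoder, $L$ for the reconstruction loss, and $\Omega$ for an optional sparsity penalty with weight $\lambda$) are notation introduced for this note, not taken from the original article.

```latex
h = f(x), \qquad \hat{x} = g(h), \qquad
\min_{f,\,g} \; \frac{1}{N} \sum_{i=1}^{N}
\Bigl[ L\bigl(x_i,\, g(f(x_i))\bigr) + \lambda\,\Omega\bigl(f(x_i)\bigr) \Bigr]
```

With $\lambda = 0$ this is a plain autoencoder; a sparse autoencoder uses $\lambda > 0$, for example with $\Omega(h) = \sum_j |h_j|$.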

This article uses the Keras deep learning library to implement a simple autoencoder, a deep (sparse) autoencoder, and a convolutional autoencoder on the MNIST dataset.

Uses of autoencoders:

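Autoencoders currently have two main applications: data denoising and dimensionality reduction for visualization. With a suitable code dimension and sparsity constraints, an autoencoder can learn data projections that are more interesting than those produced by PCA and similar techniques. Autoencoders are also fairly widely used for modeling features shared across data.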
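
For the denoising use case, a common recipe is to corrupt the input and train the network to recover the clean version. The sketch below shows only the corruption step, applied to a few pixel values already scaled to [0, 1]. The helper `add_noise`, the `noise_factor` value, and the sample pixels are illustrative assumptions; the listings in this article do not include a denoising model.

```python
import random


def add_noise(pixels, noise_factor=0.5, seed=0):
    """Add Gaussian noise to pixels in [0, 1] and clip the result back to [0, 1]."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    noisy = []
    for value in pixels:
        corrupted = value + noise_factor * rng.gauss(0.0, 1.0)
        noisy.append(min(1.0, max(0.0, corrupted)))
    return noisy


clean = [0.0, 0.2, 0.5, 0.8, 1.0]
noisy = add_noise(clean)
print(noisy)
```

A denoising autoencoder would then be fitted with the noisy values as input and the clean values as target, instead of using the same array for both.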

1. Simple autoencoder


from keras.layers import Input, Dense
from keras.models import Model
from keras.datasets import mnist
import numpy as np
import matplotlib.pyplot as plt
 
(x_train, _), (x_test, _) = mnist.load_data()
 
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))
print(x_train.shape)
print(x_test.shape)
 
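encoding_dim = 32  # size of the encoded representation (784 -> 32)
input_img = Input(shape=(784,))

# A single fully connected encoder layer and a single decoder layer
encoded = Dense(encoding_dim, activation='relu')(input_img)
decoded = Dense(784, activation='sigmoid')(encoded)

# Full autoencoder (input -> reconstruction) and standalone encoder (input -> code)
autoencoder = Model(inputs=input_img, outputs=decoded)
encoder = Model(inputs=input_img, outputs=encoded)

# Standalone decoder that reuses the last layer of the autoencoder
encoded_input = Input(shape=(encoding_dim,))
decoder_layer = autoencoder.layers[-1]
decoder = Model(inputs=encoded_input, outputs=decoder_layer(encoded_input))

autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')

# Input and target are both x_train: the network learns to reconstruct its input
autoencoder.fit(x_train, x_train, epochs=50, batch_size=256,
                shuffle=True, validation_data=(x_test, x_test))

# Encode the test images, then decode them back to 784-pixel vectors
encoded_imgs = encoder.predict(x_test)
decoded_imgs = decoder.predict(encoded_imgs)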
 
n = 10  # how many digits we will display
plt.figure(figsize=(20, 4))
for i in range(n):
    # display original
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(x_test[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

    # display reconstruction
    ax = plt.subplot(2, n, i + 1 + n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()
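
The listing above compiles the model with `binary_crossentropy`, which treats each pixel value in [0, 1] as a target probability. The sketch below computes that quantity for one tiny "image" in plain Python, as a reading aid. The helper `binary_crossentropy` and the `eps` clipping value are illustrative choices made here; Keras's own implementation may differ in details such as the clipping constant.

```python
import math


def binary_crossentropy(target, predicted, eps=1e-7):
    """Mean per-pixel binary cross-entropy between two equal-length sequences."""
    total = 0.0
    for t, p in zip(target, predicted):
        p = min(1.0 - eps, max(eps, p))  # keep log() away from 0
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(target)


original = [0.0, 1.0, 1.0, 0.0]
good_reconstruction = [0.05, 0.95, 0.9, 0.1]
poor_reconstruction = [0.5, 0.5, 0.5, 0.5]

print(binary_crossentropy(original, good_reconstruction))
print(binary_crossentropy(original, poor_reconstruction))  # ln(2) for an all-0.5 guess
```

The closer the reconstruction is to the original pixels, the smaller this value, which is what training drives down.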

Test results:

2. Deep autoencoder and sparse autoencoder

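To reduce the high reconstruction loss of the single-layer autoencoder, the autoencoder can be built from multiple layers. Adding a sparsity constraint on the hidden units yields a more compact representation in which only a small fraction of the neurons are active. In Keras, the activations of a layer can be constrained by adding an activity_regularizer to it. Note that the listing below builds a plain deep autoencoder and does not apply an activity_regularizer.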
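
To make the sparsity constraint concrete: an L1 activity regularizer adds a term proportional to the sum of the absolute hidden activations to the training loss, which pushes most activations toward zero. The plain-Python sketch below shows that arithmetic only. The helpers `l1_activity_penalty` and `total_loss`, the penalty weight, and the sample activations are assumptions for illustration. In Keras, the corresponding setup would pass something like `activity_regularizer=regularizers.l1(1e-5)` to a `Dense` layer.

```python
def l1_activity_penalty(activations, weight=1e-5):
    """L1 penalty on a layer's activations: weight * sum(|a|)."""
    return weight * sum(abs(a) for a in activations)


def total_loss(reconstruction_loss, activations, weight=1e-5):
    """Reconstruction loss plus the sparsity penalty on the hidden activations."""
    return reconstruction_loss + l1_activity_penalty(activations, weight)


dense_code = [0.9, 0.7, 0.8, 0.6]   # every unit active
sparse_code = [0.0, 2.4, 0.0, 0.0]  # a single active unit

# With the same reconstruction loss, the sparse code is penalized less.
print(total_loss(0.1, dense_code, weight=0.01))
print(total_loss(0.1, sparse_code, weight=0.01))
```

The penalty weight trades reconstruction quality against sparsity, so it usually needs tuning for the dataset at hand.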

import numpy as np 
np.random.seed(1337) # for reproducibility 
 
from keras.datasets import mnist 
from keras.models import Model  # functional API model class
from keras.layers import Dense, Input 
import matplotlib.pyplot as plt 
 
# x_train shape: (60000, 28, 28); y_test shape: (10000,)
(x_train, _), (x_test, y_test) = mnist.load_data()

# Data preprocessing
x_train = x_train.astype('float32') / 255.  # scale pixels to [0, 1]
x_test = x_test.astype('float32') / 255.  # scale pixels to [0, 1]
x_train = x_train.reshape((x_train.shape[0], -1)) 
x_test = x_test.reshape((x_test.shape[0], -1)) 
print(x_train.shape) 
print(x_test.shape) 
 
# Compress the features down to 2 dimensions
encoding_dim = 2

# this is our input placeholder
input_img = Input(shape=(784,))

# Encoder layers: 784 -> 128 -> 64 -> 10 -> 2
encoded = Dense(128, activation='relu')(input_img)
encoded = Dense(64, activation='relu')(encoded)
encoded = Dense(10, activation='relu')(encoded)
encoder_output = Dense(encoding_dim)(encoded)

# Decoder layers: 2 -> 10 -> 64 -> 128 -> 784
decoded = Dense(10, activation='relu')(encoder_output)
decoded = Dense(64, activation='relu')(decoded)
decoded = Dense(128, activation='relu')(decoded)
decoded = Dense(784, activation='tanh')(decoded)

# Build the full autoencoder model
autoencoder = Model(inputs=input_img, outputs=decoded)

# Build the encoder model
encoder = Model(inputs=input_img, outputs=encoder_output)
 
# compile autoencoder 
autoencoder.compile(optimizer='adam', loss='mse') 
 
autoencoder.summary()
encoder.summary()
 
# training 
autoencoder.fit(x_train, x_train, epochs=10, batch_size=256, shuffle=True) 
 
# plotting 
encoded_imgs = encoder.predict(x_test) 
 
plt.scatter(encoded_imgs[:, 0], encoded_imgs[:, 1], c=y_test,s=3) 
plt.colorbar() 
plt.show() 
 
decoded_imgs = autoencoder.predict(x_test)

n = 10  # how many digits we will display
plt.figure(figsize=(20, 4))
for i in range(n):
    # display original
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(x_test[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

    # display reconstruction
    ax = plt.subplot(2, n, i + 1 + n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()

Results:

3. Convolutional autoencoder

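The encoder of a convolutional autoencoder consists of convolution layers and MaxPooling layers, where MaxPooling performs the spatial downsampling. The decoder consists of convolution layers and upsampling layers.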
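
To see how downsampling and upsampling fit together, the sketch below traces the spatial size of a 28×28 input through the layer stack used in this section's listing. The helper names are made up here; the size formulas are the standard ones for stride-1 convolutions and for 2×2 pooling with 'same' padding.

```python
import math


def conv_size(size, kernel=3, padding="same"):
    """Output size of a stride-1 convolution along one spatial axis."""
    return size if padding == "same" else size - kernel + 1


def pool_size(size, pool=2):
    """Output size of max pooling with stride == pool and 'same' padding."""
    return math.ceil(size / pool)


def upsample_size(size, factor=2):
    """Output size of nearest-neighbour upsampling."""
    return size * factor


size = 28
trace = [size]
# Encoder: three (conv 'same' -> pool) stages.
for _ in range(3):
    size = pool_size(conv_size(size))
    trace.append(size)
# Decoder: conv 'same', conv 'same', conv 'valid', each followed by upsampling.
for padding in ("same", "same", "valid"):
    size = upsample_size(conv_size(size, padding=padding))
    trace.append(size)

print(trace)  # [28, 14, 7, 4, 8, 16, 28]
```

The 'valid' convolution in the third decoder stage is what turns 16 into 14, so that the last upsampling lands on 28 again; with 'same' padding there, the output would be 32×32 and would no longer match the input.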

from keras.layers import Input, Convolution2D, MaxPooling2D, UpSampling2D
from keras.models import Model
from keras.datasets import mnist
import numpy as np
 
(x_train, _), (x_test, _) = mnist.load_data()
 
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
print('---> x_train shape before reshape: ', x_train.shape)
# Add a trailing channel axis: (N, 28, 28) -> (N, 28, 28, 1)
x_train = np.reshape(x_train, (len(x_train), 28, 28, 1))
x_test = np.reshape(x_test, (len(x_test), 28, 28, 1))
print('---> x_train shape: ', x_train.shape)
print('---> x_test shape: ', x_test.shape)
input_img = Input(shape=(28, 28, 1))
 
# Encoder: 28x28x1 -> 14x14x16 -> 7x7x8 -> 4x4x8
x = Convolution2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Convolution2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Convolution2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)

# Decoder: 4x4x8 -> 8x8x8 -> 16x16x8 -> 14x14x16 -> 28x28x16 -> 28x28x1
x = Convolution2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Convolution2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
# No padding='same' here on purpose: the 'valid' 3x3 convolution shrinks
# 16x16 to 14x14, so the final upsampling restores 28x28
x = Convolution2D(16, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Convolution2D(1, (3, 3), activation='sigmoid', padding='same')(x)
 
autoencoder = Model(inputs=input_img, outputs=decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
 
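# The original note suggests opening a terminal and starting TensorBoard with
#   tensorboard --logdir=/autoencoder
# That only shows training data if a TensorBoard callback is passed to fit(),
# which this listing does not do.
autoencoder.fit(x_train, x_train, epochs=10, batch_size=256,
                shuffle=True, validation_data=(x_test, x_test))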
 
import matplotlib.pyplot as plt

decoded_imgs = autoencoder.predict(x_test)

n = 10
plt.figure(figsize=(20, 4))
for i in range(1, n + 1):
    # display original
    ax = plt.subplot(2, n, i)
    plt.imshow(x_test[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

    # display reconstruction
    ax = plt.subplot(2, n, i + n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()

Training results: