Building Single-Layer and Multi-Layer Bi-LSTM Models with TensorFlow 2.4
Preface
This article uses the CPU version of TensorFlow 2.4 to build a single-layer Bi-LSTM model and a multi-layer Bi-LSTM model for a text classification task.
Make sure your numpy version is around 1.19.0; otherwise calling TextVectorization may raise a NotImplementedError.
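If you are not sure which versions your environment has, a quick check like the following can save some debugging time (a minimal sketch; the version pins are just what this article assumes):

import numpy as np
import tensorflow as tf

# This article assumes TensorFlow 2.4 and numpy around 1.19.0.
print('numpy:', np.__version__)
print('tensorflow:', tf.__version__)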
Implementation
1. Getting the data
(1) The data used in this article are movie reviews. Each sample contains a review text and a sentiment label, where 1 marks a positive review and 0 a negative one, so this is a binary classification dataset.
(2) Using the tensorflow_datasets (tfds) loader, we can download the imdb_reviews dataset directly to local disk and take out the training and test splits.
(3) With the tf.data.Dataset processing functions, we shuffle the training data and batch both the training and test data with a batch size of 64, keeping each sample in (text, label) form. Below we take the first two review texts and sentiment labels from one batch.
import numpy as np
import tensorflow_datasets as tfds
import tensorflow as tf
import matplotlib.pyplot as plt

tfds.disable_progress_bar()

BUFFER_SIZE = 10000
BATCH_SIZE = 64

dataset, info = tfds.load('imdb_reviews', with_info=True, as_supervised=True)
train_dataset, test_dataset = dataset['train'], dataset['test']
train_dataset = train_dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)
test_dataset = test_dataset.batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)

for example, label in train_dataset.take(1):
    print('text: ', example.numpy()[:2])
    print('label: ', label.numpy()[:2])
A couple of the samples:
text: [
b"First of all, I have to say I have worked for blockbuster and have seen quite a few movies to the point its tough for me to find something I haven't seen. Taking this into account, I want everyone to know that this movie was by far the worst film ever made, it made me pine for Gigli, My Boss's Daughter, and any other piece of junk you've ever seen. BeLyt must be out of his mind, I've only found one person who liked it and even they couldn't tell me what the movie was about. If you are able to decipher this movie and are able to tell me what it was about you have to either be the writer or a fortune teller because there's any other way a person could figure this crap out.<br /><br />FOR THE LOVE OF G-D STAY AWAY!"
b"Just got out and cannot believe what a brilliant documentary this is. Rarely do you walk out of a movie theater in such awe and amazement. Lately movies have become so over hyped that the thrill of discovering something truly special and unique rarely happens. Amores Perros did this to me when it first came out and this movie is doing to me now. I didn't know a thing about this before going into it and what a surprise. If you hear the concept you might get the feeling that this is one of those touchy movies about an amazing triumph covered with over the top music and trying to have us fully convinced of what a great story it is telling but then not letting us in. Fortunetly this is not that movie. The people tell the story! This does such a good job of capturing every moment of their involvement while we enter their world and feel every second with them. There is so much beyond the climb that makes everything they go through so much more tense. Touching the Void was also a great doc about mountain climbing and showing the intensity in an engaging way but this film is much more of a human story. I just saw it today but I will go and say that this is one of the best documentaries I have ever seen."
]
label: [0 1]
2. Processing the data
(1) Before these data can be used to train the model, every token in the text has to be converted to an integer the machine can work with. The simplest way is to build an encoder with TextVectorization. Here only the 1000 most frequent tokens are kept as the vocabulary, and each processed review is limited to a maximum length of 200: longer reviews are truncated, and shorter ones are padded with 0, the integer reserved for the padding token.
(2) Below is the result of one sample after integer mapping; you can see that the review maps to an integer array of length 200.
MAX_SEQ_LENGTH = 200
VOCAB_SIZE = 1000

encoder = tf.keras.layers.experimental.preprocessing.TextVectorization(
    max_tokens=VOCAB_SIZE,
    output_sequence_length=MAX_SEQ_LENGTH)
encoder.adapt(train_dataset.map(lambda text, label: text))
vocab = np.array(encoder.get_vocabulary())

encoded_example = encoder(example)[:1].numpy()
print(encoded_example)
print(label[:1])
The integer-mapping result for one sample:
[[ 86 5 32 10 26 6 130 10 26 926 16 1 3 26 108 176 4 164
93 6 2 215 30 1 16 70 6 160 140 10 731 108 647 11 78 1
10 178 305 6 118 12 11 18 14 33 234 2 240 20 122 91 9 91
70 1 16 1 56 1 580 3 99 81 408 5 1 825 122 108 1 217
28 46 5 25 349 195 61 249 29 409 37 405 9 3 54 35 404 360
70 49 2 18 14 43 45 23 24 491 6 1 11 18 3 24 491 6
360 70 49 9 14 43 23 26 6 352 28 2 762 42 4 1 1 80
213 99 81 97 4 409 96 811 11 638 1 13 16 2 116 5 1 766
242 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0]]
tf.Tensor([0], shape=(1,), dtype=int64)
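Because we saved the vocabulary in vocab above, we can also map the integers back to tokens as a sanity check. This snippet is a small addition of our own on top of the article's code; index 0 of the vocabulary is the padding token '' and index 1 is the out-of-vocabulary marker '[UNK]':

# Decode the first encoded review back into tokens using the adapted vocabulary.
decoded_tokens = vocab[encoded_example[0]]
print(' '.join(decoded_tokens))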
3. Single-layer Bi-LSTM model
(1) The first layer is the encoder we just defined, which maps the input text to integers.
(2) The second layer is an Embedding layer; here each word gets a 32-dimensional embedding.
(3) The third layer is the Bi-LSTM layer; each LSTM unit outputs a 16-dimensional vector, so the concatenated bidirectional output is 32-dimensional.
(4) The fourth layer is a fully connected layer that outputs an 8-dimensional vector and uses the relu activation function.
(5) The fifth layer is Dropout with a drop rate of 0.5, mainly to prevent overfitting.
(6) The sixth layer is a fully connected layer that outputs a 1-dimensional vector; this is the output layer, and its value is the sample's logit.
model = tf.keras.Sequential([
    encoder,
    tf.keras.layers.Embedding(
        input_dim=len(encoder.get_vocabulary()),
        output_dim=32,
        mask_zero=True),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(16)),
    tf.keras.layers.Dense(8, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1)
])
(7) Predicting on a piece of text with the untrained model: an output below 0 means a negative review, and an output above 0 a positive one. This review should clearly be positive, yet the output logit is negative, so it is wrongly predicted as negative.
sample_text = ('The movie was cool. The animation and the graphics were out of this world. I would recommend this movie.')
model.predict(np.array([sample_text]))
The prediction is:
array([[-0.01437075]], dtype=float32)
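Because the final Dense layer has no activation, this number is a raw logit. If you prefer to read it as a probability, you can push it through a sigmoid yourself; a minimal sketch (the variable names here are our own):

# Convert the raw logit into a probability in [0, 1];
# values below 0.5 correspond to the negative class.
logit = model.predict(np.array([sample_text]))
probability = tf.sigmoid(logit).numpy()
print(probability)  # about 0.496 for the logit above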
(8) We use BinaryCrossentropy as the loss function. Note that if the model output fed to BinaryCrossentropy is a logit (with range (-∞, +∞)), you should set from_logits=True; if it is a probability (with range [0, 1]), you should set from_logits=False.
(9) We use Adam as the optimizer, with the learning rate set to 1e-3.
(10) We use accuracy as the evaluation metric.
(11) We train on the training data for 10 epochs and evaluate the model on the validation data after each epoch.
model.compile(loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              optimizer=tf.keras.optimizers.Adam(1e-3),
              metrics=['accuracy'])
history = model.fit(train_dataset,
                    epochs=10,
                    validation_data=test_dataset,
                    validation_steps=30)
The training process is as follows:
Epoch 1/10
391/391 [==============================] - 30s 65ms/step - loss: 0.6461 - accuracy: 0.5090 - val_loss: 0.4443 - val_accuracy: 0.8245
Epoch 2/10
391/391 [==============================] - 23s 58ms/step - loss: 0.4594 - accuracy: 0.6596 - val_loss: 0.3843 - val_accuracy: 0.8396
...
Epoch 10/10
391/391 [==============================] - 22s 57ms/step - loss: 0.3450 - accuracy: 0.8681 - val_loss: 0.3920 - val_accuracy: 0.8417
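The matplotlib import at the top has not been used yet; since fit returned a History object, a quick sketch like the following visualizes the training curves (the plot styling is our own choice, not part of the original article):

# Plot training vs. validation accuracy from the History object.
plt.plot(history.history['accuracy'], label='accuracy')
plt.plot(history.history['val_accuracy'], label='val_accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()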
(12) After training, we test the model on the test data; accuracy reaches 0.8319. With hyperparameter tuning and enough training time, the result would be even better.
model.evaluate(test_dataset)
The result:
391/391 [==============================] - 6s 15ms/step - loss: 0.3964 - accuracy: 0.8319
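Note that evaluate returns the loss followed by the compiled metrics, so you can also capture them as Python values; a small sketch:

# evaluate() returns [loss, accuracy] in the order they were compiled.
test_loss, test_acc = model.evaluate(test_dataset)
print('Test loss:', test_loss)
print('Test accuracy:', test_acc)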
(13) Using the trained model to classify a review, we can see that it now correctly identifies the sentiment of the text: the negative number means the review is predicted as negative.
sample_text = ('The movie was not cool. The animation and the graphics were bad. I would not recommend this movie.')
model.predict(np.array([sample_text]))
The result:
array([[-1.6402857]], dtype=float32)
4. Multi-layer Bi-LSTM model
(1) Above we built only a single Bi-LSTM layer; here we build a two-layer Bi-LSTM model, i.e. another Bidirectional layer is added after the first one. Note that the first Bi-LSTM is created with return_sequences=True so that it outputs the full sequence the second Bi-LSTM needs as input. This can make the model more expressive. The loss function, optimizer, and evaluation metric are the same as above.
model = tf.keras.Sequential([
    encoder,
    tf.keras.layers.Embedding(len(encoder.get_vocabulary()), 32, mask_zero=True),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32, return_sequences=True)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(16)),
    tf.keras.layers.Dense(8, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1)
])
model.compile(loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              optimizer=tf.keras.optimizers.Adam(1e-3),
              metrics=['accuracy'])
history = model.fit(train_dataset,
                    epochs=10,
                    validation_data=test_dataset,
                    validation_steps=30)
The training process is as follows:
Epoch 1/10
391/391 [==============================] - 59s 124ms/step - loss: 0.6170 - accuracy: 0.5770 - val_loss: 0.3931 - val_accuracy: 0.8135
Epoch 2/10
391/391 [==============================] - 45s 114ms/step - loss: 0.4264 - accuracy: 0.7544 - val_loss: 0.3737 - val_accuracy: 0.8380
...
Epoch 10/10
391/391 [==============================] - 45s 114ms/step - loss: 0.3138 - accuracy: 0.8849 - val_loss: 0.4069 - val_accuracy: 0.8323
(2) After training, we test the model on the test data; accuracy reaches 0.8217. With hyperparameter tuning and enough training time, the result would be even better.
model.evaluate(test_dataset)
The result:
391/391 [==============================] - 14s 35ms/step - loss: 0.4021 - accuracy: 0.8217
(3) Using the trained model to classify a review, we can see that it correctly identifies the sentiment of the text: the positive number means the review is predicted as positive.
sample_text = ('The movie was good. The animation and the graphics were very good. you should love movie.')
model.predict(np.array([sample_text]))
The result:
array([[3.571126]], dtype=float32)
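If you would rather have the model output probabilities directly, one option is to append a sigmoid on top of the trained model. This wrapper is a sketch of our own, not part of the original article:

# Wrap the trained logit model so it emits probabilities directly.
prob_model = tf.keras.Sequential([model, tf.keras.layers.Activation('sigmoid')])
texts = np.array(['The movie was good. The animation and the graphics were very good. you should love movie.',
                  'The movie was not cool. I would not recommend this movie.'])
print(prob_model.predict(texts))  # values above 0.5 indicate a positive review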
That concludes the detailed walkthrough of building single-layer and multi-layer Bi-LSTM models with TensorFlow 2.4.