Recognizing MNIST Handwritten Digits with Deep Learning (99% Accuracy, Code Included)

Updated: 2024-06-25 16:52:47  Author: 什么都不太会的研究生

This article walks through recognizing the MNIST handwritten digit dataset with deep learning (99% accuracy, code included). I hope it serves as a useful reference; if anything is wrong or not fully thought through, corrections are welcome.

1. About the MNIST dataset

1.1 Overview

MNIST is probably the dataset most often used when learning deep learning.

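The dataset contains 70,000 images of handwritten digits, split into 60,000 training images and 10,000 test images. The training set consists of digits written by 250 different people, half of them high school students and half working staff. The test set has the same proportions, and its writers are guaranteed to be different from those of the training set.

Each image is 28×28 pixels, and the dataset can store one image as a one-dimensional vector of 28×28 = 784 values.

Each image shows a handwritten digit from 0 to 9 in white on a black background. When stored as floats, black is 0 and the strokes are values between 0 and 1, with pure white at 1 (the raw files hold bytes from 0 to 255).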
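To make the 28×28 = 784 arithmetic and the 0 to 1 scaling concrete, here is a minimal sketch with NumPy. The image is a made-up stand-in (a white bar on a black background), not a real MNIST sample, and the variable names are only for illustration.

```python
import numpy as np

# A stand-in 28x28 grayscale image with raw byte values (0 = black, 255 = white).
# The pixel values are invented for illustration, not taken from MNIST.
raw = np.zeros((28, 28), dtype=np.uint8)
raw[10:18, 12:16] = 255  # a white vertical bar on a black background

# Scale the bytes to floats in [0, 1].
scaled = raw.astype(np.float32) / 255.0

# Flatten the 28x28 grid row by row into a vector of length 784.
flat = scaled.reshape(-1)

print(flat.shape)              # (784,)
print(flat.min(), flat.max())  # 0.0 1.0
```

Pixel (row 10, column 12) of the grid ends up at position 10 * 28 + 12 in the flat vector, which is all that "one-dimensional vector" means here.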

1.2 Downloading the dataset

1) Download from the official site

The MNIST dataset is available at: http://yann.lecun.com/exdb/mnist/

The page offers four files:

Training images: train-images-idx3-ubyte.gz
Training labels: train-labels-idx1-ubyte.gz
Test images: t10k-images-idx3-ubyte.gz
Test labels: t10k-labels-idx1-ubyte.gz

Download these four files and place them in the folder where they will be used. Do not decompress them! Leave them exactly as downloaded.
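If you are curious what is inside the .gz files without unpacking them on disk, the sketch below reads only the header of an IDX file through Python's gzip module. It assumes the IDX layout described on the MNIST page (a big-endian 4-byte magic number whose last byte is the number of dimensions, followed by one 4-byte size per dimension). The helper name read_idx_header and the demo file are hypothetical; the demo file is generated on the spot so the snippet runs without the real download.

```python
import gzip
import os
import struct
import tempfile

def read_idx_header(path):
    """Return (magic, dims) from a gzip-compressed IDX file, read in place."""
    with gzip.open(path, "rb") as f:
        # Bytes 0-1 are zero, byte 2 is the data type, byte 3 is the number of dimensions.
        magic = struct.unpack(">I", f.read(4))[0]
        ndim = magic & 0xFF
        dims = struct.unpack(">" + "I" * ndim, f.read(4 * ndim))
    return magic, dims

# Build a tiny stand-in file: an image header followed by two all-black 28x28 images.
demo_path = os.path.join(tempfile.mkdtemp(), "demo-images-idx3-ubyte.gz")
with gzip.open(demo_path, "wb") as f:
    f.write(struct.pack(">IIII", 2051, 2, 28, 28))
    f.write(bytes(2 * 28 * 28))

magic, dims = read_idx_header(demo_path)
print(magic, dims)  # 2051 (2, 28, 28)
```

Pointing the same function at train-images-idx3-ubyte.gz should report the image count and the 28 by 28 size, which is a quick way to check that a download is intact.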

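2) Load via code

Run the code below from your project folder. It checks whether the dataset already exists and downloads it automatically if it is missing, placing the files under the data folder.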

# Download the dataset (skipped if it is already present under ./data)
from torchvision import datasets, transforms

train_set = datasets.MNIST("data", train=True, download=True, transform=transforms.ToTensor())
test_set = datasets.MNIST("data", train=False, download=True, transform=transforms.ToTensor())

Parameters:

datasets.MNIST: the torchvision.datasets.MNIST dataset class (part of torchvision, not of PyTorch core), which loads the dataset
train=True: load the training split; train=False loads the test split
transform: the preprocessing we define ourselves, applied to each image as it is read
download=True: download the dataset automatically when it is not found under the root directory

When the data is downloaded automatically this way, the files are stored at: project folder/data/MNIST/raw

2. Code

2.1 Folder layout

test: test images I drew myself
main: the main script
model: the trained model parameters, generated automatically
data: the dataset folder

2.2 Results

After about 14 epochs, the model's recognition accuracy is above 99%.

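2.3 Code

1) Import the required packages and define the preprocessing

I wrote a lot of comments while learning, and I used files that were already downloaded. If you use your own, change the file paths accordingly.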

import os
import matplotlib.pyplot as plt
import torch
from PIL import Image
from torch import nn
from torch.nn import Conv2d, Linear, ReLU
from torch.nn import MaxPool2d
from torchvision import transforms
from torchvision.datasets import MNIST
from torch.utils.data import DataLoader


# Dataset: base class for building a dataset; __init__ sets up the data and labels
# __getitem__: returns one sample and its label
# __len__: returns the size of the dataset
# DataLoader: loading class that batches samples from an already constructed Dataset
# torchvision: vision library with pretrained models, dataset loaders, and image transforms (crop, rotate, etc.)
# torchtext: text-processing toolkit that turns various file types into datasets

# Preprocessing: Compose chains several steps into one transform.
# Pass the steps as a list so that they run in the written order.
transform = transforms.Compose([
    transforms.ToTensor(),  # converts grayscale pixel values (0~255) to a Tensor in 0~1
    # transforms.Normalize((0.1307,), (0.3081,)),  # standardize per channel with mean and std: (x - mean) / std
])

2) Load the dataset

# Load the datasets
# Training set
train_data = MNIST(root='./data', train=True, transform=transform, download=True)
train_loader = DataLoader(dataset=train_data, batch_size=64, shuffle=True)
# transform: the preprocessing applied to each loaded sample; shuffle: whether to shuffle the sample order
# Test set
test_data = MNIST(root="./data", train=False, transform=transform, download=True)
test_loader = DataLoader(dataset=test_data, batch_size=64, shuffle=True)

train_data_size = len(train_data)
test_data_size = len(test_data)
print("Training set size: {}".format(train_data_size))
print("Test set size: {}".format(test_data_size))
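To see what the DataLoader hands to the training loop, here is a small sketch on synthetic data, so it runs without downloading anything. The 1000 random "images" are an assumption standing in for MNIST; only the shapes and the batch count matter.

```python
import math
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for MNIST: 1000 random images with random labels.
images = torch.rand(1000, 1, 28, 28)
labels = torch.randint(0, 10, (1000,))
loader = DataLoader(TensorDataset(images, labels), batch_size=64, shuffle=True)

first_images, first_labels = next(iter(loader))
print(first_images.shape, first_labels.shape)  # torch.Size([64, 1, 28, 28]) torch.Size([64])

# The number of batches is the dataset size divided by the batch size, rounded up.
print(len(loader), math.ceil(1000 / 64))  # 16 16
print(math.ceil(60000 / 64))              # 938
```

By the same arithmetic, the real 60,000-image training set with batch_size=64 gives 938 batches per epoch, the last one being smaller than 64.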

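3) Build the model

If it runs for you, please leave a free like! (Debugging wasn't easy.)

The model consists mainly of two convolutional layers, two pooling layers, and three fully connected layers, with ReLU as the activation function.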

class MnistModel(nn.Module):
    def __init__(self):
        super(MnistModel, self).__init__()
        self.conv1 = Conv2d(in_channels=1, out_channels=10, kernel_size=5, stride=1, padding=0)
        self.maxpool1 = MaxPool2d(2)
        self.conv2 = Conv2d(in_channels=10, out_channels=20, kernel_size=5, stride=1, padding=0)
        self.maxpool2 = MaxPool2d(2)
        self.linear1 = Linear(320, 128)
        self.linear2 = Linear(128, 64)
        self.linear3 = Linear(64, 10)
        self.relu = ReLU()

    def forward(self, x):
        x = self.relu(self.maxpool1(self.conv1(x)))
        x = self.relu(self.maxpool2(self.conv2(x)))
        x = x.view(x.size(0), -1)
        x = self.linear1(x)
        x = self.linear2(x)
        x = self.linear3(x)

        return x
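The 320 in Linear(320, 128) is not arbitrary: it is the number of values left after the two convolution and pooling stages. The sketch below pushes a dummy all-zero image through layers with the same settings as the model and records the shape after each step. The layers are rebuilt here only so the snippet is self-contained.

```python
import torch
from torch import nn

x = torch.zeros(1, 1, 28, 28)  # (batch, channels, height, width)
conv1 = nn.Conv2d(1, 10, kernel_size=5)
conv2 = nn.Conv2d(10, 20, kernel_size=5)
pool = nn.MaxPool2d(2)

shapes = []
for layer in (conv1, pool, conv2, pool):
    x = layer(x)
    shapes.append(tuple(x.shape))
print(shapes)
# [(1, 10, 24, 24), (1, 10, 12, 12), (1, 20, 8, 8), (1, 20, 4, 4)]

flat = x.view(x.size(0), -1)  # the same flattening as in forward()
print(flat.shape)             # torch.Size([1, 320])
```

A 5×5 kernel without padding shrinks each side by 4 (28 to 24, 12 to 8), and each 2×2 pooling halves it, which leaves 20 channels of 4×4, i.e. 320 values. Note that forward() applies ReLU only after the pooling layers; the three Linear layers are chained without an activation in between.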

# Loss function: CrossEntropyLoss
model = MnistModel()  # instantiate the model
criterion = nn.CrossEntropyLoss()   # cross-entropy loss, equivalent to Softmax + Log + NLLLoss
# Softmax turns the 10 outputs into class probabilities; Log turns products into sums,
# which is cheaper to compute and keeps the function monotonic
# NLLLoss: computes the loss from class-index targets, so no manual one-hot encoding is needed

# SGD optimizer (stochastic gradient descent)
optimizer = torch.optim.SGD(model.parameters(), lr=0.14)  # lr: learning rate
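The claim in the comments that CrossEntropyLoss equals Softmax + Log + NLLLoss can be checked directly. This is a small sketch on random logits; the batch of 4 samples and the target values are made up for illustration.

```python
import torch
import torch.nn.functional as F
from torch import nn

torch.manual_seed(0)
logits = torch.randn(4, 10)           # raw model outputs for 4 samples, 10 classes
targets = torch.tensor([3, 0, 9, 1])  # class indices, not one-hot vectors

ce = nn.CrossEntropyLoss()(logits, targets)
manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)

print(torch.allclose(ce, manual))  # True
```

This is also why the model's forward() returns raw scores with no softmax at the end: the loss applies it internally.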

4) Train the model

The parameters are saved to a pkl file automatically during training. If a pkl file is found at that path, the next run loads the previous parameters and continues training from there. On the first run there are no saved parameters; the file is created as training proceeds.

# Model training (one pass over the training set)
def train():
    model.train()
    for index, data in enumerate(train_loader):  # fetch a batch of training data and labels
        inputs, target = data   # inputs: the images; target: the labels
        y_predict = model(inputs)  # forward pass
        loss = criterion(y_predict, target)
        optimizer.zero_grad()  # reset the gradients
        loss.backward()  # backpropagate the loss
        optimizer.step()  # update the parameters
        if index % 100 == 0:  # every 100 batches, save the model and print the loss
            torch.save(model.state_dict(), "./model/model.pkl")   # save the model parameters
            torch.save(optimizer.state_dict(), "./model/optimizer.pkl")
            print("Batch: {}, loss: {}".format(index, loss.item()))

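5) Load the model

On the first run, an empty model folder must already exist here, since torch.save does not create it.

# Load the model
if os.path.exists('./model/model.pkl'):
    model.load_state_dict(torch.load("./model/model.pkl"))  # load the saved model parameters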
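The save-and-resume cycle can be tried in isolation. The sketch below is one possible way to do it, not this article's exact script: it uses a tiny Linear layer as a stand-in for MnistModel, writes to a temporary folder instead of ./model, and creates the folder with os.makedirs so no manual step is needed. It also reloads the optimizer state, which the code in this article saves but does not load.

```python
import os
import tempfile
import torch
from torch import nn

model_dir = os.path.join(tempfile.mkdtemp(), "model")  # stand-in for ./model
os.makedirs(model_dir, exist_ok=True)                  # create the folder if it is missing

net = nn.Linear(4, 2)                                  # tiny stand-in for MnistModel
opt = torch.optim.SGD(net.parameters(), lr=0.14)
torch.save(net.state_dict(), os.path.join(model_dir, "model.pkl"))
torch.save(opt.state_dict(), os.path.join(model_dir, "optimizer.pkl"))

# A fresh model and optimizer, as they would look at the start of the next run.
restored = nn.Linear(4, 2)
restored_opt = torch.optim.SGD(restored.parameters(), lr=0.01)
restored.load_state_dict(torch.load(os.path.join(model_dir, "model.pkl")))
restored_opt.load_state_dict(torch.load(os.path.join(model_dir, "optimizer.pkl")))

print(torch.equal(net.weight, restored.weight))  # True
print(restored_opt.param_groups[0]["lr"])        # 0.14
```

For plain SGD without momentum the optimizer state is little more than the learning rate, so skipping it costs nothing here; with momentum or Adam, reloading it matters more.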

6) Test the model

# Model evaluation
def test():
    correct = 0     # number of correct predictions
    total = 0   # number of samples seen
    model.eval()
    with torch.no_grad():   # no gradients are needed for evaluation
        for data in test_loader:
            inputs, target = data
            output = model(inputs)   # 10 scores (logits) per image; the largest marks the predicted digit
            # torch.max returns a tuple: the largest score per row and the index of that score
            max_score, predict = torch.max(output, dim=1)
            # loss = criterion(output, target)
            total += target.size(0)  # target has shape (batch_size,); size(0) is the batch size
            correct += (predict == target).sum().item()  # count the positions where prediction and label agree
        print("Test accuracy: %.6f" % (correct / total))
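The accuracy bookkeeping in test() is easier to follow on a toy batch. In this sketch the logits and labels are invented (4 samples, 3 classes) so the result can be checked by eye.

```python
import torch

logits = torch.tensor([[0.1, 2.0, -1.0],
                       [1.5, 0.2,  0.3],
                       [0.0, 0.1,  3.0],
                       [2.0, 2.5,  0.0]])
targets = torch.tensor([1, 0, 2, 0])

values, predicted = torch.max(logits, dim=1)  # largest score per row and its index
correct = (predicted == targets).sum().item()
print(predicted.tolist(), correct / targets.size(0))  # [1, 0, 2, 1] 0.75

# Softmax does not change which class wins, only the scale of the scores.
probs = torch.softmax(logits, dim=1)
print(torch.equal(probs.argmax(dim=1), predicted))    # True
```

The first value returned by torch.max is the largest raw score, not a probability; to report a probability, take the maximum of the softmax output instead.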

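7) A function for recognizing your own handwritten digit image (optional)

This part loads the trained pkl model to test your own data. So when testing your own handwritten image, you need a trained pkl file, and you should not call train() or test().

Note: the image must also be white on a black background at 28*28 pixels; otherwise it will not be recognized correctly.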

def test_mydata():
    image = Image.open('./test/test_two.png')   # read your own handwritten image
    image = image.resize((28, 28))   # resize to 28*28
    image = image.convert('L')  # convert to grayscale
    transform = transforms.ToTensor()
    image = transform(image)
    image = image.view(1, 1, 28, 28)  # add the batch dimension the model expects
    with torch.no_grad():
        output = model(image)
    # softmax turns the raw scores into probabilities
    probability, predict = torch.max(torch.softmax(output, dim=1), dim=1)
    print("Predicted digit: %d, probability: %.2f" % (predict.item(), probability.item()))
    plt.title("Predicted digit: {}".format(predict.item()))
    plt.imshow(image.squeeze(), cmap='gray')
    plt.show()
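Photos or scans of handwriting are usually dark ink on a light page, the opposite of what the note above asks for. One possible way to handle this is to invert such images before passing them to the model. The helper to_mnist_style below is hypothetical, and its rule of thumb (treat the top-left pixel as background) is an assumption that will not hold for every image.

```python
from PIL import Image, ImageOps

def to_mnist_style(image):
    """Return a 28x28 grayscale copy with a dark background (hypothetical helper)."""
    gray = image.convert("L").resize((28, 28))
    # Assumption: the top-left pixel belongs to the background.
    # If it is bright, the image is dark-on-light and gets inverted.
    if gray.getpixel((0, 0)) > 127:
        gray = ImageOps.invert(gray)
    return gray

# Made-up test input: a white page with a thick black stroke in the middle.
photo = Image.new("L", (56, 56), color=255)
for y in range(16, 40):
    for x in range(20, 36):
        photo.putpixel((x, y), 0)

converted = to_mnist_style(photo)
print(converted.size, converted.getpixel((0, 0)), converted.getpixel((14, 14)))
# (28, 28) 0 255
```

The returned image can then go through transforms.ToTensor() exactly as in test_mydata().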

8) Train and evaluate on the MNIST data

I modified the messages printed during training. The number of epochs here is set to 15, and the saved pkl parameters are updated on each pass. For more detail on training, see a dedicated tutorial.

# Training and evaluation entry point
if __name__ == '__main__':
    # train and evaluate
    for i in range(15):  # 15 epochs of training and evaluation
        print("-------- Epoch {} started --------".format(i + 1))
        train()
        test()

9) Test your own handwritten digit image (optional)

This part works together with the test_mydata() function: it loads the trained pkl model to test your own data. You need a trained pkl file, and train() and test() should not be called. The same image requirements apply: white on a black background, 28*28 pixels.

# Entry point for testing your own image
if __name__ == '__main__':
    test_mydata()

Put all the code into your editor in order, install the required packages, and it should run without trouble.

The full code follows:

import os
import matplotlib.pyplot as plt
import torch
from PIL import Image
from torch import nn
from torch.nn import Conv2d, Linear, ReLU
from torch.nn import MaxPool2d
from torchvision import transforms
from torchvision.datasets import MNIST
from torch.utils.data import DataLoader


# Dataset: base class for building a dataset; __init__ sets up the data and labels
# __getitem__: returns one sample and its label
# __len__: returns the size of the dataset
# DataLoader: loading class that batches samples from an already constructed Dataset
# torchvision: vision library with pretrained models, dataset loaders, and image transforms (crop, rotate, etc.)
# torchtext: text-processing toolkit that turns various file types into datasets

# Preprocessing: Compose chains several steps into one transform.
# Pass the steps as a list so that they run in the written order.
transform = transforms.Compose([
    transforms.ToTensor(),  # converts grayscale pixel values (0~255) to a Tensor in 0~1
    # transforms.Normalize((0.1307,), (0.3081,)),  # standardize per channel with mean and std: (x - mean) / std
])

# Normalize computes image = (image - mean) / std, channel by channel:
# input[channel] = (input[channel] - mean[channel]) / std[channel]

# Load the datasets
# Training set
train_data = MNIST(root='./data', train=True, transform=transform, download=True)
train_loader = DataLoader(dataset=train_data, batch_size=64, shuffle=True)
# transform: the preprocessing applied to each loaded sample; shuffle: whether to shuffle the sample order
# Test set
test_data = MNIST(root="./data", train=False, transform=transform, download=True)
test_loader = DataLoader(dataset=test_data, batch_size=64, shuffle=True)

train_data_size = len(train_data)
test_data_size = len(test_data)
print("Training set size: {}".format(train_data_size))
print("Test set size: {}".format(test_data_size))
# print(test_data)
# print(train_data)


class MnistModel(nn.Module):
    def __init__(self):
        super(MnistModel, self).__init__()
        self.conv1 = Conv2d(in_channels=1, out_channels=10, kernel_size=5, stride=1, padding=0)
        self.maxpool1 = MaxPool2d(2)
        self.conv2 = Conv2d(in_channels=10, out_channels=20, kernel_size=5, stride=1, padding=0)
        self.maxpool2 = MaxPool2d(2)
        self.linear1 = Linear(320, 128)
        self.linear2 = Linear(128, 64)
        self.linear3 = Linear(64, 10)
        self.relu = ReLU()

    def forward(self, x):
        x = self.relu(self.maxpool1(self.conv1(x)))
        x = self.relu(self.maxpool2(self.conv2(x)))
        x = x.view(x.size(0), -1)
        x = self.linear1(x)
        x = self.linear2(x)
        x = self.linear3(x)

        return x


# Loss function: CrossEntropyLoss
model = MnistModel()  # instantiate the model
criterion = nn.CrossEntropyLoss()   # cross-entropy loss, equivalent to Softmax + Log + NLLLoss
# Softmax turns the 10 outputs into class probabilities; Log turns products into sums,
# which is cheaper to compute and keeps the function monotonic
# NLLLoss: computes the loss from class-index targets, so no manual one-hot encoding is needed

# SGD optimizer (stochastic gradient descent)
optimizer = torch.optim.SGD(model.parameters(), lr=0.14)  # lr: learning rate


# Model training (one pass over the training set)
def train():
    model.train()
    for index, data in enumerate(train_loader):  # fetch a batch of training data and labels
        inputs, target = data   # inputs: the images; target: the labels
        y_predict = model(inputs)  # forward pass
        loss = criterion(y_predict, target)
        optimizer.zero_grad()  # reset the gradients
        loss.backward()  # backpropagate the loss
        optimizer.step()  # update the parameters
        if index % 100 == 0:  # every 100 batches, save the model and print the loss
            torch.save(model.state_dict(), "./model/model.pkl")   # save the model parameters
            torch.save(optimizer.state_dict(), "./model/optimizer.pkl")
            print("Batch: {}, loss: {}".format(index, loss.item()))

# Load the model
if os.path.exists('./model/model.pkl'):
    model.load_state_dict(torch.load("./model/model.pkl"))  # load the saved model parameters


# Model evaluation
def test():
    correct = 0     # number of correct predictions
    total = 0   # number of samples seen
    model.eval()
    with torch.no_grad():   # no gradients are needed for evaluation
        for data in test_loader:
            inputs, target = data
            output = model(inputs)   # 10 scores (logits) per image; the largest marks the predicted digit
            # torch.max returns a tuple: the largest score per row and the index of that score
            max_score, predict = torch.max(output, dim=1)
            # loss = criterion(output, target)
            total += target.size(0)  # target has shape (batch_size,); size(0) is the batch size
            correct += (predict == target).sum().item()  # count the positions where prediction and label agree
        print("Test accuracy: %.6f" % (correct / total))


# Training and evaluation entry point
if __name__ == '__main__':
    # train and evaluate
    for i in range(15):  # 15 epochs of training and evaluation
        print("-------- Epoch {} started --------".format(i + 1))
        train()
        test()


def test_mydata():
    image = Image.open('./test/test_two.png')   # read your own handwritten image
    image = image.resize((28, 28))   # resize to 28*28
    image = image.convert('L')  # convert to grayscale
    transform = transforms.ToTensor()
    image = transform(image)
    image = image.view(1, 1, 28, 28)  # add the batch dimension the model expects
    with torch.no_grad():
        output = model(image)
    # softmax turns the raw scores into probabilities
    probability, predict = torch.max(torch.softmax(output, dim=1), dim=1)
    print("Predicted digit: %d, probability: %.2f" % (predict.item(), probability.item()))
    plt.title("Predicted digit: {}".format(predict.item()))
    plt.imshow(image.squeeze(), cmap='gray')
    plt.show()

# Entry point for testing your own image
# if __name__ == '__main__':
#     test_mydata()