因?yàn)槲覀兇蜷_我們的的學(xué)習(xí)數(shù)據(jù)集，最后一項(xiàng)是我們的真實(shí)數(shù)值，看過小唐上一篇的人都知道，老規(guī)矩先進(jìn)行拆分，前面的特征放一塊，后面的真實(shí)值放一塊，同時(shí)由于數(shù)據(jù)沒有列名，我們選擇使用iloc[]來實(shí)現(xiàn)分離

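import pandas as pd

def shuju(tr_path, ts_path, sep='\t'):
    # The files have no header row, so header=None keeps the first row as data
    train = pd.read_csv(tr_path, sep=sep, header=None)
    test = pd.read_csv(ts_path, sep=sep, header=None)
    # Separate the features (all but the last column) from the labels (last column)
    train_features = train.iloc[:, :-1].values
    train_labels = train.iloc[:, -1].values
    test_features = test.iloc[:, :-1].values
    test_labels = test.iloc[:, -1].values
    return train_features, test_features, train_labels, test_labels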
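
The positional split is easy to check on a tiny table. The sketch below is only an illustration: the three-row sample string is made up, not taken from the article's CSV files. It also shows why a headerless file should be read with header=None: by default read_csv treats the first row as column names, so one sample silently disappears.

```python
import io

import pandas as pd

# Hypothetical tab-separated sample: three feature columns, then a label column.
raw = "0\t5\t9\t7\n1\t0\t3\t2\n4\t4\t0\t7\n"

# header=None tells pandas that the first row is data, not column names.
df = pd.read_csv(io.StringIO(raw), sep="\t", header=None)

features = df.iloc[:, :-1].values  # every column except the last
labels = df.iloc[:, -1].values     # only the last column

# Default behaviour for comparison: the first row is consumed as the header.
lost = pd.read_csv(io.StringIO(raw), sep="\t")

print(features.shape)      # (3, 3)
print(labels.tolist())     # [7, 2, 7]
print(len(df), len(lost))  # 3 2
```

Because iloc works on positions rather than names, the same two slices apply no matter how many feature columns the file has.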

Training the model

Here we use sklearn directly: we choose a model, and fitting it on the training data produces its recognition rules.

from sklearn.tree import DecisionTreeClassifier

# Train the model
def train_tree(*data):
    x_train, x_test, y_train, y_test = data
    clf = DecisionTreeClassifier()
    clf.fit(x_train, y_train)
    # score() returns the mean accuracy on the given data
    print("Training set accuracy: {:.4f}".format(clf.score(x_train, y_train)))
    print("Test set accuracy: {:.4f}".format(clf.score(x_test, y_test)))
    # Return the trained model
    return clf
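
The two printed scores are best read side by side. A fully grown decision tree can usually memorize its training set, so the training score tends to sit near 1.0 and says little about how the model generalizes; the test score is the one to watch. The sketch below reproduces the same fit-and-score pattern on a stand-in dataset. Assumptions: it uses scikit-learn's bundled 8x8 digits (load_digits) rather than the article's CSV files, and the test_size and random_state values are arbitrary choices made for repeatability.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in data: scikit-learn's bundled 8x8 digit images, not the article's files.
digits = load_digits()
x_train, x_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0
)

# random_state fixes the tree's internal randomness so repeated runs agree.
clf = DecisionTreeClassifier(random_state=0)
clf.fit(x_train, y_train)

train_acc = clf.score(x_train, y_train)  # accuracy on data the tree has seen
test_acc = clf.score(x_test, y_test)     # accuracy on held-out data

print("Training set accuracy: {:.4f}".format(train_acc))
print("Test set accuracy: {:.4f}".format(test_acc))
```

If the gap between the two numbers is large, limiting the tree (for example with the max_depth parameter of DecisionTreeClassifier) is one common thing to try.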

Data visualization

To make the results easier to inspect, we can also use matplotlib to look at individual test images next to their predictions.

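import matplotlib.pyplot as plt

def plot_imafe(test, test_labels, preds):
    plt.ion()  # interactive mode, so drawing does not block the loop
    # Show at most the first 50 test images
    for i in range(min(50, len(test_labels))):
        label, pred = test_labels[i], preds[i]
        title = 'actual: {}, predicted: {}'.format(label, pred)
        img = test[i].reshape(28, 28)  # each row is a flattened 28x28 image
        plt.clf()
        plt.imshow(img, cmap="binary")
        plt.title(title)
        plt.pause(0.5)  # redraw and leave each image on screen briefly
    plt.ioff()
    print('done')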
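
Flipping through images one at a time works, but a grid makes it easier to compare several predictions at once. The following is a minimal sketch of that alternative, not the article's method: plot_grid is a hypothetical helper, the images are random pixels standing in for real digits, and the Agg backend is selected only so the example runs without a display.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # off-screen backend; drop this line to get a window
import matplotlib.pyplot as plt
import numpy as np


def plot_grid(images, labels, preds, n_rows=2, n_cols=5, side=28):
    """Draw the first n_rows * n_cols samples in one figure (hypothetical helper)."""
    fig, axes = plt.subplots(
        n_rows, n_cols, figsize=(2 * n_cols, 2.2 * n_rows), squeeze=False
    )
    for ax in axes.ravel():
        ax.axis("off")  # hide ticks on every cell, including unused ones
    for ax, img, label, pred in zip(axes.ravel(), images, labels, preds):
        ax.imshow(np.asarray(img).reshape(side, side), cmap="binary")
        # A red title marks a wrong prediction.
        ax.set_title(
            "actual {} / pred {}".format(label, pred),
            color="black" if label == pred else "red",
            fontsize=8,
        )
    fig.tight_layout()
    return fig


# Stand-in data: random pixels and made-up labels, not the article's images.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(10, 784))
fake_labels = np.arange(10)
fake_preds = fake_labels.copy()
fake_preds[3] = 8  # pretend one prediction is wrong

fig = plot_grid(fake_images, fake_labels, fake_preds)
out_path = os.path.join(tempfile.gettempdir(), "predictions_grid.png")
fig.savefig(out_path)
print("saved to", out_path)
```

With real data, the same helper could be called with the test features, test labels, and the output of clf.predict.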

Results

Complete code

import pandas as pd
from sklearn.tree import DecisionTreeClassifier
import matplotlib.pyplot as plt

def shuju(tr_path, ts_path, sep='\t'):
    # The files have no header row, so header=None keeps the first row as data
    train = pd.read_csv(tr_path, sep=sep, header=None)
    test = pd.read_csv(ts_path, sep=sep, header=None)
    # Separate the features (all but the last column) from the labels (last column)
    train_features = train.iloc[:, :-1].values
    train_labels = train.iloc[:, -1].values
    test_features = test.iloc[:, :-1].values
    test_labels = test.iloc[:, -1].values
    return train_features, test_features, train_labels, test_labels

# Train the model
def train_tree(*data):
    x_train, x_test, y_train, y_test = data
    clf = DecisionTreeClassifier()
    clf.fit(x_train, y_train)
    # score() returns the mean accuracy on the given data
    print("Training set accuracy: {:.4f}".format(clf.score(x_train, y_train)))
    print("Test set accuracy: {:.4f}".format(clf.score(x_test, y_test)))
    # Return the trained model
    return clf

def plot_imafe(test, test_labels, preds):
    plt.ion()  # interactive mode, so drawing does not block the loop
    # Show at most the first 50 test images
    for i in range(min(50, len(test_labels))):
        label, pred = test_labels[i], preds[i]
        title = 'actual: {}, predicted: {}'.format(label, pred)
        img = test[i].reshape(28, 28)  # each row is a flattened 28x28 image
        plt.clf()
        plt.imshow(img, cmap="binary")
        plt.title(title)
        plt.pause(0.5)  # redraw and leave each image on screen briefly
    plt.ioff()
    print('done')

train_features, test_features, train_labels, test_labels = shuju(r"C:\Users\twy\PycharmProjects\1\train_images.csv", r"C:\Users\twy\PycharmProjects\1\test_images.csv")
clf = train_tree(train_features, test_features, train_labels, test_labels)
preds = clf.predict(test_features)
plot_imafe(test_features, test_labels, preds)
