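Evidently is an open-source Python library for creating interactive visual reports, dashboards, and JSON profiles that help analyze machine learning models during validation and prediction. It can create 6 different types of reports, covering data drift and the performance of classification or regression models.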

Let's get started.

1. Install the package

Install it with the pip package manager by running:

$ pip install evidently

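The tool can build interactive reports both inside a Jupyter notebook and as standalone HTML files. If you only want to generate interactive reports as HTML files or export them as JSON profiles, the installation is now complete.

To build interactive reports inside a Jupyter notebook, Evidently uses a Jupyter nbextension. If you want to create reports in a notebook, run the following two commands in a terminal after installation.

To install the Jupyter nbextension, run: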

$ jupyter nbextension install --sys-prefix --symlink --overwrite --py evidently

Then, to enable it, run:

$ jupyter nbextension enable evidently --py --sys-prefix

Note that running these commands once after installation is enough; there is no need to repeat the last two commands each time.
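Before moving on, it can help to confirm that the package is visible to the interpreter you plan to use. The following is a minimal sketch using only the standard library (Python 3.8 or later); the helper name `installed_version` is made up for this illustration.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if `pip install evidently` succeeded, otherwise None.
print("evidently:", installed_version("evidently"))
```

If this prints `None`, pip most likely installed the package into a different environment than the one running the script.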

2. Import the required libraries

In this step we import the libraries needed to build the ML model, the Evidently modules used to build the dashboard for analyzing model performance, and pandas to load the data set.

import pandas as pd
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from evidently.dashboard import Dashboard
from evidently.tabs import RegressionPerformanceTab
from evidently.model_profile import Profile
from evidently.profile_sections import RegressionPerformanceProfileSection

3. Load the data set

In this step we load the data and split it into reference data and production (prediction) data.

raw_data = pd.read_csv('/content/day.csv', header = 0, sep = ',', parse_dates=['dteday'])
# Take explicit copies so a 'prediction' column can be added to each slice later
ref_data = raw_data[:120].copy()
prod_data = raw_data[120:150].copy()
ref_data.head()
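The positional split above relies on the rows being in date order. The sketch below repeats it on a small synthetic frame (the values are invented for illustration and are not taken from day.csv) and takes explicit copies, so that columns can be added to each slice later without pandas warning about writing to a view.

```python
import pandas as pd

# Synthetic stand-in for day.csv: 150 consecutive days with a made-up count.
raw_data = pd.DataFrame({
    "dteday": pd.date_range("2011-01-01", periods=150, freq="D"),
    "cnt": range(150),
})

# First 120 days serve as reference data, the next 30 as production data.
ref_data = raw_data.iloc[:120].copy()
prod_data = raw_data.iloc[120:150].copy()

# Sanity check: the reference window must end before production begins.
assert ref_data["dteday"].max() < prod_data["dteday"].min()
print(len(ref_data), len(prod_data))
```

If the file were not sorted by date, sorting on `dteday` before slicing would be the safer choice.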

4. Create the model

In this step we create the machine learning model. For this data set we use a random forest regressor.

target = 'cnt'
datetime = 'dteday'
numerical_features = ['mnth', 'temp', 'atemp', 'hum', 'windspeed']
categorical_features = ['season', 'holiday', 'weekday', 'workingday', 'weathersit']
features = numerical_features + categorical_features
model = RandomForestRegressor(random_state = 0)
model.fit(ref_data[features], ref_data[target])
ref_data['prediction'] = model.predict(ref_data[features])
prod_data['prediction'] = model.predict(prod_data[features])
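A mistyped column name in the feature lists only surfaces as a KeyError when the model is fitted. As a hedged sketch, a small helper (hypothetical, not part of Evidently or scikit-learn) can check the lists up front against the column names of the data set.

```python
def check_feature_lists(columns, numerical, categorical, target):
    """Return a list of problems found in the feature configuration."""
    problems = []
    known = set(columns)
    for name in numerical + categorical + [target]:
        if name not in known:
            problems.append(f"missing column: {name}")
    overlap = set(numerical) & set(categorical)
    for name in sorted(overlap):
        problems.append(f"listed as both numerical and categorical: {name}")
    if target in numerical or target in categorical:
        problems.append(f"target used as a feature: {target}")
    return problems


# Column names as used in this article; 'hum' is left out to show a failure.
columns = ["dteday", "season", "holiday", "weekday", "workingday",
           "weathersit", "mnth", "temp", "atemp", "windspeed", "cnt"]
numerical_features = ["mnth", "temp", "atemp", "hum", "windspeed"]
categorical_features = ["season", "holiday", "weekday", "workingday", "weathersit"]

print(check_feature_lists(columns, numerical_features, categorical_features, "cnt"))
# → ['missing column: hum']
```

With a real data frame, `list(ref_data.columns)` would be passed as `columns`; an empty result means the lists are consistent.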

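5. Create the dashboard

In this step we create the dashboard to interpret model performance and analyze properties of the model such as MAE, MAPE, and the error distribution.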

column_mapping = {}
column_mapping['target'] = target
column_mapping['prediction'] = 'prediction'
column_mapping['datetime'] = datetime
column_mapping['numerical_features'] = numerical_features
column_mapping['categorical_features'] = categorical_features
dashboard = Dashboard(tabs=[RegressionPerformanceTab])
dashboard.calculate(ref_data, prod_data, column_mapping=column_mapping)
dashboard.save('bike_sharing_demand_model_perfomance.html')

The figure above shows the resulting model performance report; the code above creates it and saves it as an HTML file that you can download.
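MAE and MAPE, two of the metrics named in this step, are simple to compute by hand, which is useful for sanity-checking a report. The following is a minimal pure-Python sketch with made-up numbers; Evidently's own implementation may differ in details such as how zero targets are handled.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_percentage_error(actual, predicted):
    """Average of |actual - predicted| / |actual|, in percent.

    Rows with a zero target are skipped here to avoid dividing by zero;
    that is a choice made for this sketch.
    """
    ratios = [abs(a - p) / abs(a) for a, p in zip(actual, predicted) if a != 0]
    return 100.0 * sum(ratios) / len(ratios)


actual = [100, 200, 400]      # invented daily counts
predicted = [110, 180, 400]   # invented model outputs

print(mean_absolute_error(actual, predicted))             # → 10.0
print(mean_absolute_percentage_error(actual, predicted))  # about 6.67
```

MAE is expressed in the units of the target (here, rentals per day), while MAPE is relative, so the two can rank models differently when the target varies widely.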

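6. Available report types

1) Data drift

Detects changes in feature distributions.

2) Numerical target drift

Detects changes in a numerical target and in feature behavior.

3) Categorical target drift

Detects changes in a categorical target and in feature behavior.

4) Regression model performance

Analyzes the performance of a regression model and its errors.

5) Classification model performance

Analyzes the performance and errors of a classification model. Works for binary and multi-class models.

6) Probabilistic classification model performance

Analyzes the performance of a probabilistic classification model, the quality of its calibration, and its errors. Works for binary and multi-class models.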
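To make "detecting a change in a feature's distribution" concrete, here is a pure-Python sketch of the two-sample Kolmogorov-Smirnov statistic: the largest gap between the empirical CDFs of a reference sample and a current sample. It illustrates the idea only; which statistical tests Evidently applies is not covered in this article, and the sample values below are invented.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Largest absolute gap between the two empirical CDFs (0 = identical)."""
    ref_sorted = sorted(reference)
    cur_sorted = sorted(current)
    gap = 0.0
    for x in set(ref_sorted) | set(cur_sorted):
        # Fraction of each sample that is <= x.
        ref_cdf = bisect_right(ref_sorted, x) / len(ref_sorted)
        cur_cdf = bisect_right(cur_sorted, x) / len(cur_sorted)
        gap = max(gap, abs(ref_cdf - cur_cdf))
    return gap


reference = [0.2, 0.3, 0.4, 0.5, 0.6]   # invented reference values
same = [0.2, 0.3, 0.4, 0.5, 0.6]
shifted = [0.7, 0.8, 0.9, 1.0, 1.1]     # every value above the reference range

print(ks_statistic(reference, same))     # → 0.0
print(ks_statistic(reference, shifted))  # → 1.0
```

A value near 0 means the two samples look alike, and a value near 1 means they barely overlap. Turning the statistic into a drift decision requires a threshold or a p-value, which this sketch leaves out.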
