How to Build a Gaussian Mixture Model Classifier with PyTorch

Updated: 2023-10-22 09:34:29  Author: deephub

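This article is an attempt to build a Gaussian mixture model (GMM) classifier with PyTorch. We build the GMM from scratch, which gives a basic understanding of how it works. The article does not go into the math.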

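 import numpy as np
 import torch
 import matplotlib.pyplot as plt
 # Two distinct 2-D Gaussians, A and B, each with its own means and standard deviations
 n_samples = 1000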
 A_means = torch.tensor( [-0.5, -0.5])
 A_stdevs = torch.tensor( [0.25, 0.25])
 B_means = torch.tensor( [0.5, 0.5])
 B_stdevs = torch.tensor( [0.25, 0.25])
 
 A_dist = torch.distributions.Normal( A_means, A_stdevs)
 A_samp = A_dist.sample( [n_samples])
 B_dist = torch.distributions.Normal( B_means, B_stdevs)
 B_samp = B_dist.sample( [n_samples])
 
 
 plt.figure( figsize=(6,6))
 for name, sample in zip( ['A', 'B'], [A_samp, B_samp]):
     plt.scatter( sample[:,0], sample[:, 1], alpha=0.2, label=name)
 plt.legend()
 plt.title( "Distinct Gaussian Samples")
 plt.show()
 plt.close()

To create a single mixed Gaussian distribution, we first vertically stack the means and standard deviations of A and B, producing new tensors that each have shape [2, 2].

 AB_means = torch.vstack( [ A_means, B_means])
 AB_stdevs = torch.vstack( [ A_stdevs, B_stdevs])

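PyTorch builds a mixture distribution by wrapping the original Normal distribution in three additional distributions: Independent, Categorical, and MixtureSameFamily. In essence, it creates a mixture whose components are weighted by the probabilities of the given Categorical distribution. Our new means and standard deviations have an extra leading axis, and that axis indexes the components: the mixture uses it to decide which set of means and standard deviations each value is drawn from.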

 AB_means = torch.vstack( [ A_means, B_means])
 AB_stdevs = torch.vstack( [ A_stdevs, B_stdevs])
 
 AB_dist = torch.distributions.Independent( torch.distributions.Normal( AB_means, AB_stdevs), 1)
 mix_weight = torch.distributions.Categorical( torch.tensor( [1.0, 1.0]))
 mix_dist = torch.distributions.MixtureSameFamily( mix_weight, AB_dist)

Here [1.0, 1.0] tells the Categorical distribution to pick each component with equal probability. To verify that it works, we plot samples from each distribution…

 A_samp = A_dist.sample( (500,))
 B_samp = B_dist.sample( (500,))
 mix_samp = mix_dist.sample( (500,))
 plt.figure( figsize=(6,6))
 for name, sample in zip( ['A', 'B', 'mix'], [A_samp, B_samp, mix_samp]):
     plt.scatter( sample[:,0], sample[:, 1], alpha=0.3, label=name)
 plt.legend()
 plt.title( "Original Samples with the new Mixed Distribution")
 plt.show()
 plt.close()

As you can see, samples from the new mix_samp distribution overlap the samples from our two original, separate A and B distributions.
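To make the role of each wrapper more concrete, the following sketch prints the batch and event shapes at every stage. It reuses the same means and standard deviations as above; the variable names (`normal`, `components`, `weights`, `mixture`) are chosen only for this illustration.

```python
import torch

means = torch.tensor([[-0.5, -0.5], [0.5, 0.5]])
stdevs = torch.tensor([[0.25, 0.25], [0.25, 0.25]])

# A plain Normal treats all 2x2 entries as independent scalar distributions.
normal = torch.distributions.Normal(means, stdevs)
print(normal.batch_shape, normal.event_shape)          # torch.Size([2, 2]) torch.Size([])

# Independent(..., 1) folds the last batch axis into the event:
# we now have 2 components, each a 2-D Gaussian with diagonal covariance.
components = torch.distributions.Independent(normal, 1)
print(components.batch_shape, components.event_shape)  # torch.Size([2]) torch.Size([2])

# Categorical normalizes the weights it is given, so [1.0, 1.0] becomes [0.5, 0.5].
weights = torch.distributions.Categorical(torch.tensor([1.0, 1.0]))
print(weights.probs)                                   # tensor([0.5000, 0.5000])

# The mixture consumes the component axis and leaves a single 2-D distribution.
mixture = torch.distributions.MixtureSameFamily(weights, components)
print(mixture.batch_shape, mixture.event_shape)        # torch.Size([]) torch.Size([2])
print(mixture.sample((500,)).shape)                    # torch.Size([500, 2])
```

The rightmost batch axis of the component distribution must match the number of categories in the Categorical distribution, which is why the stacked tensors have the components along the first axis.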

Model

Now we can start building our classifier.

First we need to create an underlying GaussianMixModel whose means, standard deviations, and mixture weights can actually be trained through PyTorch's backprop and autograd system.

 class GaussianMixModel( torch.nn.Module):
     
     def __init__(self, n_features, n_components=2):
         super().__init__()
 
         self.init_scale = np.sqrt( 6 / n_features) # What is the best scale to use?
         self.n_features = n_features
         self.n_components = n_components
 
         weights = torch.ones( n_components)
         means = torch.randn( n_components, n_features) * self.init_scale
         stdevs = torch.rand( n_components, n_features) * self.init_scale
         
         #
         # Our trainable Parameters
         self.blend_weight = torch.nn.Parameter(weights)
         self.means = torch.nn.Parameter(means)
         self.stdevs = torch.nn.Parameter(stdevs)
 
     
     def forward(self, x):
 
         blend_weight = torch.distributions.Categorical( torch.nn.functional.relu( self.blend_weight))
         comp = torch.distributions.Independent(torch.distributions.Normal( self.means, torch.abs( self.stdevs)), 1)
         gmm = torch.distributions.MixtureSameFamily( blend_weight, comp)
         return -gmm.log_prob(x)
     
     def extra_repr(self) -> str:
         info = f" n_features={self.n_features}, n_components={self.n_components}, [init_scale={self.init_scale}]"
         return info
 
     @property
     def device(self):
         return next(self.parameters()).device

The model returns the negative log-likelihood of each sample under the model's Gaussian mixture distribution.
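For intuition about what that value is, the mixture log-likelihood can be reproduced by hand: it is the log-sum-exp, over components, of the log mixing weight plus the component log-density. The sketch below checks this against MixtureSameFamily.log_prob using arbitrary illustrative parameters (the means, standard deviations, and weights here are made up, not taken from the article).

```python
import torch

torch.manual_seed(0)

# Arbitrary illustrative parameters: 3 components in 2 dimensions.
means = torch.randn(3, 2)
stdevs = torch.rand(3, 2) + 0.5
weights = torch.tensor([0.2, 0.3, 0.5])

comp = torch.distributions.Independent(torch.distributions.Normal(means, stdevs), 1)
gmm = torch.distributions.MixtureSameFamily(torch.distributions.Categorical(weights), comp)

x = torch.randn(5, 2)

# Log-density of every sample under every component: shape (5, 3).
comp_log_prob = comp.log_prob(x.unsqueeze(1))

# log p(x) = logsumexp_k( log w_k + log N(x | mu_k, sigma_k) )
manual_log_prob = torch.logsumexp(torch.log(weights) + comp_log_prob, dim=1)

print(torch.allclose(manual_log_prob, gmm.log_prob(x), atol=1e-6))
```

Negating this quantity gives the per-sample loss that the model's forward() returns.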

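To train it, we need to supply samples from a mixed Gaussian distribution. To verify that it works, we also supply a batch of uniformly distributed samples and see whether the model can tell which of them are likely to resemble the samples in our training set.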

 train_means = torch.randn( (4,2))
 train_stdevs = (torch.rand( (4,2)) + 1.0) * 0.25
 train_weights = torch.rand( 4)
 ind_dists = torch.distributions.Independent( torch.distributions.Normal( train_means, train_stdevs), 1)
 mix_weight = torch.distributions.Categorical( train_weights)
 train_dist = torch.distributions.MixtureSameFamily( mix_weight, ind_dists)
 
 train_samp = train_dist.sample( [2000])
 valid_samp = torch.rand( (4000, 2)) * 8 - 4.0
 
 plt.figure( figsize=(6,6))
 for name, sample in zip( ['train', 'valid'], [train_samp, valid_samp]):
     plt.scatter( sample[:,0], sample[:, 1], alpha=0.2, label=name)
 plt.legend()
 plt.title( "Training and Validation Samples")
 plt.show()
 plt.close()

The model needs only one hyperparameter, n_components:

 DEVICE = 'cuda' if torch.cuda.is_available() else 'cpu'  # fall back to CPU when no GPU is present
 gmm = GaussianMixModel( n_features=2, n_components=4)
 gmm.to( DEVICE)

The training loop is also very simple:

 max_iter = 20000
 features = train_samp.to( gmm.device)  # keep the data on the same device as the model
 
 optim = torch.optim.Adam( gmm.parameters(),  lr=5e-4)
 metrics = {'loss':[]}
 
 for i in range( max_iter):
     optim.zero_grad()
     loss = gmm(  features)
     loss.mean().backward()
     optim.step()
     metrics[ 'loss'].append( loss.mean().item())
     print( f"{i} ) \t {metrics[ 'loss'][-1]:0.5f}", end=f"{' '*20}\r")
     if metrics[ 'loss'][-1] < 0.1:
         print( "---- Close enough")
         break
     if len( metrics[ 'loss']) > 300 and np.std( metrics[ 'loss'][-300:]) < 0.0005:
         print( "---- Giving up")
         break
 print( f"Min Loss: {np.min( metrics[ 'loss']):0.5f}")

In this example, the loop stopped after fewer than 7,000 iterations, at a loss of 1.91043.
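One detail worth watching during training is that the parameters stay valid: relu can push every mixing weight to zero, and abs can let a standard deviation reach zero. The sketch below is one possible alternative, not the article's method: it stores unconstrained parameters and maps them through softmax (via the logits argument of Categorical) and softplus. The class name SoftGaussianMixModel, the parameter names, and the 1e-4 floor are assumptions made for this illustration.

```python
import torch


class SoftGaussianMixModel(torch.nn.Module):
    """Hypothetical variant of GaussianMixModel with smooth parameter constraints."""

    def __init__(self, n_features, n_components=2):
        super().__init__()
        # Unconstrained parameters; zeros give uniform weights at the start.
        self.blend_logits = torch.nn.Parameter(torch.zeros(n_components))
        self.means = torch.nn.Parameter(torch.randn(n_components, n_features))
        self.raw_stdevs = torch.nn.Parameter(torch.zeros(n_components, n_features))

    def forward(self, x):
        # Categorical applies a softmax to logits, so the weights are always positive.
        blend = torch.distributions.Categorical(logits=self.blend_logits)
        # softplus is strictly positive; the small floor (an assumed value) avoids tiny scales.
        stdevs = torch.nn.functional.softplus(self.raw_stdevs) + 1e-4
        comp = torch.distributions.Independent(
            torch.distributions.Normal(self.means, stdevs), 1)
        gmm = torch.distributions.MixtureSameFamily(blend, comp)
        return -gmm.log_prob(x)  # negative log-likelihood per sample


model = SoftGaussianMixModel(n_features=2, n_components=4)
nll = model(torch.randn(10, 2))
nll.mean().backward()
print(nll.shape)  # torch.Size([10])
```

Whether this trains better than the relu/abs version on the data above has not been tested here; it only removes the possibility of invalid distribution parameters.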

If we now run the valid_samp samples through the model, we can convert the returned values into relative probabilities and re-plot the validation data colored by the prediction.

 with torch.no_grad():
     logits = gmm( valid_samp.to( gmm.device))  # negative log-likelihood per sample
     probs = torch.exp( -logits)
     
 plt.figure( figsize=(6,6))
 for name, sample in zip( ['pred'], [valid_samp]):
     plt.scatter( sample[:,0], sample[:, 1], alpha=1.0, c=probs.cpu().numpy(), label=name)
 plt.legend()
 plt.title( "Testing Trained model on Validation")
 plt.show()
 plt.close()

Our model has learned to identify the samples that fall in the regions of the training distribution. But we can take this further.

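Classification

The walkthrough above should give a rough idea of how to create a Gaussian mixture model and how to train it. The next step uses that knowledge to build a composite model (GMMClassifier) that can learn to identify different classes of mixed Gaussian distributions.

Here we create a training set of overlapping Gaussian distributions with 5 different classes, where each class is itself a mixed Gaussian distribution.

The GMMClassifier will contain 5 separate GaussianMixModel instances. Each instance tries to learn a single class from the training data. Their outputs are combined into a set of classification logits, which the GMMClassifier uses to make its prediction.

First we need to make a small change to the original GaussianMixModel and change the output from return -gmm.log_prob(x) to return gmm.log_prob(x). Because we are no longer minimizing this value directly in the training loop, it serves as the logits for our class assignment.

The new model becomes…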

 class GaussianMixModel( torch.nn.Module):
     
     def __init__(self, n_features, n_components=2):
         super().__init__()
 
         self.init_scale = np.sqrt( 6 / n_features) # What is the best scale to use?
         self.n_features = n_features
         self.n_components = n_components
 
         weights = torch.ones( n_components)
         means = torch.randn( n_components, n_features) * self.init_scale
         stdevs = torch.rand( n_components, n_features) * self.init_scale
         
         #
         # Our trainable Parameters
         self.blend_weight = torch.nn.Parameter(weights)
         self.means = torch.nn.Parameter(means)
         self.stdevs = torch.nn.Parameter(stdevs)
 
     
     def forward(self, x):
 
         blend_weight = torch.distributions.Categorical( torch.nn.functional.relu( self.blend_weight))
         comp = torch.distributions.Independent(torch.distributions.Normal( self.means, torch.abs( self.stdevs)), 1)
         gmm = torch.distributions.MixtureSameFamily( blend_weight, comp)
         return gmm.log_prob(x)
     
     def extra_repr(self) -> str:
         info = f" n_features={self.n_features}, n_components={self.n_components}, [init_scale={self.init_scale}]"
         return info
 
     @property
     def device(self):
         return next(self.parameters()).device

The code for our GMMClassifier is as follows:

 class GMMClassifier( torch.nn.Module):
     
     def __init__(self, n_features, n_classes, n_components=2):
         super().__init__()
         self.n_classes = n_classes
         self.n_features = n_features
         self.n_components = n_components if isinstance( n_components, list) else [n_components] * self.n_classes
         self.class_models = torch.nn.ModuleList( [ GaussianMixModel( n_features=self.n_features, n_components=self.n_components[i]) for i in range( self.n_classes)])
         
     
     def forward(self, x, ret_logits=False):
         logits = torch.hstack( [ m(x).unsqueeze(1) for m in self.class_models])
         if ret_logits:
             return logits
         return logits.argmax( dim=1)
     
     def extra_repr(self) -> str:
         info = f" n_features={self.n_features}, n_components={self.n_components}, [n_classes={self.n_classes}]"
         return info
 
     @property
     def device(self):
         return next(self.parameters()).device

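When a model instance is created, one GaussianMixModel is created for each class. Since each class may need a different number of components for its particular Gaussian mixture, we allow n_components to be a list of int values, which is used when generating each underlying model. For example, n_components=[2,4,3,5,6] passes the correct number of components to each class model. To make it easy to set all underlying models to the same value, you can also simply provide n_components=5, which produces [5,5,5,5,5] when the models are generated.

During training we need access to the logits, so the forward() method provides a ret_logits parameter. Once training is complete, forward() can be called without that argument to return an int value for the predicted class (it simply takes the argmax() of the logits).

We also create a set of 5 independent but overlapping Gaussian mixture distributions, with a random number of Gaussian components per class.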

 clusters = [0, 1, 2, 3, 4]
 features_group = {}
 n_samples = 2000
 min_clusters = 2
 max_clusters = 10
 for c in clusters:
     features_group[ c] = []
     n_clusters = torch.randint( min_clusters, max_clusters+1, (1,1)).item()
     print( f"Class: {c} Clusters: {n_clusters}")
     for i in range( n_clusters):
         mu = torch.randn( (1,2))
         scale = torch.rand( (1,2)) * 0.35 + 0.05
         distribution = torch.distributions.Normal( mu, scale)
         features_group[ c] += distribution.expand( (n_samples//n_clusters, 2)).sample()
     features_group[ c] = torch.vstack( features_group[ c])
 features = torch.vstack( [features_group[ c] for c in clusters]).numpy()
 targets = torch.vstack( [torch.ones( (features_group[ c].size(0), 1)) * c for c in clusters]).view( -1).numpy()
 
 idxs = np.arange( features.shape[0])
 valid_idxs = np.random.choice( idxs, 1000)
 train_idxs = [i for i in idxs if i not in valid_idxs]
 features_valid = torch.tensor( features[ valid_idxs])
 targets_valid = torch.tensor( targets[ valid_idxs])
 features = torch.tensor( features[ train_idxs])
 targets = torch.tensor( targets[ train_idxs])
 
 print( features.shape)
 plt.figure( figsize=(8,8))
 for c in clusters:
     plt.scatter( features_group[c][:,0].numpy(), features_group[c][:,1].numpy(), alpha=0.2, label=c)
 plt.title( f"{n_samples} Samples Per Class, Multiple Clusters per Class")
 plt.legend()

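Running the code above tells us the number of components (n_components) used for each class. In practice this would be found through a hyperparameter search, but since we already know the values here, we use them directly.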

 Class: 0 Clusters: 3
 Class: 1 Clusters: 5
 Class: 2 Clusters: 2
 Class: 3 Clusters: 8
 Class: 4 Clusters: 4

Then create the model:

 DEVICE = 'cuda' if torch.cuda.is_available() else 'cpu'  # fall back to CPU when no GPU is present
 gmmc = GMMClassifier(  n_features=2, n_classes=5, n_components=[3, 5, 2, 8, 4])
 gmmc.to( DEVICE)

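The training loop also changes a little, because this time we want to train on the classification loss computed from the model's logit predictions. That means we need to supply targets, as in any supervised learning setup.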

 features = features.to( gmmc.device)  # use the model's own device; DEVICE may not be defined
 targets = targets.to( gmmc.device)
 
 optim = torch.optim.Adam( gmmc.parameters(), lr=3e-2)
 loss_fn = torch.nn.CrossEntropyLoss()
 metrics = {'loss':[]}
 for i in range(4000):
     optim.zero_grad()
     logits = gmmc(  features, ret_logits=True)
     loss = loss_fn( logits, targets.type( torch.long))
     loss.backward()
     optim.step()
     metrics[ 'loss'].append( loss.item())
     print( f"{i} ) \t {metrics[ 'loss'][-1]:0.5f}", end=f"{' '*20}\r")
     if metrics[ 'loss'][-1] < 0.1:
         print( "---- Close enough")
         break
 print( f"Mean Loss: {np.mean( metrics[ 'loss']):0.5f}")
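Using per-class log-likelihoods as logits has a simple interpretation: a softmax over log p(x | class) is the Bayes posterior over classes under equal class priors, and the cross-entropy loss is the mean negative log of the posterior assigned to the true class. The sketch below verifies both statements on a few made-up log-likelihood values (the numbers and targets are purely illustrative).

```python
import torch

# Hypothetical per-class log-likelihoods log p(x | class): 3 samples, 2 classes.
log_lik = torch.tensor([[-1.0, -3.0],
                        [-2.5, -2.5],
                        [-4.0, -0.5]])

# Softmax over the class axis ...
posterior = torch.softmax(log_lik, dim=1)

# ... equals p(x | c) / sum_c' p(x | c'), i.e. Bayes' rule with equal priors.
lik = log_lik.exp()
bayes = lik / lik.sum(dim=1, keepdim=True)
print(torch.allclose(posterior, bayes))

# CrossEntropyLoss is the mean negative log-posterior of the true class.
targets = torch.tensor([0, 1, 1])
loss = torch.nn.CrossEntropyLoss()(log_lik, targets)
manual = -torch.log(posterior[torch.arange(3), targets]).mean()
print(torch.allclose(loss, manual))
```

Because the loss only depends on the ratios between class likelihoods, training this way pushes each class model to be discriminative rather than to maximize the likelihood of its own class data.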

Next we classify the validation data. It was generated together with the training data, so each sample is a different value from those in the training set but comes from the appropriate class.

 preds = gmmc( features_valid.to( gmmc.device))

Looking at the preds values, we can see that they are integers giving the predicted class.

 print( preds[0:10])
 
 ____
 tensor([2, 4, 2, 4, 2, 3, 4, 0, 2, 2], device='cuda:1')

Finally, by comparing these values with targets_valid, we can determine the model's accuracy.

 accuracy = (targets_valid == preds.cpu()).sum() / targets_valid.size(0) * 100.0  # targets_valid lives on the CPU
 print( f"Accuracy: {accuracy:0.2f}%")
 
 ____
 Accuracy: 81.50%

We can also look at the prediction accuracy for each class…

 class_acc = {}
 preds_cpu = preds.cpu()  # compare on the same device as targets_valid
 for c in range(5):
     target_idxs = (targets_valid == c)
     class_acc[c] = (targets_valid[ target_idxs] == preds_cpu[ target_idxs]).sum() / targets_valid[ target_idxs].size(0) * 100.0
     print( f"Class: {c} \t{class_acc[c]:0.2f}%")
 
 ----
 Class: 0  98.54%
 Class: 1  69.06%
 Class: 2  86.12%
 Class: 3  70.05%
 Class: 4  84.09%

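As you can see, the model does better at predicting the classes with less overlap, which makes sense. An average accuracy of 81.5% is also quite good, given that all of these classes overlap. I am sure there is still plenty of room for improvement. If you have suggestions, or can point out mistakes I have made, please leave a comment.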
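As a possible next step for seeing which overlapping classes get confused with each other, a confusion matrix can be computed from the targets and predictions. This is a minimal sketch, not part of the original experiment: the helper name confusion_matrix and the tiny example tensors are assumptions made for illustration.

```python
import torch


def confusion_matrix(targets, preds, n_classes):
    """Rows are true classes, columns are predicted classes (hypothetical helper)."""
    # Encode each (target, pred) pair as a single index, then count occurrences.
    idx = targets.long() * n_classes + preds.long()
    return torch.bincount(idx, minlength=n_classes ** 2).reshape(n_classes, n_classes)


# Tiny made-up example with 3 classes.
targets = torch.tensor([0, 0, 1, 1, 2, 2])
preds = torch.tensor([0, 1, 1, 1, 2, 0])

cm = confusion_matrix(targets, preds, 3)
print(cm)
# tensor([[1, 1, 0],
#         [0, 2, 0],
#         [1, 0, 1]])

# Per-class accuracy is the diagonal divided by the row sums.
per_class_acc = cm.diag() / cm.sum(dim=1)
print(per_class_acc)  # tensor([0.5000, 1.0000, 0.5000])
```

With the article's tensors, the call would be confusion_matrix(targets_valid, preds.cpu(), 5), assuming both are on the CPU.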
