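Python Clustering Algorithms: Basic K-Means Explained with an Example
Updated: 2015-11-20 10:44:19 Author: intergret
This article introduces the basic K-means clustering algorithm in Python. It explains how basic K-means works and walks through a complete implementation as a worked example, which readers may find useful as a reference.
The example below implements basic K-means in Python. The details are as follows: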
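Basic K-means: choose K initial centroids, where K is a user-specified parameter, namely the desired number of clusters. In each iteration, every point is assigned to its nearest centroid, and the set of points assigned to the same centroid forms a cluster. Then the centroid of each cluster is updated from the points assigned to that cluster. The assignment and update steps are repeated until the centroids no longer change noticeably.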
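The steps above can be sketched for an arbitrary K as follows. This is a minimal illustration rather than the article's own script: the function name `kmeans` and the parameters `max_iter` and `tol` are assumptions introduced here, the points are assumed to be 2-D, and Python 3 division semantics are assumed. An empty cluster simply keeps its previous centroid, which is one possible convention among several.

```python
import math

def kmeans(points, centers, max_iter=50, tol=1e-6):
    """Basic K-means on 2-D points. Returns (centers, groups).

    `centers` holds the K initial centroids; `tol` is the largest
    centroid movement that still counts as "no noticeable change".
    """
    centers = [[float(c[0]), float(c[1])] for c in centers]
    groups = [[] for _ in centers]
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        groups = [[] for _ in centers]
        for p in points:
            dists = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centers]
            groups[dists.index(min(dists))].append(p)
        # Update step: each centroid becomes the mean of its cluster.
        new_centers = []
        for group, old in zip(groups, centers):
            if group:
                n = len(group)
                new_centers.append([sum(p[0] for p in group) / n,
                                    sum(p[1] for p in group) / n])
            else:
                # Assumption: an empty cluster keeps its previous centroid.
                new_centers.append(old)
        # Stop once no centroid has moved by more than tol.
        shift = max(math.hypot(n[0] - o[0], n[1] - o[1])
                    for n, o in zip(new_centers, centers))
        centers = new_centers
        if shift <= tol:
            break
    return centers, groups
```

For example, `kmeans([[0, 0], [0, 2], [10, 10], [10, 12]], [[0, 0], [10, 10]])` separates the two pairs of points and moves each centroid to the midpoint of its pair.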
# coding=utf-8
import matplotlib.pyplot as pl

def squared_distance(point, center):
    # Squared Euclidean distance; the square root is not needed to find the nearest centroid
    return (point[0] - center[0]) ** 2 + (point[1] - center[1]) ** 2

def centroid(group, previous):
    # Mean of the points in a cluster; an empty cluster keeps its previous centroid
    if not group:
        return previous
    n = float(len(group))  # float avoids integer division under Python 2
    return [sum(p[0] for p in group) / n, sum(p[1] for p in group) / n]

# Read the points: each line of the file "points" holds one "x#y" pair
with open("points", "r") as f:
    points = [[int(v) for v in line.split("#")[:2]] for line in f if line.strip()]

# Choose three initial centroids and plot them as black dots
currentCenter1 = [20, 190]; currentCenter2 = [120, 90]; currentCenter3 = [170, 140]
for c in (currentCenter1, currentCenter2, currentCenter3):
    pl.plot([c[0]], [c[1]], 'ok')

# Record the trajectory of each cluster's centroid across iterations
center1 = [currentCenter1]; center2 = [currentCenter2]; center3 = [currentCenter3]

# Run a fixed 50 iterations (no explicit convergence test)
for runtime in range(50):
    # The three clusters are rebuilt from scratch in every iteration
    group1 = []; group2 = []; group3 = []
    for eachpoint in points:
        # Compute the distance from the point to each of the three centroids
        distance1 = squared_distance(eachpoint, currentCenter1)
        distance2 = squared_distance(eachpoint, currentCenter2)
        distance3 = squared_distance(eachpoint, currentCenter3)
        # Assign the point to the cluster of its nearest centroid
        mindis = min(distance1, distance2, distance3)
        if mindis == distance1:
            group1.append(eachpoint)
        elif mindis == distance2:
            group2.append(eachpoint)
        else:
            group3.append(eachpoint)
    # After all points are assigned, update the centroid of each cluster
    currentCenter1 = centroid(group1, currentCenter1)
    currentCenter2 = centroid(group2, currentCenter2)
    currentCenter3 = centroid(group3, currentCenter3)
    # Record this update of the centroids
    center1.append(currentCenter1)
    center2.append(currentCenter2)
    center3.append(currentCenter3)

# Plot all points, using color to mark the cluster each point belongs to
pl.plot([p[0] for p in group1], [p[1] for p in group1], 'or')
pl.plot([p[0] for p in group2], [p[1] for p in group2], 'oy')
pl.plot([p[0] for p in group3], [p[1] for p in group3], 'og')

# Plot the update trajectory of each cluster's centroid as a black line
for center in [center1, center2, center3]:
    pl.plot([c[0] for c in center], [c[1] for c in center], 'k')
pl.show()
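The script reads its input from a file named points, and its parsing code implies one pair of integers per line, separated by "#". The article does not include that file, so the following is only a sketch for producing test input: the helper name `write_sample_points`, the three blob locations, and the spread of 15 are all made-up assumptions, not the author's data.

```python
import random

def write_sample_points(path="points", per_cluster=30, seed=0):
    """Write random 2-D integer points, one "x#y" pair per line.

    Returns the number of points written. The blob centers below are
    hypothetical values chosen only to give three visible groups.
    """
    rng = random.Random(seed)
    blobs = [(40, 160), (110, 60), (170, 150)]  # made-up cluster locations
    lines = []
    for cx, cy in blobs:
        for _ in range(per_cluster):
            x = int(rng.gauss(cx, 15))
            y = int(rng.gauss(cy, 15))
            lines.append("%d#%d" % (x, y))
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")
    return len(lines)
```

Calling `write_sample_points()` once in the script's working directory creates a points file that the script can then read.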
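A screenshot of the result follows: [figure: the points colored by cluster, with each centroid's update trajectory drawn as a black line]
Hopefully this article is helpful to readers working on Python programming.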