A Code Walkthrough: Implementing Simple Linear Regression in C#
Preface
I recently came across NumSharp and wanted to learn it. The best way to learn is to practice, so I found a simple linear regression implemented in Python on GitHub and rewrote it in C# on top of NumSharp.
About NumSharp
NumSharp (NumPy for C#) is a multidimensional array library implemented in C#, inspired by Python's NumPy. It provides NumPy-like array objects along with a rich set of operations on them. It is an open-source project that aims to give C# developers an efficient array-processing tool for scientific computing, data analysis, machine learning, and related fields.
The Python code
The Python code comes from: https://github.com/llSourcell/linear_regression_live
After downloading the repository locally, the Python code is as follows:
# The optimal values of m and b can actually be calculated with way less effort than doing a linear regression.
# This is just to demonstrate gradient descent.

from numpy import *

# y = mx + b
# m is slope, b is y-intercept
def compute_error_for_line_given_points(b, m, points):
    totalError = 0
    for i in range(0, len(points)):
        x = points[i, 0]
        y = points[i, 1]
        totalError += (y - (m * x + b)) ** 2
    return totalError / float(len(points))

def step_gradient(b_current, m_current, points, learningRate):
    b_gradient = 0
    m_gradient = 0
    N = float(len(points))
    for i in range(0, len(points)):
        x = points[i, 0]
        y = points[i, 1]
        b_gradient += -(2/N) * (y - ((m_current * x) + b_current))
        m_gradient += -(2/N) * x * (y - ((m_current * x) + b_current))
    new_b = b_current - (learningRate * b_gradient)
    new_m = m_current - (learningRate * m_gradient)
    return [new_b, new_m]

def gradient_descent_runner(points, starting_b, starting_m, learning_rate, num_iterations):
    b = starting_b
    m = starting_m
    for i in range(num_iterations):
        b, m = step_gradient(b, m, array(points), learning_rate)
    return [b, m]

def run():
    points = genfromtxt("data.csv", delimiter=",")
    learning_rate = 0.0001
    initial_b = 0  # initial y-intercept guess
    initial_m = 0  # initial slope guess
    num_iterations = 1000
    print("Starting gradient descent at b = {0}, m = {1}, error = {2}".format(initial_b, initial_m, compute_error_for_line_given_points(initial_b, initial_m, points)))
    print("Running...")
    [b, m] = gradient_descent_runner(points, initial_b, initial_m, learning_rate, num_iterations)
    print("After {0} iterations b = {1}, m = {2}, error = {3}".format(num_iterations, b, m, compute_error_for_line_given_points(b, m, points)))

if __name__ == '__main__':
    run()
Rewriting it in C#
First, create a C# console application and add the NumSharp NuGet package.
Now let's rewrite the code in C#, step by step.
Python code:
points = genfromtxt("data.csv", delimiter=",")
NumSharp has no genfromtxt method, so we have to write one ourselves.
C# code:
// Create a list of doubles
List<double> Array = new List<double>();

// Path to the CSV file
string filePath = "your data.csv path";

// Call ReadCsv to load the CSV data
Array = ReadCsv(filePath);

var array = np.array(Array).reshape(100, 2);

static List<double> ReadCsv(string filePath)
{
    List<double> array = new List<double>();
    try
    {
        // Read all lines of the CSV file
        string[] lines = File.ReadAllLines(filePath);

        // Iterate over each line
        foreach (string line in lines)
        {
            // Split the line on commas
            string[] values = line.Split(',');

            // Parse each value and add it to the list
            foreach (string value in values)
            {
                array.Add(Convert.ToDouble(value));
            }
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine("Error: " + ex.Message);
    }
    return array;
}
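If you don't have the original data.csv at hand, a compatible file — 100 rows of comma-separated x,y pairs, no header — can be generated with a short Python script. The line y ≈ 1.5x + 10 and the noise level below are arbitrary choices for illustration, not the original dataset:

```python
import random

# Write a synthetic data.csv in the shape the reader above expects:
# 100 rows, each "x,y", no header.
random.seed(0)
with open("data.csv", "w") as f:
    for _ in range(100):
        x = random.uniform(20, 80)
        y = 1.5 * x + 10 + random.gauss(0, 5)  # arbitrary line plus noise
        f.write(f"{x},{y}\n")
```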
Python code:
def compute_error_for_line_given_points(b, m, points):
    totalError = 0
    for i in range(0, len(points)):
        x = points[i, 0]
        y = points[i, 1]
        totalError += (y - (m * x + b)) ** 2
    return totalError / float(len(points))
This computes the mean squared error (MSE).
C# code:
public static double compute_error_for_line_given_points(double b, double m, NDArray array)
{
    double totalError = 0;
    for (int i = 0; i < array.shape[0]; i++)
    {
        double x = array[i, 0];
        double y = array[i, 1];
        totalError += Math.Pow(y - (m * x + b), 2);
    }
    return totalError / array.shape[0];
}
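To make the computation concrete, here is a tiny hand-checkable example in Python (the three points are invented for illustration):

```python
# MSE of the line y = 2x + 0 over three invented points.
points = [(1.0, 2.0), (2.0, 5.0), (3.0, 6.0)]
b, m = 0.0, 2.0

# Predictions are 2, 4, 6, so the residuals are 0, 1, 0.
total_error = sum((y - (m * x + b)) ** 2 for x, y in points)
mse = total_error / len(points)
print(mse)  # -> 0.3333333333333333
```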
Python code:
def gradient_descent_runner(points, starting_b, starting_m, learning_rate, num_iterations):
    b = starting_b
    m = starting_m
    for i in range(num_iterations):
        b, m = step_gradient(b, m, array(points), learning_rate)
    return [b, m]

def step_gradient(b_current, m_current, points, learningRate):
    b_gradient = 0
    m_gradient = 0
    N = float(len(points))
    for i in range(0, len(points)):
        x = points[i, 0]
        y = points[i, 1]
        b_gradient += -(2/N) * (y - ((m_current * x) + b_current))
        m_gradient += -(2/N) * x * (y - ((m_current * x) + b_current))
    new_b = b_current - (learningRate * b_gradient)
    new_m = m_current - (learningRate * m_gradient)
    return [new_b, new_m]
This uses gradient descent to iteratively update the parameters b and m of y = mx + b.
In this example the error is measured by the mean squared error, so the MSE is our cost function (also called the loss function); we want to find the values of b and m that minimize it.
The cost function is the mean squared error:

J(θ1, θ2) = (1/N) * Σ (y_i - (θ2 * x_i + θ1))^2

Taking the partial derivative with respect to θ1 (θ1 plays the role of b in y = mx + b):

?J/?θ1 = -(2/N) * Σ (y_i - (θ2 * x_i + θ1))

Then the partial derivative with respect to θ2 (θ2 plays the role of m in y = mx + b):

?J/?θ2 = -(2/N) * Σ x_i * (y_i - (θ2 * x_i + θ1))

Applying gradient descent, θ1 and θ2 are updated as:

θ1 := θ1 - α * ?J/?θ1
θ2 := θ2 - α * ?J/?θ2
Here α is the learning rate. θ1 and θ2 start from arbitrary initial values; the gradient is large at first and gradually approaches zero. When the gradient reaches zero, θ1 and θ2 stop changing; otherwise the loop ends once the configured number of iterations is reached. For a fuller explanation of the theory, see https://www.geeksforgeeks.org/ml-linear-regression/, which is also the source of the figures used in this article.
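As a quick sanity check on the derivative formulas above, the analytic gradients can be compared against numerical finite-difference gradients on a tiny invented dataset (the points and the evaluation point b = 0.5, m = 1.5 are arbitrary):

```python
# Compare the analytic MSE gradients with central finite differences.
points = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # invented data

def mse(b, m):
    return sum((y - (m * x + b)) ** 2 for x, y in points) / len(points)

def analytic_gradients(b, m):
    N = len(points)
    b_grad = sum(-(2 / N) * (y - (m * x + b)) for x, y in points)
    m_grad = sum(-(2 / N) * x * (y - (m * x + b)) for x, y in points)
    return b_grad, m_grad

def numeric_gradients(b, m, eps=1e-6):
    b_grad = (mse(b + eps, m) - mse(b - eps, m)) / (2 * eps)
    m_grad = (mse(b, m + eps) - mse(b, m - eps)) / (2 * eps)
    return b_grad, m_grad

ab, am = analytic_gradients(0.5, 1.5)
nb, nm = numeric_gradients(0.5, 1.5)
print(abs(ab - nb) < 1e-4, abs(am - nm) < 1e-4)  # both should be True
```

Because the MSE is quadratic in b and m, the central differences agree with the analytic derivatives up to floating-point error.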
In short, the Python code above iterates with gradient descent to find the best-fitting parameters. Now let's rewrite it in C#:
public static double[] gradient_descent_runner(NDArray array, double starting_b, double starting_m, double learningRate, double num_iterations)
{
    double[] args = new double[2];
    args[0] = starting_b;
    args[1] = starting_m;

    for (int i = 0; i < num_iterations; i++)
    {
        args = step_gradient(args[0], args[1], array, learningRate);
    }

    return args;
}

public static double[] step_gradient(double b_current, double m_current, NDArray array, double learningRate)
{
    double[] args = new double[2];
    double b_gradient = 0;
    double m_gradient = 0;
    double N = array.shape[0];

    for (int i = 0; i < array.shape[0]; i++)
    {
        double x = array[i, 0];
        double y = array[i, 1];
        b_gradient += -(2 / N) * (y - ((m_current * x) + b_current));
        m_gradient += -(2 / N) * x * (y - ((m_current * x) + b_current));
    }

    double new_b = b_current - (learningRate * b_gradient);
    double new_m = m_current - (learningRate * m_gradient);
    args[0] = new_b;
    args[1] = new_m;

    return args;
}
The complete C# rewrite:
using NumSharp;

namespace LinearRegressionDemo
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Create a list of doubles
            List<double> Array = new List<double>();

            // Path to the CSV file
            string filePath = "your data.csv path";

            // Call ReadCsv to load the CSV data
            Array = ReadCsv(filePath);

            var array = np.array(Array).reshape(100, 2);

            double learning_rate = 0.0001;
            double initial_b = 0;
            double initial_m = 0;
            double num_iterations = 1000;

            Console.WriteLine($"Starting gradient descent at b = {initial_b}, m = {initial_m}, error = {compute_error_for_line_given_points(initial_b, initial_m, array)}");
            Console.WriteLine("Running...");
            double[] Args = gradient_descent_runner(array, initial_b, initial_m, learning_rate, num_iterations);
            Console.WriteLine($"After {num_iterations} iterations b = {Args[0]}, m = {Args[1]}, error = {compute_error_for_line_given_points(Args[0], Args[1], array)}");
            Console.ReadLine();
        }

        static List<double> ReadCsv(string filePath)
        {
            List<double> array = new List<double>();
            try
            {
                // Read all lines of the CSV file
                string[] lines = File.ReadAllLines(filePath);

                // Iterate over each line
                foreach (string line in lines)
                {
                    // Split the line on commas
                    string[] values = line.Split(',');

                    // Parse each value and add it to the list
                    foreach (string value in values)
                    {
                        array.Add(Convert.ToDouble(value));
                    }
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine("Error: " + ex.Message);
            }
            return array;
        }

        public static double compute_error_for_line_given_points(double b, double m, NDArray array)
        {
            double totalError = 0;
            for (int i = 0; i < array.shape[0]; i++)
            {
                double x = array[i, 0];
                double y = array[i, 1];
                totalError += Math.Pow(y - (m * x + b), 2);
            }
            return totalError / array.shape[0];
        }

        public static double[] step_gradient(double b_current, double m_current, NDArray array, double learningRate)
        {
            double[] args = new double[2];
            double b_gradient = 0;
            double m_gradient = 0;
            double N = array.shape[0];

            for (int i = 0; i < array.shape[0]; i++)
            {
                double x = array[i, 0];
                double y = array[i, 1];
                b_gradient += -(2 / N) * (y - ((m_current * x) + b_current));
                m_gradient += -(2 / N) * x * (y - ((m_current * x) + b_current));
            }

            double new_b = b_current - (learningRate * b_gradient);
            double new_m = m_current - (learningRate * m_gradient);
            args[0] = new_b;
            args[1] = new_m;

            return args;
        }

        public static double[] gradient_descent_runner(NDArray array, double starting_b, double starting_m, double learningRate, double num_iterations)
        {
            double[] args = new double[2];
            args[0] = starting_b;
            args[1] = starting_m;

            for (int i = 0; i < num_iterations; i++)
            {
                args = step_gradient(args[0], args[1], array, learningRate);
            }

            return args;
        }
    }
}
Output of the Python code:
Output of the C# code:
The two results are identical, which shows the rewrite is correct.
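As the comment at the top of the Python script points out, the optimal values of m and b can also be computed directly, with far less effort, via the closed-form least-squares solution. This is a handy way to check what gradient descent should converge toward. A Python sketch (the four-point dataset is invented for illustration and lies exactly on y = 2x + 1):

```python
import numpy as np

# Closed-form ordinary least squares for y = m*x + b.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])

# m = covariance(x, y) / variance(x); b = mean(y) - m * mean(x)
m = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
b = y.mean() - m * x.mean()
print(m, b)  # -> 2.0 1.0
```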
Summary
In this article I used NumSharp in C# to rewrite a simple linear regression originally implemented in Python. An exercise like this deepens one's understanding of how linear regression works and doubles as practice with NumSharp.