Sampling one class of the training data with pandas in Python: an example
Updated: 2020-02-28 11:25:28 Author: Yan456jie
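This article shows an example of using pandas in Python to sample one class of the training data: the negative class (label 0) is downsampled to the same number of rows as the positive class (label 1).
Without further ado, here is the code.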
# -*- coding: utf-8 -*-
import pandas as pd

# Columns in the source file:
# creativeID, userID, positionID, clickTime, conversionTime, connectionType,
# telecomsOperator, appPlatform, sitesetID, positionType, age, gender,
# education, marriageStatus, haveBaby, hometown, residence, appID, appCategory, label


def test():
    df = pd.read_csv("/var/lib/mysql-files/data1.csv")
    # Keep only the feature columns and the label.
    df1 = df[["connectionType", "telecomsOperator", "appPlatform", "sitesetID",
              "positionType", "age", "gender", "education", "marriageStatus",
              "haveBaby", "hometown", "residence", "appCategory", "label"]]
    print(df1["label"].value_counts())

    # Split the rows by class: label 0 is negative, label 1 is positive.
    N_data = df1[df1["label"] == 0]
    P_data = df1[df1["label"] == 1]
    # Downsample the negatives, without replacement, to the number of positives.
    # This fails if there are fewer negatives than positives.
    N_data = N_data.sample(n=P_data.shape[0], replace=False, random_state=2)
    print(P_data.shape)
    print(N_data.shape)

    # Recombine the two classes, then shuffle the rows and rebuild the index.
    data = pd.concat([N_data, P_data])
    print(data.shape)
    data = data.sample(frac=1).reset_index(drop=True)
    print(data[["label"]])
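The same balancing idea can be tried without the original CSV file. The following is a minimal sketch on made-up data, not the author's code: the helper name undersample_majority and the toy DataFrame are assumptions for illustration, and the helper downsamples every class to the size of the smallest one instead of hard-coding labels 0 and 1.

```python
import pandas as pd


def undersample_majority(df, label_col="label", random_state=2):
    """Downsample every class to the size of the smallest class, then shuffle.

    Hypothetical helper for illustration; not part of pandas.
    """
    smallest = df[label_col].value_counts().min()
    parts = [
        group.sample(n=smallest, replace=False, random_state=random_state)
        for _, group in df.groupby(label_col)
    ]
    balanced = pd.concat(parts)
    # Shuffle the rows so the classes are interleaved, and rebuild the index.
    return balanced.sample(frac=1, random_state=random_state).reset_index(drop=True)


# Toy data (made up): 8 negatives (label 0) and 2 positives (label 1).
toy = pd.DataFrame({"age": range(10), "label": [0] * 8 + [1] * 2})
balanced = undersample_majority(toy)

# Each class now has 2 rows, 4 rows in total.
print(balanced["label"].value_counts())
```

Passing a fixed random_state makes the selection repeatable, which helps when comparing models trained on the balanced data.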
Addendum: sampling a DataFrame with pandas
Random sampling
import pandas as pd
# Randomly draw 2000 rows from the DataFrame df
df.sample(n=2000)
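DataFrame.sample accepts either a row count (n) or a fraction (frac), and can sample with replacement. A small sketch of the three common calls, using a made-up 100-row DataFrame as an assumption:

```python
import pandas as pd

# Made-up data for illustration: 100 rows.
df = pd.DataFrame({"x": range(100)})

# Exactly 10 distinct rows.
fixed = df.sample(n=10, random_state=0)
# 10% of the rows.
tenth = df.sample(frac=0.1, random_state=0)
# Sampling with replacement allows more rows than the frame holds; rows may repeat.
boot = df.sample(n=150, replace=True, random_state=0)

print(len(fixed), len(tenth), len(boot))  # prints: 10 10 150
```

n and frac are mutually exclusive; pass only one of them.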
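Stratified sampling
The functions in sklearn allow more flexible sampling. Passing stratify=y to train_test_split keeps the class proportions of y the same in both splits.
from sklearn.model_selection import train_test_split
# y is one of the attribute columns of X
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, stratify=y)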
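When only a stratified sample is needed and scikit-learn is not at hand, pandas alone can do it. Note that this swaps in a different technique from the train_test_split call above: groupby(...).sample(frac=...), available in pandas 1.1 and later. The data and the names holdout and train are made up for illustration.

```python
import pandas as pd

# Made-up data for illustration: 80 rows with label 0 and 20 rows with label 1.
df = pd.DataFrame({"feature": range(100), "label": [0] * 80 + [1] * 20})

# Draw 20% of the rows from each label group, so the 80/20 ratio is preserved.
holdout = df.groupby("label").sample(frac=0.2, random_state=0)
# Everything that was not drawn forms the remaining set.
train = df.drop(holdout.index)

# The holdout has 16 rows with label 0 and 4 rows with label 1.
print(holdout["label"].value_counts())
print(len(train))  # prints: 80
```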