Finding the Row-Wise Maximum and Its Index in pandas

Updated: 2024-04-03 09:29:31  Author: 數據小白的進階之路

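At work I needed to look up the largest value in each row together with its index. This article shows how to get the row-wise maximum and its index in pandas, and may be a useful reference for anyone with a similar task.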

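After training a model, the predicted values often need further processing. For example, given the class probabilities that a model outputs, we may want the maximum of each row and, in a separate column, the position of the column that holds it.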

The data looks like this:

array
array([[ 0.47288769,  0.23982215,  0.2261405 ,  0.06114962],
       [ 0.67969596,  0.11435176,  0.17647322,  0.02947907],
       [ 0.00621393,  0.01652142,  0.31117165,  0.66609299],
       [ 0.24093366,  0.23636758,  0.30113828,  0.22156043],
       [ 0.44093642,  0.2245989 ,  0.24515967,  0.08930501],
       [ 0.05540339,  0.10013942,  0.30361843,  0.54083872],
       [ 0.11221886,  0.75674808,  0.09237131,  0.03866173],
       [ 0.24885316,  0.28243011,  0.28312165,  0.18559511],
       [ 0.01205211,  0.03740638,  0.271065  ,  0.67947656]], dtype=float32)

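The goal is to turn the array above into a DataFrame and append two columns: one holding each row's maximum, and one holding the position (column index) of that maximum within the row.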

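First, a quick look at the argmax function.

argmax(a, axis=None)

# a is the input array

# axis is the axis to search along. The default, None, flattens the array first; axis=1 returns, for each row, the position of its maximum; axis=0 returns, for each column, the position of its maximum.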
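
To make the axis argument concrete, here is a minimal sketch on a small made-up array (the values and the variable names `a`, `flat_pos`, `per_row`, `per_col` are illustrative and not part of the original data):

```python
import numpy as np

# Small made-up array, used only to illustrate the axis argument
a = np.array([[0.1, 0.7, 0.2],
              [0.6, 0.3, 0.9]])

flat_pos = np.argmax(a)          # position of the max in the flattened array
per_row = np.argmax(a, axis=1)   # for each row, column position of its max
per_col = np.argmax(a, axis=0)   # for each column, row position of its max

print(flat_pos)  # 5
print(per_row)   # [1 2]
print(per_col)   # [1 0 1]
```

For the task in this article, axis=1 is the relevant case: one result per row.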

For a DataFrame, the procedure is as follows.

The code:

# Import libraries
import pandas as pd
import numpy as np
# Convert the array to a DataFrame
arr=pd.DataFrame(array,columns=["one","two","three","four"])
# Row-wise maximum and the column position of that maximum
arr['max_value']=arr.max(axis=1)
arr['max_index']=np.argmax(array,axis=1)
# Result:
arr
Out[28]: 
        one       two     three      four  max_value  max_index
0  0.472888  0.239822  0.226140  0.061150   0.472888          0
1  0.679696  0.114352  0.176473  0.029479   0.679696          0
2  0.006214  0.016521  0.311172  0.666093   0.666093          3
3  0.240934  0.236368  0.301138  0.221560   0.301138          2
4  0.440936  0.224599  0.245160  0.089305   0.440936          0
5  0.055403  0.100139  0.303618  0.540839   0.540839          3
6  0.112219  0.756748  0.092371  0.038662   0.756748          1
7  0.248853  0.282430  0.283122  0.185595   0.283122          2
8  0.012052  0.037406  0.271065  0.679477   0.679477          3
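
The order of the two assignments matters: if the index column were added first, a later arr.max(axis=1) would include it in the row-wise maximum. A possible way to sidestep this, sketched here as an alternative rather than the article's method, is to compute both results from the original columns only. The sketch also uses pandas' DataFrame.idxmax, which returns the column label instead of the position. The names `df`, `value_cols`, and `max_label` are illustrative, and only the first three rows of the sample data are used.

```python
import numpy as np
import pandas as pd

# First three rows of the sample data above
array = np.array([[0.47288769, 0.23982215, 0.2261405, 0.06114962],
                  [0.67969596, 0.11435176, 0.17647322, 0.02947907],
                  [0.00621393, 0.01652142, 0.31117165, 0.66609299]],
                 dtype=np.float32)
df = pd.DataFrame(array, columns=["one", "two", "three", "four"])

# Compute both results from the original columns first, then attach them,
# so the new columns never take part in the row-wise max
value_cols = ["one", "two", "three", "four"]
max_value = df[value_cols].max(axis=1)
max_label = df[value_cols].idxmax(axis=1)   # column label, not position

df["max_value"] = max_value
df["max_label"] = max_label
print(df)
```

Whether a label or a numeric position is more convenient depends on what the column is used for afterwards; np.argmax remains the direct choice when a position is needed.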

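What if we now want the second-largest value in each row and its index?

Approach: set each row's maximum to 0, then find each row's maximum and its index again. This works here because the entries are probabilities and therefore non-negative.

The implementation: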

# Set each row's maximum to 0 (this modifies array in place)
array[arr.index,np.argmax(array,axis=1)]=0
array
array([[ 0.        ,  0.23982215,  0.2261405 ,  0.06114962],
       [ 0.        ,  0.11435176,  0.17647322,  0.02947907],
       [ 0.00621393,  0.01652142,  0.31117165,  0.        ],
       [ 0.24093366,  0.23636758,  0.        ,  0.22156043],
       [ 0.        ,  0.2245989 ,  0.24515967,  0.08930501],
       [ 0.05540339,  0.10013942,  0.30361843,  0.        ],
       [ 0.11221886,  0.        ,  0.09237131,  0.03866173],
       [ 0.24885316,  0.28243011,  0.        ,  0.18559511],
       [ 0.01205211,  0.03740638,  0.271065  ,  0.        ]], dtype=float32)
# Take the second-largest value and its index
arr['second_value']=array.max(axis=1)
arr['second_index']=np.argmax(array,axis=1)
arr
Out[208]: 
        one       two     three      four  max_value  max_index  second_value  \
0  0.472888  0.239822  0.226140  0.061150   0.472888          0      0.239822   
1  0.679696  0.114352  0.176473  0.029479   0.679696          0      0.176473   
2  0.006214  0.016521  0.311172  0.666093   0.666093          3      0.311172   
3  0.240934  0.236368  0.301138  0.221560   0.301138          2      0.240934   
4  0.440936  0.224599  0.245160  0.089305   0.440936          0      0.245160   
5  0.055403  0.100139  0.303618  0.540839   0.540839          3      0.303618   
6  0.112219  0.756748  0.092371  0.038662   0.756748          1      0.112219   
7  0.248853  0.282430  0.283122  0.185595   0.283122          2      0.282430   
8  0.012052  0.037406  0.271065  0.679477   0.679477          3      0.271065   

   second_index  
0             1  
1             2  
2             2  
3             0  
4             2  
5             2  
6             0  
7             1  
8             2
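
Zeroing the maximum overwrites the original array. If the data has to stay intact, one possible alternative (a sketch, not the approach used above) is to sort the column positions of each row with np.argsort and take the second-to-last one. The names `probs`, `order`, and `rows` are illustrative; the three rows are rows 0, 3, and 6 of the sample data.

```python
import numpy as np

# Rows 0, 3 and 6 of the sample data above
probs = np.array([[0.47288769, 0.23982215, 0.2261405, 0.06114962],
                  [0.24093366, 0.23636758, 0.30113828, 0.22156043],
                  [0.11221886, 0.75674808, 0.09237131, 0.03866173]],
                 dtype=np.float32)

# Column positions of each row, ordered by ascending value
order = np.argsort(probs, axis=1)
second_index = order[:, -2]              # position of the second-largest value
rows = np.arange(probs.shape[0])
second_value = probs[rows, second_index]

print(second_index)  # [1 0 0]
print(second_value)
```

These positions agree with the second_index column above for the same rows, and probs is left unchanged. When a row contains tied values, the two approaches may pick different positions.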
