Python Pandas讀取Excel數(shù)據(jù)并根據(jù)時間字段篩選數(shù)據(jù)

更新時間：2025年07月22日 10:05:06 作者：岫珩

這篇文章主要為大家詳細(xì)介紹了Python如何調(diào)用Pandas實現(xiàn)讀取Excel數(shù)據(jù)并根據(jù)時間字段篩選數(shù)據(jù),感興趣的小伙伴可以跟隨小編一起學(xué)習(xí)一下

1. 需求描述

現(xiàn)在有一個excel表格，其中包含設(shè)備字段device_id、最后使用時間字段end_time以及其他字段若干

需要將表格中的每個設(shè)備對應(yīng)的最新的使用時間篩選出來，并在結(jié)果中根據(jù)最新時間篩選出4月和5月

對應(yīng)的設(shè)備號列表

2. 讀取excel表格

import pandas as pd

# 讀取 Excel 文件
file_path = r"C:\Users\Downloads\file_record.xlsx"  # 替換為你的文件路徑
df = pd.read_excel(file_path)
# 顯示前幾行數(shù)據(jù)
# print(df.head())
# print(df)

3. 篩選最新時間

先根據(jù)時間重置DataFrame對象

# Assuming 'df' is your DataFrame and 'end_time' is initially in string format
df['end_time'] = pd.to_datetime(df['end_time'])  # Convert to datetime if necessary

然后根據(jù)設(shè)備號分組，再取end_time中最新即最大時間值，并重置索引

# Group by 'device_id' and find the max (latest) 'end_time' for each group
latest_end_times = df.groupby('device_id')['end_time'].max().reset_index()

4. 篩選具體月份數(shù)據(jù)

在上面的最新時間中篩選出4月和5月的設(shè)備列表

# Filter the 'latest_end_times' DataFrame to only include devices with 'end_time' in April or May
filtered_devices = latest_end_times[
    (latest_end_times['end_time'].dt.month == 4) | 
    (latest_end_times['end_time'].dt.month == 5)
]

5.輸出結(jié)果

遍歷結(jié)果中設(shè)備和時間信息

for index, row in filtered_devices.iterrows():
    device_id = row['device_id']
    latest_end_time = row['end_time']
    print(f"Device ID: {device_id}, Latest End Time: {latest_end_time}")


# 'filtered_devices' now contains the device information for which the latest 'end_time' is in April or May

6. 完整代碼

完整代碼如下

import pandas as pd

# 讀取 Excel 文件
file_path = r"C:\Users\Downloads\file_record.xlsx"  # 替換為你的文件路徑
df = pd.read_excel(file_path)

# 顯示前幾行數(shù)據(jù)
# print(df.head())
# print(df)

# Assuming 'df' is your DataFrame and 'end_time' is initially in string format
df['end_time'] = pd.to_datetime(df['end_time'])  # Convert to datetime if necessary
# print(df.head())

# Group by 'device_id' and find the max (latest) 'end_time' for each group
latest_end_times = df.groupby('device_id')['end_time'].max().reset_index()
# print(df)


# Filter the 'latest_end_times' DataFrame to only include devices with 'end_time' in April or May
filtered_devices = latest_end_times[
    (latest_end_times['end_time'].dt.month == 4) | 
    (latest_end_times['end_time'].dt.month == 5)
]

for index, row in filtered_devices.iterrows():
    device_id = row['device_id']
    latest_end_time = row['end_time']
    print(f"Device ID: {device_id}, Latest End Time: {latest_end_time}")


# 'filtered_devices' now contains the device information for which the latest 'end_time' is in April or May

到此這篇關(guān)于Python Pandas讀取Excel數(shù)據(jù)并根據(jù)時間字段篩選數(shù)據(jù)的文章就介紹到這了,更多相關(guān)Pandas讀取Excel數(shù)據(jù)內(nèi)容請搜索腳本之家以前的文章或繼續(xù)瀏覽下面的相關(guān)文章希望大家以后多多支持腳本之家！

您可能感興趣的文章: