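A Detailed Guide to Excel Processing with Pandas and openpyxl

Updated: 2025-02-07 08:42:59 Author: victor66

This article walks through several ways to process multiple Excel files with the pandas and openpyxl libraries. Each approach comes with sample code.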

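1. Read and merge multiple Excel files
2. Batch-process multiple Excel files
3. Extract specific information from multiple Excel files
4. Process multiple Excel files with openpyxl
5. Merge multiple Excel files into separate worksheets of one workbook
6. Batch-process multiple Excel files with data cleaning
7. Extract specific columns from multiple Excel files and merge them
8. Batch-rename worksheets in multiple Excel files
9. Batch-export Excel data to CSV files
10. Batch-process multiple Excel files with data analysis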

1. Read and merge multiple Excel files

Suppose you have a folder containing several Excel files and want to combine them into a single DataFrame.

import pandas as pd
import os
# Folder that holds the Excel files
folder_path = 'path/to/your/excel/files'
# Collect every Excel file in the folder (reading .xls requires the xlrd package)
excel_files = [f for f in os.listdir(folder_path) if f.endswith(('.xlsx', '.xls'))]
# Read each file into its own DataFrame
frames = []
for file in excel_files:
    file_path = os.path.join(folder_path, file)
    frames.append(pd.read_excel(file_path))
# Concatenate once at the end; calling pd.concat inside the loop copies the data on every pass
all_data = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()
# Preview the merged data
print(all_data.head())
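The file listing used throughout this article also picks up the hidden lock files (names starting with `~$`) that Excel creates while a workbook is open, and it misses upper-case extensions. The following is a minimal sketch of a reusable helper that handles both cases. The name `list_excel_files` is hypothetical and not part of pandas or openpyxl.

```python
import os
import tempfile

def list_excel_files(folder_path, extensions=('.xlsx', '.xls')):
    """Return sorted Excel file names in folder_path, skipping Excel lock files."""
    names = []
    for name in os.listdir(folder_path):
        # Excel writes a "~$<name>" lock file next to a workbook that is open
        if name.startswith('~$'):
            continue
        # Compare in lower case so that "REPORT.XLSX" is also matched
        if name.lower().endswith(extensions):
            names.append(name)
    return sorted(names)

# Small demonstration in a temporary folder with empty placeholder files
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ('b.xlsx', 'a.XLS', '~$b.xlsx', 'notes.txt'):
        open(os.path.join(demo_dir, name), 'w').close()
    found = list_excel_files(demo_dir)

print(found)
```

Sorting the names makes the merge order reproducible, since `os.listdir` returns entries in arbitrary order.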

2. Batch-process multiple Excel files

Suppose you need to apply the same processing to several Excel files, such as adding a column or filtering rows.

import pandas as pd
import os
# Input and output folders
folder_path = 'path/to/your/excel/files'
output_folder = 'path/to/output/folder'
# Make sure the output folder exists
os.makedirs(output_folder, exist_ok=True)
# Collect every Excel file in the folder
excel_files = [f for f in os.listdir(folder_path) if f.endswith(('.xlsx', '.xls'))]
# Process each Excel file
for file in excel_files:
    file_path = os.path.join(folder_path, file)
    df = pd.read_excel(file_path)
    # Add a column
    df['New_Column'] = 'Some Value'
    # Keep only the rows where Some_Column is greater than 100
    filtered_df = df[df['Some_Column'] > 100]
    # Always save as .xlsx: pandas 2.0 and later cannot write the legacy .xls format
    base_name = os.path.splitext(file)[0]
    output_file_path = os.path.join(output_folder, f'{base_name}.xlsx')
    filtered_df.to_excel(output_file_path, index=False)
print("Processing complete.")

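3. Extract specific information from multiple Excel files

Suppose you need to pull a specific piece of information out of several Excel files, such as the value of one particular cell.

import pandas as pd
import os
# Folder that holds the Excel files
folder_path = 'path/to/your/excel/files'
# Collect every Excel file in the folder
excel_files = [f for f in os.listdir(folder_path) if f.endswith(('.xlsx', '.xls'))]
# Collected results
results = []
# Extract the value from each Excel file
for file in excel_files:
    file_path = os.path.join(folder_path, file)
    df = pd.read_excel(file_path)
    # Take the first column of the first data row
    # (read_excel uses the first sheet row as the header, so this is sheet row 2)
    specific_value = df.iloc[0, 0]
    # Store the file name together with the value
    results.append((file, specific_value))
# Print the results
for file, value in results:
    print(f"File: {file}, Specific Value: {value}")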
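By default `pd.read_excel` treats the first row of the sheet as column headers, so `df.iloc[0, 0]` returns the first data value and not the content of cell A1. The sketch below illustrates the difference with `header=None`, using a small demo workbook created in a temporary folder. The file name and column names are made up for the demonstration.

```python
import os
import tempfile
import pandas as pd

with tempfile.TemporaryDirectory() as demo_dir:
    demo_path = os.path.join(demo_dir, 'demo.xlsx')
    # Demo workbook: row 1 holds the headers, rows 2-3 hold the data
    pd.DataFrame({'Name': ['Alice', 'Bob'], 'Score': [90, 85]}).to_excel(
        demo_path, index=False
    )

    # Default: the first sheet row becomes the column labels
    with_header = pd.read_excel(demo_path)
    # header=None: every sheet row is data, so position (0, 0) is cell A1
    raw = pd.read_excel(demo_path, header=None)

first_data_value = with_header.iloc[0, 0]
cell_a1 = raw.iloc[0, 0]
print(first_data_value, cell_a1)
```

If the goal is to read one cell by its Excel address, openpyxl's `sheet['A1'].value` is another option that avoids positional indexing altogether.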

4. Process multiple Excel files with openpyxl

If you need finer-grained control over an Excel file, such as modifying specific cells or applying formatting, you can use the openpyxl library.

import openpyxl
import os
# Input and output folders
folder_path = 'path/to/your/excel/files'
output_folder = 'path/to/output/folder'
# Make sure the output folder exists
os.makedirs(output_folder, exist_ok=True)
# Collect the .xlsx files only: openpyxl cannot open the legacy .xls format
excel_files = [f for f in os.listdir(folder_path) if f.endswith('.xlsx')]
# Process each Excel file
for file in excel_files:
    file_path = os.path.join(folder_path, file)
    workbook = openpyxl.load_workbook(file_path)
    # The sheet that was active when the workbook was last saved
    sheet = workbook.active
    # Modify a specific cell
    sheet['A1'] = 'New Value'
    # Save the modified workbook
    output_file_path = os.path.join(output_folder, file)
    workbook.save(output_file_path)
print("Processing complete.")
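The example above only changes a cell value. As an illustration of the formatting side, the following sketch bolds and shades a header row, widens a column, and sets a number format using `openpyxl.styles`. The workbook is built in memory for the demonstration, and the colors, widths, and cell contents are arbitrary assumptions.

```python
import os
import tempfile
import openpyxl
from openpyxl.styles import Font, PatternFill

with tempfile.TemporaryDirectory() as demo_dir:
    demo_path = os.path.join(demo_dir, 'styled.xlsx')

    # Build a small demo workbook instead of loading a real file
    workbook = openpyxl.Workbook()
    sheet = workbook.active
    sheet.append(['Name', 'Score'])
    sheet.append(['Alice', 90])

    # Bold text on a light gray background for every cell in row 1
    header_font = Font(bold=True)
    header_fill = PatternFill(fill_type='solid', start_color='DDDDDD', end_color='DDDDDD')
    for cell in sheet[1]:
        cell.font = header_font
        cell.fill = header_fill

    # Widen column A and show B2 with two decimal places
    sheet.column_dimensions['A'].width = 20
    sheet['B2'].number_format = '0.00'
    workbook.save(demo_path)

    # Reload to confirm that the formatting was written to disk
    reloaded = openpyxl.load_workbook(demo_path)
    header_is_bold = reloaded.active['A1'].font.bold
    score_format = reloaded.active['B2'].number_format

print(header_is_bold, score_format)
```

The same loop body can be dropped into the batch loop above in place of the `sheet['A1'] = 'New Value'` line.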

5. Merge multiple Excel files into separate worksheets of one workbook

Suppose you have several Excel files and want to combine them into one new workbook, with each file on its own worksheet.

import pandas as pd
import os
# Input folder and output workbook
folder_path = 'path/to/your/excel/files'
output_file = 'merged_workbook.xlsx'
# Collect every Excel file in the folder
excel_files = [f for f in os.listdir(folder_path) if f.endswith(('.xlsx', '.xls'))]
# Create a new ExcelWriter; the with block saves and closes the workbook on exit
with pd.ExcelWriter(output_file, engine='openpyxl') as writer:
    # Write the data from each file to its own worksheet
    for file in excel_files:
        file_path = os.path.join(folder_path, file)
        df = pd.read_excel(file_path)
        # Use the file name as the worksheet name (Excel limits names to 31 characters)
        sheet_name = os.path.splitext(file)[0][:31]
        # Write the data
        df.to_excel(writer, sheet_name=sheet_name, index=False)
print("Merging complete.")
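File names are not always valid worksheet names: Excel limits a sheet name to 31 characters and rejects the characters `\ / ? * [ ] :`, and two files can share the same base name. The helper below is one possible way to derive a safe, unique name. The function name `safe_sheet_name` and the `_1`, `_2` suffix scheme are assumptions made for this sketch.

```python
import re

def safe_sheet_name(name, used_names):
    """Return a sheet name that fits Excel's limits and is unique within used_names."""
    # Replace the characters Excel forbids in sheet names: \ / ? * [ ] :
    cleaned = re.sub(r'[\\/?*\[\]:]', '_', name).strip() or 'Sheet'
    candidate = cleaned[:31]
    counter = 1
    # Excel compares sheet names case-insensitively, so track lower-cased names
    while candidate.lower() in used_names:
        suffix = f'_{counter}'
        # Leave room for the suffix inside the 31-character limit
        candidate = cleaned[:31 - len(suffix)] + suffix
        counter += 1
    used_names.add(candidate.lower())
    return candidate

used = set()
first = safe_sheet_name('sales/2024:Q1', used)
second = safe_sheet_name('sales/2024:Q1', used)
long_name = safe_sheet_name('x' * 40, used)
print(first, second, len(long_name))
```

In the merge loop, `sheet_name = safe_sheet_name(os.path.splitext(file)[0], used)` would take the place of the plain file-name expression, with `used = set()` created before the loop.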

6. Batch-process multiple Excel files with data cleaning

Suppose you need to clean the data in several Excel files, for example by dropping empty rows and filling in missing values.

import pandas as pd
import os
# Input and output folders
folder_path = 'path/to/your/excel/files'
output_folder = 'path/to/output/folder'
# Make sure the output folder exists
os.makedirs(output_folder, exist_ok=True)
# Collect every Excel file in the folder
excel_files = [f for f in os.listdir(folder_path) if f.endswith(('.xlsx', '.xls'))]
# Process each Excel file
for file in excel_files:
    file_path = os.path.join(folder_path, file)
    df = pd.read_excel(file_path)
    # Drop rows in which every cell is empty
    df = df.dropna(how='all')
    # Fill the remaining missing values with 0
    df = df.fillna(0)
    # Always save as .xlsx: pandas 2.0 and later cannot write the legacy .xls format
    base_name = os.path.splitext(file)[0]
    output_file_path = os.path.join(output_folder, f'{base_name}.xlsx')
    df.to_excel(output_file_path, index=False)
print("Data cleaning complete.")
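Filling every gap with 0 also puts a 0 into text columns, which is rarely what is wanted. As a sketch of a more targeted variant, `fillna` accepts a dictionary that maps column names to fill values. The column names `name` and `qty` and the sample rows are invented for this example.

```python
import pandas as pd

# Stand-in for a DataFrame read from an Excel file
df = pd.DataFrame({
    'name': ['Alice', None, 'Bob', None],
    'qty': [1.0, None, None, 3.0],
})

# Drop rows in which every cell is empty (the second row here)
cleaned = df.dropna(how='all')
# Fill each column with a value that suits its type
cleaned = cleaned.fillna({'name': 'unknown', 'qty': 0})
# Renumber the rows after the drop
cleaned = cleaned.reset_index(drop=True)

names = cleaned['name'].tolist()
quantities = cleaned['qty'].tolist()
print(names, quantities)
```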

7. Extract specific columns from multiple Excel files and merge them

Suppose you need to extract specific columns from several Excel files and combine them into a new DataFrame.

import pandas as pd
import os
# Folder that holds the Excel files
folder_path = 'path/to/your/excel/files'
# Collect every Excel file in the folder
excel_files = [f for f in os.listdir(folder_path) if f.endswith(('.xlsx', '.xls'))]
# Read only the required columns from each file
frames = []
for file in excel_files:
    file_path = os.path.join(folder_path, file)
    # usecols limits parsing to the named columns
    df = pd.read_excel(file_path, usecols=['Column1', 'Column2'])
    frames.append(df)
# Concatenate once at the end
all_data = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()
# Preview the merged data
print(all_data.head())
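Passing a list to `usecols` raises an error when a file lacks one of the named columns. When the input files are not guaranteed to share a layout, one option is to pass a callable instead, which pandas evaluates against each column name. The sketch below assumes that a missing column should appear as an empty column in the result; the demo file and its `Other` column are made up.

```python
import os
import tempfile
import pandas as pd

wanted = {'Column1', 'Column2'}

with tempfile.TemporaryDirectory() as demo_dir:
    demo_path = os.path.join(demo_dir, 'partial.xlsx')
    # This demo file has Column1 but no Column2
    pd.DataFrame({'Column1': [1, 2], 'Other': ['x', 'y']}).to_excel(
        demo_path, index=False
    )

    # The callable keeps a column only if its name is in the wanted set
    df = pd.read_excel(demo_path, usecols=lambda name: name in wanted)
    # Add any wanted column the file lacked, filled with missing values
    df = df.reindex(columns=sorted(wanted))

columns = list(df.columns)
print(columns)
```

After this step every per-file DataFrame has the same columns in the same order, so the final `pd.concat` lines up cleanly.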

8. Batch-rename worksheets in multiple Excel files

Suppose you need to rename a worksheet in each of several Excel files.

import openpyxl
import os
# Input and output folders
folder_path = 'path/to/your/excel/files'
output_folder = 'path/to/output/folder'
# Make sure the output folder exists
os.makedirs(output_folder, exist_ok=True)
# Collect the .xlsx files only: openpyxl cannot open the legacy .xls format
excel_files = [f for f in os.listdir(folder_path) if f.endswith('.xlsx')]
# Process each Excel file
for file in excel_files:
    file_path = os.path.join(folder_path, file)
    workbook = openpyxl.load_workbook(file_path)
    # Rename the worksheet if it exists in this workbook
    if 'OldSheetName' in workbook.sheetnames:
        sheet = workbook['OldSheetName']
        sheet.title = 'NewSheetName'
    # Save the workbook to the output folder
    output_file_path = os.path.join(output_folder, file)
    workbook.save(output_file_path)
print("Sheet renaming complete.")

9. Batch-export Excel data to CSV files

Suppose you need to export the data from several Excel files to CSV files.

import pandas as pd
import os
# Input and output folders
folder_path = 'path/to/your/excel/files'
output_folder = 'path/to/output/csvs'
# Make sure the output folder exists
os.makedirs(output_folder, exist_ok=True)
# Collect every Excel file in the folder
excel_files = [f for f in os.listdir(folder_path) if f.endswith(('.xlsx', '.xls'))]
# Process each Excel file
for file in excel_files:
    file_path = os.path.join(folder_path, file)
    # Reads the first worksheet only
    df = pd.read_excel(file_path)
    # Build the output path from the file name without its extension
    base_name = os.path.splitext(file)[0]
    output_file_path = os.path.join(output_folder, f'{base_name}.csv')
    # Export to CSV
    df.to_csv(output_file_path, index=False)
print("Export to CSV complete.")
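The export above covers only the first worksheet of each file, because that is what `pd.read_excel` reads by default. For workbooks with several sheets, passing `sheet_name=None` returns a dictionary that maps each sheet name to a DataFrame. The sketch below writes one CSV per sheet; the `<workbook>_<sheet>.csv` naming scheme and the demo workbook are assumptions.

```python
import os
import tempfile
import pandas as pd

with tempfile.TemporaryDirectory() as demo_dir:
    demo_path = os.path.join(demo_dir, 'report.xlsx')
    # Demo workbook with two worksheets
    with pd.ExcelWriter(demo_path, engine='openpyxl') as writer:
        pd.DataFrame({'a': [1, 2]}).to_excel(writer, sheet_name='First', index=False)
        pd.DataFrame({'b': [3]}).to_excel(writer, sheet_name='Second', index=False)

    # sheet_name=None loads every worksheet into a {name: DataFrame} dict
    sheets = pd.read_excel(demo_path, sheet_name=None)

    csv_names = []
    for sheet_name, df in sheets.items():
        csv_name = f'report_{sheet_name}.csv'
        # utf-8-sig adds a BOM so that Excel detects the encoding when opening the CSV
        df.to_csv(os.path.join(demo_dir, csv_name), index=False, encoding='utf-8-sig')
        csv_names.append(csv_name)
    csv_names.sort()
    written = sorted(n for n in os.listdir(demo_dir) if n.endswith('.csv'))

print(csv_names)
```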

10. Batch-process multiple Excel files with data analysis

Suppose you need to analyze the data in several Excel files, for example by computing a sum and an average.

import pandas as pd
import os
# Folder that holds the Excel files
folder_path = 'path/to/your/excel/files'
# Collect every Excel file in the folder
excel_files = [f for f in os.listdir(folder_path) if f.endswith(('.xlsx', '.xls'))]
# Read each file into its own DataFrame
frames = []
for file in excel_files:
    file_path = os.path.join(folder_path, file)
    frames.append(pd.read_excel(file_path))
# Concatenate once at the end
all_data = pd.concat(frames, ignore_index=True)
# Compute the sum and the mean of one column across all files
total_sum = all_data['Some_Column'].sum()
average_value = all_data['Some_Column'].mean()
# Print the results
print(f"Total Sum: {total_sum}")
print(f"Average Value: {average_value}")
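The totals above span all files at once. To compare files with each other, one approach is to tag each row with the file it came from and then group by that tag. The sketch below uses in-memory DataFrames as stand-ins for the files; the `source_file` column name and the sample numbers are made up.

```python
import pandas as pd

# Stand-ins for the DataFrames that pd.read_excel would return per file
frames = {
    'january.xlsx': pd.DataFrame({'Some_Column': [100, 200]}),
    'february.xlsx': pd.DataFrame({'Some_Column': [50, 150, 250]}),
}

# Tag every row with its file name before merging
tagged = [df.assign(source_file=name) for name, df in frames.items()]
all_data = pd.concat(tagged, ignore_index=True)

# One row of statistics per source file
per_file = all_data.groupby('source_file')['Some_Column'].agg(['sum', 'mean', 'count'])
print(per_file)
```

In the real loop, `pd.read_excel(file_path).assign(source_file=file)` would produce the tagged frames directly.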
