
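Building an XLSX Analyzer and Web Page Opener with wxPython and Pandas

Updated: 2024-10-21 10:41:07  Author: winfredzhang

This article walks through how to build an XLSX analyzer and web page opener with wxPython and Pandas. The example code is explained in detail for anyone who wants to follow along.

In this article we analyze a Python application written with the wxPython and Pandas libraries, named "XLSX Analyzer and Web Opener". Its core function is to read data from an Excel file and display it in a grid. It also lets the user open the list of URLs in that Excel file in batches with Google Chrome.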

C:\pythoncode\new\analysisxlsx.py

Full code

import wx
import wx.grid
import pandas as pd
import subprocess
import os

CHROME_PATH = r"C:\Program Files\Google\Chrome\Application\chrome.exe"

class XlsxAnalyzerFrame(wx.Frame):
    def __init__(self):
        super().__init__(parent=None, title='XLSX Analyzer and Web Opener', size=(1200, 800))
        panel = wx.Panel(self)

        main_sizer = wx.BoxSizer(wx.VERTICAL)

        self.file_picker = wx.FilePickerCtrl(panel, wildcard="Excel files (*.xlsx)|*.xlsx")
        self.file_picker.Bind(wx.EVT_FILEPICKER_CHANGED, self.on_file_selected)
        main_sizer.Add(self.file_picker, 0, wx.ALL | wx.EXPAND, 10)

        self.grid = wx.grid.Grid(panel)
        main_sizer.Add(self.grid, 1, wx.ALL | wx.EXPAND, 10)

        open_button = wx.Button(panel, label='Open URLs in Chrome')
        open_button.Bind(wx.EVT_BUTTON, self.on_open_urls)
        main_sizer.Add(open_button, 0, wx.ALL | wx.CENTER, 10)

        panel.SetSizer(main_sizer)
        self.Layout()
        self.Show()

        self.grid_created = False

    def on_file_selected(self, event):
        file_path = self.file_picker.GetPath()
        if file_path:
            try:
                df = pd.read_excel(file_path, sheet_name='sheet1')
                expected_columns = [
                    "blog-list-box href", "course-img src", "blog-list-box-top", 
                    "blog-list-content", "article-type", "view-time-box", "view-num", 
                    "give-like-num", "comment-num", "comment-num 2", "btn-edit-article href"
                ]
                if not all(col in df.columns for col in expected_columns):
                    raise ValueError("Excel file does not contain all expected columns")
                self.update_grid(df)
            except Exception as e:
                wx.MessageBox(f'Error reading file: {str(e)}', 'Error', wx.OK | wx.ICON_ERROR)

    def update_grid(self, df):
        if not self.grid_created:
            self.grid.CreateGrid(df.shape[0], df.shape[1])
            self.grid_created = True
        else:
            current_rows = self.grid.GetNumberRows()
            current_cols = self.grid.GetNumberCols()
            
            if current_rows < df.shape[0]:
                self.grid.AppendRows(df.shape[0] - current_rows)
            elif current_rows > df.shape[0]:
                self.grid.DeleteRows(0, current_rows - df.shape[0])
            
            if current_cols < df.shape[1]:
                self.grid.AppendCols(df.shape[1] - current_cols)
            elif current_cols > df.shape[1]:
                self.grid.DeleteCols(0, current_cols - df.shape[1])

        for i, col in enumerate(df.columns):
            self.grid.SetColLabelValue(i, str(col))
            for j, val in enumerate(df[col]):
                self.grid.SetCellValue(j, i, str(val))

        self.grid.AutoSizeColumns()
        self.grid.ForceRefresh()
        self.Layout()

    def get_urls(self):
        if self.grid.GetNumberRows() == 0:
            wx.MessageBox('No data loaded', 'Error', wx.OK | wx.ICON_ERROR)
            return []

        try:
            url_col_index = next(i for i in range(self.grid.GetNumberCols()) if "blog-list-box href" in self.grid.GetColLabelValue(i))
            return [self.grid.GetCellValue(row, url_col_index) for row in range(self.grid.GetNumberRows()) if self.grid.GetCellValue(row, url_col_index).strip()]
        except StopIteration:
            wx.MessageBox('Could not find "blog-list-box href" column', 'Error', wx.OK | wx.ICON_ERROR)
            return []

    def on_open_urls(self, event):
        if not os.path.exists(CHROME_PATH):
            wx.MessageBox(f'Chrome executable not found at {CHROME_PATH}', 'Error', wx.OK | wx.ICON_ERROR)
            return

        urls = self.get_urls()
        if not urls:
            return

        for i in range(0, len(urls), 10):
            batch = urls[i:i+10]
            for url in batch:
                try:
                    subprocess.Popen([CHROME_PATH, url])
                except Exception as e:
                    wx.MessageBox(f'Error opening URL {url}: {str(e)}', 'Error', wx.OK | wx.ICON_ERROR)
            
            if i + 10 < len(urls):
                should_continue = wx.MessageBox('Open next 10 URLs?', 'Continue',
                                                wx.YES_NO | wx.ICON_QUESTION)
                if should_continue == wx.NO:
                    break

if __name__ == '__main__':
    app = wx.App()
    frame = XlsxAnalyzerFrame()
    app.MainLoop()

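Core features

1. Select and parse an XLSX file: the user picks an Excel file with the file picker, and the program reads its data and displays it in the grid.

2. Open URLs in batches: if the Excel file contains a URL column, the user can click a button and the program opens those URLs in Chrome in batches.

3. Error handling: when the file does not match the expected format, Chrome is not available, or opening a URL fails, the program shows a corresponding error message.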

Imported libraries

import wx
import wx.grid
import pandas as pd
import subprocess
import os

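wx and wx.grid: used to build the graphical user interface (GUI), including the window, file picker, button, and data grid.

pandas (pd): used to read data from the Excel file and hand it over for display in the GUI grid.

subprocess: used to launch the Chrome browser as a separate process.

os: used to check whether the Chrome executable exists at the configured path.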

Google Chrome path

CHROME_PATH = r"C:\Program Files\Google\Chrome\Application\chrome.exe"

This constant stores the path to the Chrome executable, which the program uses to launch Chrome. If Chrome is installed at a different path on your system, change this value.
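A single hard-coded path only works on machines where Chrome is installed in that exact location. As a minimal sketch, assuming a couple of common install locations, the lookup could try several candidates and then fall back to the system PATH. The find_chrome helper, the second candidate path, and the executable names are assumptions made for illustration; they are not part of the original program.

```python
import os
import shutil

# Candidate locations to try in order. Only the first entry comes from the
# article; the second is an assumed alternative install location.
CANDIDATE_PATHS = [
    r"C:\Program Files\Google\Chrome\Application\chrome.exe",
    r"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe",
]


def find_chrome(candidates=CANDIDATE_PATHS, names=("chrome", "google-chrome")):
    """Return the first usable Chrome path, or None if nothing is found."""
    # 1. Try the explicit candidate paths.
    for path in candidates:
        if os.path.isfile(path):
            return path
    # 2. Fall back to searching PATH for an executable with a known name
    #    (the names are assumptions and vary by platform).
    for name in names:
        found = shutil.which(name)
        if found:
            return found
    return None
```

With a helper like this, the handler could call find_chrome() once and show the "not found" message when it returns None, instead of checking a single constant.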

The XlsxAnalyzerFrame class

The main frame class XlsxAnalyzerFrame inherits from wx.Frame and implements the application's GUI and logic. Its initializer is shown below:

class XlsxAnalyzerFrame(wx.Frame):
    def __init__(self):
        super().__init__(parent=None, title='XLSX Analyzer and Web Opener', size=(1200, 800))
        panel = wx.Panel(self)

        main_sizer = wx.BoxSizer(wx.VERTICAL)

        self.file_picker = wx.FilePickerCtrl(panel, wildcard="Excel files (*.xlsx)|*.xlsx")
        self.file_picker.Bind(wx.EVT_FILEPICKER_CHANGED, self.on_file_selected)
        main_sizer.Add(self.file_picker, 0, wx.ALL | wx.EXPAND, 10)

        self.grid = wx.grid.Grid(panel)
        main_sizer.Add(self.grid, 1, wx.ALL | wx.EXPAND, 10)

        open_button = wx.Button(panel, label='Open URLs in Chrome')
        open_button.Bind(wx.EVT_BUTTON, self.on_open_urls)
        main_sizer.Add(open_button, 0, wx.ALL | wx.CENTER, 10)

        panel.SetSizer(main_sizer)
        self.Layout()
        self.Show()

        self.grid_created = False

Interface elements:

File picker (self.file_picker): lets the user choose an Excel file and is bound to the on_file_selected event handler. When the user selects a file, that handler parses and loads the data.

Data grid (self.grid): the table that displays the data from the Excel file. wx.grid.Grid is the grid control provided by wxPython for showing spreadsheet-like tables.

Open URLs button (open_button): opens the URLs from the Excel file in batches. When the user clicks it, the on_open_urls event handler collects and opens those URLs.

Processing the Excel file

Reading and loading the Excel data

When the user selects an Excel file, the on_file_selected handler is triggered:

def on_file_selected(self, event):
    file_path = self.file_picker.GetPath()
    if file_path:
        try:
            df = pd.read_excel(file_path, sheet_name='sheet1')
            expected_columns = [
                "blog-list-box href", "course-img src", "blog-list-box-top", 
                "blog-list-content", "article-type", "view-time-box", "view-num", 
                "give-like-num", "comment-num", "comment-num 2", "btn-edit-article href"
            ]
            if not all(col in df.columns for col in expected_columns):
                raise ValueError("Excel file does not contain all expected columns")
            self.update_grid(df)
        except Exception as e:
            wx.MessageBox(f'Error reading file: {str(e)}', 'Error', wx.OK | wx.ICON_ERROR)

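file_path = self.file_picker.GetPath(): gets the path of the file the user selected.
pd.read_excel(): uses Pandas to read the data from the Excel file. The program assumes the data is in a worksheet named 'sheet1'.
expected_columns: lists the expected column names. If the Excel file does not contain all of them, the program raises an exception and shows an error message.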
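The check above only reports that some column is missing, not which one. A minimal sketch of a more specific check follows, assuming it is acceptable to change the error text. The helper names missing_columns and check_columns are hypothetical; the column list is copied from the article.

```python
# Expected column names, copied from the article's on_file_selected handler.
EXPECTED_COLUMNS = [
    "blog-list-box href", "course-img src", "blog-list-box-top",
    "blog-list-content", "article-type", "view-time-box", "view-num",
    "give-like-num", "comment-num", "comment-num 2", "btn-edit-article href",
]


def missing_columns(actual, expected=EXPECTED_COLUMNS):
    """Return the expected names that are absent from `actual`, in order."""
    present = {str(name) for name in actual}
    return [name for name in expected if name not in present]


def check_columns(actual, expected=EXPECTED_COLUMNS):
    """Raise ValueError naming every missing column; return None otherwise."""
    missing = missing_columns(actual, expected)
    if missing:
        raise ValueError("Excel file is missing columns: " + ", ".join(missing))
```

Inside the handler this could be called as check_columns(df.columns), since the function accepts any iterable of column names. The existing except block would then display the names of the missing columns.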

Updating the data grid

Once the data has loaded successfully, the update_grid function writes it into the grid:

def update_grid(self, df):
    if not self.grid_created:
        self.grid.CreateGrid(df.shape[0], df.shape[1])
        self.grid_created = True
    else:
        current_rows = self.grid.GetNumberRows()
        current_cols = self.grid.GetNumberCols()
        
        if current_rows < df.shape[0]:
            self.grid.AppendRows(df.shape[0] - current_rows)
        elif current_rows > df.shape[0]:
            self.grid.DeleteRows(0, current_rows - df.shape[0])
        
        if current_cols < df.shape[1]:
            self.grid.AppendCols(df.shape[1] - current_cols)
        elif current_cols > df.shape[1]:
            self.grid.DeleteCols(0, current_cols - df.shape[1])

    for i, col in enumerate(df.columns):
        self.grid.SetColLabelValue(i, str(col))
        for j, val in enumerate(df[col]):
            self.grid.SetCellValue(j, i, str(val))

    self.grid.AutoSizeColumns()
    self.grid.ForceRefresh()
    self.Layout()

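This function resizes the grid to match the number of rows and columns in the Excel data, sets each column label, and then fills in the cells column by column, converting every value to a string with str().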
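One consequence of calling str() on every value: Pandas represents empty Excel cells as NaN, and str() turns NaN into the text "nan", which then appears in the grid and would also pass the non-empty check in get_urls. A minimal sketch of a conversion helper that leaves such cells blank is shown below. The name cell_text is hypothetical and is not part of the original program.

```python
import math


def cell_text(value):
    """Convert a DataFrame value to the text shown in a grid cell.

    Missing values (None or float NaN) become an empty string instead of
    the literal text "None" or "nan".
    """
    if value is None:
        return ""
    if isinstance(value, float) and math.isnan(value):
        return ""
    return str(value)
```

In update_grid, the call self.grid.SetCellValue(j, i, str(val)) could then use cell_text(val) in place of str(val).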

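Opening URLs in batches

The program reads the URLs from the column named "blog-list-box href". When the user clicks the button, it opens them batch by batch: 10 URLs at a time, asking the user after each batch whether to continue: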

def on_open_urls(self, event):
    if not os.path.exists(CHROME_PATH):
        wx.MessageBox(f'Chrome executable not found at {CHROME_PATH}', 'Error', wx.OK | wx.ICON_ERROR)
        return

    urls = self.get_urls()
    if not urls:
        return

    for i in range(0, len(urls), 10):
        batch = urls[i:i+10]
        for url in batch:
            try:
                subprocess.Popen([CHROME_PATH, url])
            except Exception as e:
                wx.MessageBox(f'Error opening URL {url}: {str(e)}', 'Error', wx.OK | wx.ICON_ERROR)
        
        if i + 10 < len(urls):
            should_continue = wx.MessageBox('Open next 10 URLs?', 'Continue',
                                            wx.YES_NO | wx.ICON_QUESTION)
            if should_continue == wx.NO:
                break

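Key steps:

Check the Chrome path: first check whether the Chrome executable exists at the configured path.

Get the URL list: call the get_urls function to extract the list of URLs from the grid.

Open the URLs in batches: subprocess.Popen launches Chrome once for each URL. After each batch of 10 URLs, the program asks the user whether to open the next 10.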
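The batching step can be separated from the GUI code so that it is easy to test on its own. The following is a minimal sketch under two assumptions: only http and https links should be opened, and the batch size stays at 10 as in the article. The helpers is_http_url and batches are hypothetical names, not part of the original program.

```python
from urllib.parse import urlparse


def is_http_url(text):
    """Return True if `text` looks like an absolute http(s) URL."""
    try:
        parts = urlparse(text.strip())
    except ValueError:
        # urlparse rejects some malformed inputs, e.g. a broken IPv6 host.
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def batches(items, size=10):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

In on_open_urls, the URL list could first be filtered with [u for u in urls if is_http_url(u)] and then iterated with for batch in batches(filtered), which keeps values such as empty strings or "nan" from being passed to Chrome.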


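Summary

This program loads data from an Excel file for inspection and can open the URLs it contains in batches. It combines wxPython for the GUI, Pandas for reading the Excel data, and subprocess for launching an external program. It also includes basic error handling and user prompts, which makes it a reasonable fit for tasks that need to extract and act on URLs stored in tabular data.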
