Reading .dat Files in Python

Updated: 2025-04-01 16:23:25  Author: 简朴-ocean

This article shows how to read a .dat file in Python with pandas, working through the example code step by step.

Preface

MATLAB can export data as .dat or .mat files. An earlier post covered reading .mat files; this one records how to read .dat files.

Reading method

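The pandas library can read the file into Python as a DataFrame. The data looks like the sample below (the leading number on each row appears to be the DataFrame row index rather than part of the file). The whitespace-separated columns are: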

longitude, latitude, year, month, day, hour, minute, second, variable value

0	88.486	10.181	2023.0	3.0	20.0	0.0	15.0	0.0	3329.973
1	88.486	10.181	2023.0	3.0	20.0	0.0	30.0	0.0	3330.019
2	88.486	10.181	2023.0	3.0	20.0	0.0	45.0	0.0	3330.043
3	88.486	10.181	2023.0	3.0	20.0	1.0	 0.0	0.0	3330.077

The raw .dat file carries no description of its columns, so to make later processing easier the column names (longitude, latitude, and so on) are added by hand.

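Note that passing a DataFrame directly to the pd.DataFrame constructor together with new column names does not rename the columns. pandas looks up each new name among the existing column labels, and when the labels do not match (here the originals are the integers 0 to 8) the result is filled with NaN. Converting the DataFrame to a NumPy array with to_numpy() first avoids this, because the array carries only the data and no column labels, so the new names are attached by position.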
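The pitfall can be reproduced with a tiny frame. This is a minimal sketch; the two-column frame `raw` and its values are illustrative and not taken from the original data file.

```python
import pandas as pd

# Toy frame with default integer column labels 0 and 1 (illustrative data).
raw = pd.DataFrame([[88.486, 10.181], [88.486, 10.181]])

# Passing the DataFrame itself: pandas selects columns by label, and the
# labels 'lon'/'lat' do not exist in `raw`, so every value becomes NaN.
relabelled_wrong = pd.DataFrame(raw, columns=['lon', 'lat'])

# Passing the underlying NumPy array: the labels are dropped, so the new
# names are attached to the data by position.
relabelled_ok = pd.DataFrame(raw.to_numpy(), columns=['lon', 'lat'])

print(relabelled_wrong.isna().all().all())
print(relabelled_ok)
```

Assigning `raw.columns = ['lon', 'lat']` or calling `raw.set_axis(['lon', 'lat'], axis=1)` would also relabel by position; the article uses the to_numpy() route.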

Read the data

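import pandas as pd
from datetime import datetime
import numpy as np

# Path to the .dat file exported from MATLAB
file_path = r'R:/ll/cj_YD_first_bpr_water_level.dat'

# The file has no header row; columns are separated by runs of whitespace
df = pd.read_csv(file_path, header=None, sep=r'\s+')

df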
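To try the same call without the original file, the sketch below parses two rows from an in-memory string. The rows mimic the layout shown earlier; `sample` and `demo` are illustrative names, not part of the original code.

```python
import io
import pandas as pd

# Two illustrative rows in the same whitespace-separated layout as above.
sample = io.StringIO(
    "88.486 10.181 2023 3 20 0 15 0 3329.973\n"
    "88.486 10.181 2023 3 20 0 30 0 3330.019\n"
)

# header=None: there is no header row, so columns are labelled 0..8.
# sep=r'\s+': split on any run of spaces or tabs.
demo = pd.read_csv(sample, header=None, sep=r'\s+')

print(demo.shape)
print(list(demo.columns))
```

Because the columns come back with integer labels, the renaming step that follows is what gives them meaningful names.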

Add the longitude/latitude column names

df_from_array = pd.DataFrame(df.to_numpy(), columns=['lon', 'lat', 'year', 'month', 'day', 'hour', 'min', 'sec', 'water'])

Build the timestamp as a new column, which makes the later plotting easier

df_from_array['datetime'] = df_from_array.apply(lambda row: datetime(year=int(row['year']),
                                             month=int(row['month']),
                                             day=int(row['day']),
                                             hour=int(row['hour']),
                                             minute=int(row['min']),
                                             second=int(row['sec'])),axis=1)
df_from_array
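As an alternative to the row-by-row apply, pd.to_datetime can assemble timestamps from a frame of date parts in one call. The sketch below is an assumption about how that could look for these column names, not the author's method; the `parts` frame is illustrative. The cast to int and the renaming of 'min'/'sec' to the full unit names are precautions to stay on the documented path.

```python
import pandas as pd

# Illustrative rows with the same column names used in this article.
parts = pd.DataFrame({
    'year': [2023.0, 2023.0], 'month': [3.0, 3.0], 'day': [20.0, 20.0],
    'hour': [0.0, 1.0], 'min': [45.0, 0.0], 'sec': [0.0, 0.0],
})

# Cast the float parts to integers and use the full unit names, then let
# pandas combine the columns into a datetime64 Series.
stamp = pd.to_datetime(
    parts.astype(int).rename(columns={'min': 'minute', 'sec': 'second'})
)
print(stamp)
```

On a large file this avoids calling a Python function once per row.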

A specific preprocessing step follows. The values needed at each time must come from the same longitude/latitude position, so the records are grouped by coordinate and timestamp.

grouped = df_from_array.groupby(['lon', 'lat','datetime'])['water'].apply(list).reset_index()

grouped

Some stations turn out to have missing coordinates, so the records with missing longitude/latitude are dropped

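# Drop rows whose coordinates hold the -9999 fill value; copy() so the
# assignment below modifies an independent frame rather than a slice
grouped = grouped[(grouped['lon'] != -9999.0000) & (grouped['lat'] != -9999.0000)].copy()
# Each group holds a list of values; keep the first one
grouped['water'] = grouped['water'].apply(lambda x: x[0])
grouped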
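Another way to handle the fill value is to convert it to NaN first and then drop the affected rows. This is a sketch under the assumption that -9999 only ever marks a missing coordinate; the `records` frame and its values are illustrative.

```python
import numpy as np
import pandas as pd

# Illustrative records; the second one carries the -9999 fill value.
records = pd.DataFrame({
    'lon':   [88.486, -9999.0, 90.120],
    'lat':   [10.181, -9999.0, 12.500],
    'water': [3329.973, 3330.019, 3101.250],
})

# Turn the fill value into NaN in the coordinate columns only,
# then keep the rows where both coordinates are present.
coords = records[['lon', 'lat']].replace(-9999.0, np.nan)
valid = records[coords.notna().all(axis=1)]
print(valid)
```

Marking the gaps as NaN keeps the fill value from being mistaken for a real coordinate in later steps.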

Plotting

For each station, the values at different times are drawn as one curve. Because the magnitudes can differ considerably between stations, each station gets its own y-axis.

import matplotlib.pyplot as plt

# Set global font options before the figure is created
plt.rcParams['axes.unicode_minus'] = False
plt.rcParams['font.sans-serif'] = ['Times New Roman']
plt.rcParams['font.size'] = 16

# unique_coords is not defined in the original listing; it is assumed
# here to be the distinct (lon, lat) pairs left after filtering
unique_coords = grouped[['lon', 'lat']].drop_duplicates()

fig, ax1 = plt.subplots(figsize=(15, 10), dpi=200)
axes = [ax1]
colors = plt.cm.tab10.colors
lines = []
labels = []
for i, (_, coord) in enumerate(unique_coords.iterrows()):
    lon = coord['lon']
    lat = coord['lat']
    filtered_data = grouped[(grouped['lon'] == lon) & (grouped['lat'] == lat)]

    if i == 0:
        ax = ax1
    else:
        # Extra stations get their own y-axis, each shifted further right
        ax = ax1.twinx()
        axes.append(ax)
        ax.spines['right'].set_position(('outward', 80 * (i - 1)))

    color = colors[i % len(colors)]
    line, = ax.plot(filtered_data['datetime'], filtered_data['water'], color=color,
                    linewidth=0.9, linestyle='-', label=f'Lon: {lon}, Lat: {lat}')
    ax.set_ylabel(f' (Lon: {lon}, Lat: {lat})')
    ax.yaxis.label.set_color(color)
    ax.tick_params(axis='y', colors=color)

    lines.append(line)
    labels.append(f'Lon: {lon}, Lat: {lat}')
# ncol (rather than ncols) also works on older Matplotlib releases
ax1.legend(lines, labels, loc='best', ncol=2, bbox_to_anchor=(0.9, 1))
plt.xticks(rotation=55)
plt.grid()
fig.suptitle('Data Over Time for Different station', y=0.95)
plt.tight_layout()
plt.show()

Summary

This post reviewed the pandas functions for reading a .dat file, a few basic pandas operations, and how to draw a plot with multiple y-axes. The data and code are available on GitHub:

https://github.com/Blissful-Jasper/jianpu_recor
