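Drawing Attractive Analysis Charts in Python with matplotlib
Preface
A good analyst should know a few tricks for making charts look polished, so the results are something you can be proud to show, haha. Today's tip walks through how to draw the most common chart types.
Loading the datasets
First, load the datasets. We use the same two as before: Salary_Ranges_by_Job_Classification and GlobalLandTemperaturesByCity. (To get the datasets, reply "plot" to the author's account.)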
# Import commonly used packages
import pandas as pd
import numpy as np
import seaborn as sns
%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib as mpl
# Initial style; it is overridden by the ggplot style selected a few lines below
plt.style.use('fivethirtyeight')
# Font helper for fixing Chinese text display on macOS
from matplotlib.font_manager import FontProperties
# List the styles available on this machine
print(plt.style.available)
# Pick one of the available styles; ggplot looks good, so I chose it
mpl.style.use(['ggplot'])
# Example output of the print above (the list depends on the matplotlib version):
# ['_classic_test', 'bmh', 'classic', 'dark_background', 'fast', 'fivethirtyeight', 'ggplot', 'grayscale', 'seaborn-bright', 'seaborn-colorblind', 'seaborn-dark-palette', 'seaborn-dark', 'seaborn-darkgrid', 'seaborn-deep', 'seaborn-muted', 'seaborn-notebook', 'seaborn-paper', 'seaborn-pastel', 'seaborn-poster', 'seaborn-talk', 'seaborn-ticks', 'seaborn-white', 'seaborn-whitegrid', 'seaborn', 'Solarize_Light2']
# Load the datasets
# Dataset 1: Salary_Ranges_by_Job_Classification
salary_ranges = pd.read_csv('./data/Salary_Ranges_by_Job_Classification.csv')
# Dataset 2: GlobalLandTemperaturesByCity
climate = pd.read_csv('./data/GlobalLandTemperaturesByCity.csv')
# Drop rows with missing values
climate.dropna(axis=0, inplace=True)
# Convert dt to a datetime and extract the year (note the use of map)
climate['dt'] = pd.to_datetime(climate['dt'])
climate['year'] = climate['dt'].map(lambda value: value.year)
# Keep only China; .copy() avoids a SettingWithCopyWarning when adding a column
climate_sub_china = climate.loc[climate['Country'] == 'China'].copy()
climate_sub_china['Century'] = climate_sub_china['year'].map(lambda x: int(x/100 + 1))
climate.head()
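The date handling above can be tried without the CSV files. The following is a minimal sketch on a hypothetical four-row frame (`climate_demo` and its values are made up for illustration); it uses the vectorized `.dt.year` accessor in place of `map`, and integer division for the same century formula.

```python
import pandas as pd

# Hypothetical miniature stand-in for GlobalLandTemperaturesByCity.csv
climate_demo = pd.DataFrame({
    "dt": ["1899-12-01", "1900-01-01", "2010-06-01", "2013-08-01"],
    "AverageTemperature": [3.1, None, 24.5, 29.0],
    "City": ["Shanghai"] * 4,
    "Country": ["China"] * 4,
})

# Drop rows with missing values, then parse the date column
climate_demo = climate_demo.dropna(axis=0)
climate_demo["dt"] = pd.to_datetime(climate_demo["dt"])

# .dt.year is a vectorized equivalent of map(lambda value: value.year)
climate_demo["year"] = climate_demo["dt"].dt.year

# Same result as int(year / 100 + 1) for positive years
climate_demo["Century"] = climate_demo["year"] // 100 + 1

print(climate_demo[["dt", "year", "Century"]])
```

For a frame as large as the real temperature dataset, the accessor form is usually faster than a Python-level `map`, though either gives the same column.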

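Line chart
A line chart is one of the simpler charts, and there is not much to optimize: a color that is easy on the eye is enough. Below is a color table found online that you can pick from.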
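If no color table is at hand, matplotlib can list its own named colors. This is a small sketch, assuming the names used in this post ('lime', 'steelblue', 'darkred') are the CSS4 names that matplotlib ships in `matplotlib.colors.CSS4_COLORS`.

```python
import matplotlib.colors as mcolors

# All CSS4 color names that matplotlib accepts as strings
named = sorted(mcolors.CSS4_COLORS)
print(len(named), "named colors, for example:", named[:5])

# Look up the hex value behind a few candidate names
for name in ["lime", "steelblue", "darkred"]:
    print(name, mcolors.to_hex(name))
```

Any name printed here can be passed as `color=` to the plotting calls in the sections that follow.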

# Select a subset of Shanghai's temperature data
df1 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
.loc[:,['dt','AverageTemperature']]\
.set_index('dt')
df1.head()
# Line chart (use `color`; newer pandas versions reject the old `colors` keyword)
df1.plot(color=['lime'])
plt.title('AverageTemperature Of ShangHai')
plt.ylabel('AverageTemperature')
plt.xlabel('Years')
plt.show()
The above is a single-line chart. Multi-line charts work too: just add more columns.
# Prepare one column per city
df1 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
.loc[:,['dt','AverageTemperature']]\
.rename(columns={'AverageTemperature':'SH'})
df2 = climate.loc[(climate['Country']=='China')&(climate['City']=='Tianjin')&(climate['dt']>='2010-01-01')]\
.loc[:,['dt','AverageTemperature']]\
.rename(columns={'AverageTemperature':'TJ'})
df3 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shenyang')&(climate['dt']>='2010-01-01')]\
.loc[:,['dt','AverageTemperature']]\
.rename(columns={'AverageTemperature':'SY'})
# Merge on the date column
df123 = df1.merge(df2, how='inner', on=['dt'])\
.merge(df3, how='inner', on=['dt'])\
.set_index(['dt'])
df123.head()
# Multi-line chart
df123.plot()
plt.title('AverageTemperature Of 3 Cities')
plt.ylabel('AverageTemperature')
plt.xlabel('Years')
plt.show()
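The three filter-and-merge steps above build a table with one column per city. As an alternative sketch (not the approach used in this post), the same shape can be produced with a single `pivot`. The frame `long_df` and its values are hypothetical, and `pivot` raises an error if a (date, city) pair appears more than once, so real data may need de-duplicating first.

```python
import pandas as pd

# Hypothetical sample in the same long format as the climate table
long_df = pd.DataFrame({
    "dt": pd.to_datetime(["2010-01-01", "2010-02-01"] * 3),
    "City": ["Shanghai"] * 2 + ["Tianjin"] * 2 + ["Shenyang"] * 2,
    "AverageTemperature": [4.2, 6.5, -3.5, 0.1, -12.0, -7.3],
})

# City name -> short column name, as in the merge-based version
cities = {"Shanghai": "SH", "Tianjin": "TJ", "Shenyang": "SY"}

wide = (
    long_df[long_df["City"].isin(list(cities))]
    .pivot(index="dt", columns="City", values="AverageTemperature")
    .rename(columns=cities)[["SH", "TJ", "SY"]]
    .dropna()  # keep only dates present for every city, like an inner merge
)
print(wide)
```

Calling `wide.plot()` on the result draws one line per column, the same as `df123.plot()`.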
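Pie chart
Next is the pie chart, where there is more to tune, for example how far the wedges are pulled apart. Let's first draw a "basic" pie chart.
# Sum the numeric columns per SetID (numeric_only avoids errors on text columns in newer pandas)
df1 = salary_ranges.groupby('SetID').sum(numeric_only=True)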

# "Basic" pie chart
df1['Step'].plot(kind='pie', figsize=(7,7),
autopct='%1.1f%%',
shadow=True)
plt.axis('equal')
plt.show()
# "Deluxe" pie chart
colors = ['lightgreen', 'lightblue'] # wedge colors; other options: ['lightgreen', 'lightblue', 'pink', 'purple', 'grey', 'gold']
explode=[0, 0.2] # how far each wedge is pulled out; larger means more separated
df1['Step'].plot(kind='pie', figsize=(7, 7),
autopct = '%1.1f%%', startangle=90,
shadow=True, labels=None, pctdistance=1.12, colors=colors, explode = explode)
plt.axis('equal')
plt.legend(labels=df1.index, loc='upper right', fontsize=14)
plt.show()
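The hard-coded `explode=[0, 0.2]` only works when there are exactly two wedges, because matplotlib expects one entry per wedge. Here is a hedged sketch that builds the list from the data instead; the series `totals` and its three groups are hypothetical stand-ins for `df1['Step']`, and the Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical group totals standing in for df1['Step']
totals = pd.Series([120, 45, 30], index=["A", "B", "C"], name="Step")

# One explode entry per wedge; pull out only the smallest wedge
smallest = totals.idxmin()
explode = [0.2 if label == smallest else 0 for label in totals.index]

fig, ax = plt.subplots(figsize=(7, 7))
wedges, texts, autotexts = ax.pie(
    totals, explode=explode, autopct="%1.1f%%",
    startangle=90, shadow=True, pctdistance=1.12,
)
ax.axis("equal")  # keep the pie circular
ax.legend(wedges, totals.index, loc="upper right", fontsize=14)
```

Choosing the smallest wedge is just one option; any rule that yields a list as long as the data will do.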
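Scatter plot
A scatter plot leaves less to tune. The ggplot2-style colors look quite good; as the saying goes, pick a good style and you save a lot of work!
# Select a subset of the Shanghai and Shenyang temperature data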
df1 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
.loc[:,['dt','AverageTemperature']]\
.rename(columns={'AverageTemperature':'SH'})
df2 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shenyang')&(climate['dt']>='2010-01-01')]\
.loc[:,['dt','AverageTemperature']]\
.rename(columns={'AverageTemperature':'SY'})
# Merge on the date column
df12 = df1.merge(df2, how='inner', on=['dt'])
df12.head()
# Scatter plot
df12.plot(kind='scatter', x='SH', y='SY', figsize=(10, 6), color='darkred')
plt.title('Average Temperature Between ShangHai - ShenYang')
plt.xlabel('ShangHai')
plt.ylabel('ShenYang')
plt.show()
Area chart
# Prepare one column per city
df1 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
.loc[:,['dt','AverageTemperature']]\
.rename(columns={'AverageTemperature':'SH'})
df2 = climate.loc[(climate['Country']=='China')&(climate['City']=='Tianjin')&(climate['dt']>='2010-01-01')]\
.loc[:,['dt','AverageTemperature']]\
.rename(columns={'AverageTemperature':'TJ'})
df3 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shenyang')&(climate['dt']>='2010-01-01')]\
.loc[:,['dt','AverageTemperature']]\
.rename(columns={'AverageTemperature':'SY'})
# Merge on the date column
df123 = df1.merge(df2, how='inner', on=['dt'])\
.merge(df3, how='inner', on=['dt'])\
.set_index(['dt'])
df123.head()
colors = ['red', 'pink', 'blue'] # area colors; other options: ['lightgreen', 'lightblue', 'pink', 'purple', 'grey', 'gold']
# Use `color`; newer pandas versions reject the old `colors` keyword
df123.plot(kind='area', stacked=False,
figsize=(20, 10), color=colors)
plt.title('AverageTemperature Of 3 Cities')
plt.ylabel('AverageTemperature')
plt.xlabel('Years')
plt.show()
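One reason to keep `stacked=False` here: the pandas documentation notes that a stacked area plot needs each column to be all positive or all negative, and winter temperatures in northern cities fall below zero. Below is a minimal sketch with hypothetical values (`demo` is made up for illustration); `alpha` controls how transparent the overlapping areas are.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import pandas as pd

# Hypothetical quarterly temperatures; TJ and SY dip below zero in winter
demo = pd.DataFrame(
    {"SH": [4.2, 15.8, 29.3, 19.0],
     "TJ": [-3.5, 14.9, 27.1, 14.2],
     "SY": [-12.0, 9.8, 24.6, 9.1]},
    index=pd.to_datetime(["2010-01-01", "2010-04-01", "2010-07-01", "2010-10-01"]),
)

# Unstacked areas overlap, so a lower alpha keeps every city visible
ax = demo.plot(kind="area", stacked=False, alpha=0.4,
               color=["red", "pink", "blue"], figsize=(10, 5))
ax.set_title("AverageTemperature Of 3 Cities")
ax.set_ylabel("AverageTemperature")
ax.set_xlabel("Years")
```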
Histogram
# Select a subset of Shanghai's temperature data
df = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
.loc[:,['dt','AverageTemperature']]\
.set_index('dt')
df.head()
# The simplest histogram (use `color`; newer pandas versions reject `colors`)
df['AverageTemperature'].plot(kind='hist', figsize=(8,5), color='grey')
plt.title('ShangHai AverageTemperature Of 2010-2013') # add a title to the histogram
plt.ylabel('Number of months') # add y-label
plt.xlabel('AverageTemperature') # add x-label
plt.show()
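The histogram above relies on the default bin count. As a sketch of how to control the bins and inspect the counts behind the bars, the example below uses twelve hypothetical monthly values (`temps` is made up) and `numpy.histogram`, which computes the same kind of equal-width bins.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import numpy as np
import pandas as pd

# Hypothetical monthly averages for one year
temps = pd.Series(
    [4.2, 6.5, 10.1, 15.8, 20.9, 24.6, 29.3, 28.7, 24.4, 19.0, 13.2, 6.9],
    name="AverageTemperature",
)

# Counts and bin edges for 5 equal-width bins between min and max
counts, edges = np.histogram(temps, bins=5)
print(counts, edges.round(2))

# The same binning drawn as a histogram
ax = temps.plot(kind="hist", bins=5, figsize=(8, 5),
                color="grey", edgecolor="white")
ax.set_xlabel("AverageTemperature")
ax.set_ylabel("Number of months")
```

Fewer bins smooth the picture and more bins show detail; with only a few dozen months of data, a small bin count tends to read better.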
Bar chart
# Select a subset of Shanghai's temperature data
df = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
.loc[:,['dt','AverageTemperature']]\
.set_index('dt')
df.head()
df.plot(kind='bar', figsize = (10, 6))
plt.xlabel('Month')
plt.ylabel('AverageTemperature')
plt.title('AverageTemperature of Shanghai')
plt.show()
df.plot(kind='barh', figsize=(12, 16), color='steelblue')
plt.xlabel('AverageTemperature')
plt.ylabel('Month')
plt.title('AverageTemperature of Shanghai')
plt.show()
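With a datetime index, pandas bar charts label every bar with the full timestamp, which gets crowded. One possible tweak, sketched here on a hypothetical three-month frame (`monthly` is made up), is to format the index as year-month strings before plotting.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import pandas as pd

# Hypothetical stand-in for the Shanghai subset indexed by date
monthly = pd.DataFrame(
    {"AverageTemperature": [4.2, 6.5, 10.1]},
    index=pd.to_datetime(["2010-01-01", "2010-02-01", "2010-03-01"]),
)
monthly.index.name = "dt"

# Format a copy of the index as 'YYYY-MM' so the tick labels stay short
labeled = monthly.copy()
labeled.index = labeled.index.strftime("%Y-%m")

ax = labeled.plot(kind="bar", figsize=(10, 6), color="steelblue", legend=False)
ax.set_xlabel("Month")
ax.set_ylabel("AverageTemperature")
ax.set_title("AverageTemperature of Shanghai")

tick_labels = [t.get_text() for t in ax.get_xticklabels()]
print(tick_labels)
```

The same formatted frame works with `kind='barh'`, where the labels end up on the vertical axis.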

