PyTorch Visualization (Displaying Images) and Format Conversion

Updated: 2022-12-13 10:16:43 Author: 向bug低頭

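This article covers visualization (displaying images) and format conversion in PyTorch. It is offered as a reference in the hope that it helps; corrections for any errors or omissions are welcome.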

Reading RGB files

matplotlib

Note the data type of the image that is read in:

.jpg format -> uint8
.png format -> float32

import matplotlib.image as mpimg  # mpimg is used to read images
from torchvision.transforms import ToPILImage

im = mpimg.imread(r'C:\Users\Administrator\Desktop\real.jpg')  # (H, W, C) uint8 array for a .jpg

show = ToPILImage()  # converts a Tensor (or ndarray) to a PIL Image for easy viewing
show(im).show()
# The show callable can be reused later to display images

PIL

PIL provides functions for manipulating image content; the author does not recommend it for reading images.

from PIL import Image
im = Image.open(r'C:\Users\Administrator\Desktop\real.jpg')
im.show()

cv2 (recommended)

Whatever the file format, cv2.imread with its default flag returns uint8 data.

import cv2
import numpy as np
# cv2.imread() returns BGR data in the range 0~255, with layout (H, W, C)
img_BGR = cv2.imread(r'C:\Users\Administrator\Desktop\real.jpg')
rgb = cv2.cvtColor(img_BGR, cv2.COLOR_BGR2RGB)  # convert to RGB
rgb = np.transpose(rgb, [2, 0, 1])  # tensor operations expect the channel dimension first: (C, H, W)
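
The channel reordering and the axis transpose can also be checked without any image file. The following is a minimal sketch using only NumPy; the small synthetic array stands in for the result of cv2.imread and is an assumption made for illustration.

```python
import numpy as np

# Synthetic 2x3 "BGR" image in (H, W, C) layout, standing in for cv2.imread output.
bgr = np.zeros((2, 3, 3), dtype=np.uint8)
bgr[..., 0] = 255  # fill channel 0, which is blue in BGR order

rgb = bgr[..., ::-1]                # reverse the channel axis: BGR -> RGB
chw = np.transpose(rgb, (2, 0, 1))  # (H, W, C) -> (C, H, W)

print(rgb.shape, chw.shape)
```

After the reversal the blue values sit in the last channel, which is where an RGB consumer expects them.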

Reading HSI (hyperspectral image) files

scipy

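import glob
import os
import numpy as np
from scipy.io import loadmat
# opt.data_path is the dataset root, defined elsewhere in the author's script
filenames_hyper = glob.glob(os.path.join(opt.data_path, 'NTIRE2020_Train_Spectral', '*.mat'))
# glob returns a list with the paths of all .mat files in that directory
for k in range(len(filenames_hyper)):
    mat = loadmat(filenames_hyper[k])
    hyper = np.float32(np.array(mat['cube']))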

h5py (sometimes runs into problems…)

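import glob
import os
import h5py
# h5py can only open .mat files saved in MATLAB v7.3 (HDF5) format, a likely cause of the problems mentioned
filenames_hyper = glob.glob(os.path.join(opt.data_path, 'NTIRE2020_Train_Spectral', '*.mat'))
for k in range(len(filenames_hyper)):
    mat = h5py.File(filenames_hyper[k], 'r')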

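Things to check before displaying an image

Shape of the matrix:

The usual layout is [rows, columns, channels] (for example [482, 512, 3]). It may feel unnatural at first, but that is how it works, and RGB files are generally read in this layout too.
Take care when using ToPILImage; see the related article on displaying Tensor image data in PyTorch for details.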

Whether the data is float in 0-1, or int / uint8 in 0-255:

Floating-point data is assumed to lie in the range 0-1.
Integer data is assumed to lie in the range 0-255.
How to convert data types for Tensors and numpy arrays is covered below.
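
To make the two range conventions concrete, here is a small sketch of converting between them with NumPy. The helper names float_to_uint8 and uint8_to_float are hypothetical, introduced only for this illustration.

```python
import numpy as np

def float_to_uint8(img):
    """Map a float image in [0, 1] to uint8 in [0, 255] (hypothetical helper)."""
    return np.round(np.clip(img, 0.0, 1.0) * 255.0).astype(np.uint8)

def uint8_to_float(img):
    """Map a uint8 image in [0, 255] to float32 in [0, 1] (hypothetical helper)."""
    return img.astype(np.float32) / 255.0

f = np.array([[0.0, 0.25, 1.0]], dtype=np.float32)
u = float_to_uint8(f)
back = uint8_to_float(u)
print(u.dtype, back.dtype)
```

Clipping before the cast avoids the wrap-around that a plain astype(np.uint8) produces for values outside the expected range.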

Check whether the matrix you are working with is a Tensor or a numpy array.

Displaying the data type of a Tensor / numpy array

dtype

import torch
a = torch.Tensor(1,2,3)  # uninitialized tensor of shape (1, 2, 3)
print(a.dtype)
print(a.numpy().dtype)

Result:

torch.float32

float32

Converting the data type of a Tensor

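a = torch.randn(10, 20, 3)

a = a.long()  # or a.half(), a.int(), ...; each call returns a new tensor
# Tensor.long()   casts the tensor to long (int64)
# Tensor.half()   casts the tensor to half-precision float (float16)
# Tensor.int()    casts the tensor to int (int32)
# Tensor.double() casts the tensor to double (float64)
# Tensor.float()  casts the tensor to float (float32)
# Tensor.char()   casts the tensor to char (int8)
# Tensor.byte()   casts the tensor to byte (uint8)
# Tensor.short()  casts the tensor to short (int16)

# There is no method named uint8(); use byte() or a.to(torch.uint8) to get uint8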
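
One pitfall when casting image tensors is worth a short sketch: casting a float tensor in the 0-1 range directly to an integer type truncates every value to 0, so the data has to be rescaled first. The variable names below are illustrative assumptions.

```python
import torch

a = torch.rand(4, 5, 3)  # float32 values in [0, 1)

as_byte = a.byte()                          # truncates toward zero: all values become 0
scaled = (a * 255).round().to(torch.uint8)  # rescale to 0-255 before casting

print(as_byte.dtype, scaled.dtype)
```

Both results have dtype torch.uint8, but only the rescaled tensor still carries the image content.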

Converting the data type of a NumPy array

The astype() function

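import numpy as np
a = np.random.randint(0, 255, 300)
# 300 random integers in [0, 255) (0 included, 255 excluded), returned as a 1-D array
a = a.reshape(10,10,3)
a = a.astype(np.uint8)
# other targets work the same way, e.g. a.astype(np.float32) or a.astype(np.int32)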

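NumPy supports a wider variety of numeric types than Python.

The table below lists some of the scalar data types defined in NumPy.

No.	Data type	Description
1.	bool	Boolean (True or False) stored as one byte
2.	int	Default integer, equivalent to C long; usually int32 or int64
3.	int16	16-bit integer (-32768 ~ 32767)
4.	int32	32-bit integer (-2147483648 ~ 2147483647)
5.	uint8	8-bit unsigned integer (0 ~ 255)
6.	float16	Half-precision float: sign bit, 5-bit exponent, 10-bit mantissa
7.	float32	Single-precision float: sign bit, 8-bit exponent, 23-bit mantissa
8.	float64	Double-precision float: sign bit, 11-bit exponent, 52-bit mantissa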
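
The ranges and bit layouts of these types can be queried from NumPy itself rather than memorized. A brief sketch using np.iinfo and np.finfo:

```python
import numpy as np

# Integer types: smallest and largest representable values.
for t in (np.int16, np.int32, np.uint8):
    info = np.iinfo(t)
    print(t.__name__, info.min, info.max)

# Float types: total bits, exponent bits, mantissa bits.
for t in (np.float16, np.float32, np.float64):
    info = np.finfo(t)
    print(t.__name__, info.bits, info.nexp, info.nmant)
```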

Displaying images

plt (matplotlib.pyplot)

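from matplotlib import pyplot as plt
import numpy as np
import torch
a = torch.rand(10,20,3)  # float RGB values in [0, 1]; imshow clips float RGB data outside this range
plt.imshow(a) # display the image
plt.axis('off') # hide the axes
plt.show()
# Display a single channel as a heat map; the same call can show one band of an HSI .mat file
plt.imshow(a[:,:,0])
plt.imshow(a[:,:,0], cmap='Greys_r') # display a single channel in grayscale
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
plt.subplot(2,2,1) # very similar to MATLAB syntax
plt.imshow(img_BGR) # img_BGR: the array read with cv2.imread earlier
plt.axis('off')
plt.title('BGR')
# plt.imshow accepts both (CPU) tensors and numpy arrays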

ToPILImage()

from torchvision.transforms import ToPILImage
show = ToPILImage() # converts a Tensor to a PIL Image for easy viewing
import matplotlib.image as mpimg  # mpimg is used to read images
im = mpimg.imread(r'C:\Users\Administrator\Desktop\real.jpg')
show(im).show()
# **show works as expected in only two cases**
 - tensor + float in 0-1 + [3,482,512]
 - numpy + uint8 + [482,512,3]
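
When data arrives in the first form but a library expects the second, the layout and the data type both need converting. The following is a minimal sketch; the helper name chw_float_to_hwc_uint8 is hypothetical and the random tensor is a stand-in for a real image.

```python
import torch

def chw_float_to_hwc_uint8(t):
    """(C, H, W) float tensor in [0, 1] -> (H, W, C) uint8 numpy array (hypothetical helper)."""
    t = t.detach().cpu().clamp(0, 1)      # drop autograd state, keep values in range
    t = (t * 255).round().to(torch.uint8) # rescale to 0-255 and cast
    return t.permute(1, 2, 0).numpy()     # move the channel axis to the end

t = torch.rand(3, 4, 5)
arr = chw_float_to_hwc_uint8(t)
print(arr.shape, arr.dtype)
```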

Saving RGB images

Saving a figure drawn with matplotlib is equivalent to a screen capture (the result has white borders).

plt.savefig('a.png')

cv2

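# The data to save must be a numpy array
# Data is written as uint8, so float32 data in 0-1 that has not been scaled to 0-255 comes out almost entirely 0 (black)
cv2.imwrite(r'C:\Users\Administrator\Desktop\a.jpg',a)
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# Optional third argument: for JPEG it is the image quality, an integer from 0 to 100 (default 95); for PNG it is the compression level (the default depends on the OpenCV version).
cv2.imwrite('1.jpg',img, [int(cv2.IMWRITE_JPEG_QUALITY), 95])
# The original note says cv2.IMWRITE_JPEG_QUALITY is a long and must be converted to int; under Python 3 it is already an int, so int() is harmless
cv2.imwrite('1.png',img, [int(cv2.IMWRITE_PNG_COMPRESSION), 9])
# cv2.IMWRITE_PNG_COMPRESSION ranges from 0 to 9; a higher level gives a smaller file.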

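Saving an array as an image

(This approach seems to have some problems; to be revised later.)

# scipy.misc.imsave was removed in SciPy 1.2, so this only runs on older SciPy versions
from scipy import misc
misc.imsave('lena_new_sz.png', lena_new_sz)  # lena_new_sz: the array to save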

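Saving the array directly (as a numpy file rather than in an image format)

After loading, the array can be displayed with the methods shown earlier. This approach loses no image quality at all.

np.save(r'C:\Users\Administrator\Desktop\a', a) # .npy is appended to the file name automatically
img = np.load(r'C:\Users\Administrator\Desktop\a.npy') # load the array saved above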
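
The lossless round trip can be verified directly. This sketch writes to a temporary directory instead of the desktop path used above, which is an assumption made so that the example runs anywhere.

```python
import os
import tempfile
import numpy as np

a = np.random.randint(0, 256, size=(10, 10, 3)).astype(np.uint8)

with tempfile.TemporaryDirectory() as tmp:
    base = os.path.join(tmp, 'a')       # np.save appends .npy to this name
    np.save(base, a)
    restored = np.load(base + '.npy')   # fully read into memory before the directory is removed

print(np.array_equal(a, restored), restored.dtype)
```

Both the values and the dtype survive, which is not guaranteed when going through a lossy image format such as JPEG.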

torchvision.utils.save_image

Recommended

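import torch
from torchvision.utils import save_image
dir1 = 'C:/Users/Administrator/Desktop/noise.png'
a = torch.randn(3,400,500)    # note the shape of this tensor: (C, H, W)
show(a).show()  # show is the ToPILImage() instance defined earlier
save_image(a,dir1)
# The same function can also produce a sprite sheet (many small images tiled into one large image)
# image is a list of (C, H, W) tensors; 'grid.png' is a placeholder output path
save_image(torch.stack(image), 'grid.png', nrow=8, padding=2, normalize=True, value_range=(-1, 1))
# Older torchvision releases name the last argument range instead of value_range.
# Given a 4D mini-batch Tensor of shape (B x C x H x W), or a list of images, it builds a grid of size (B / nrow, nrow), with padding pixels (black bars) between the images.
# Arguments from the third onward are passed to make_grid(), which builds the grid. normalize=True rescales the pixel values; value_range=(min, max) gives the numbers used as min and max for that normalization.
# This is why torch.stack() is needed here: it combines many images into one tensor (3-D -> 4-D)
# Difference from torch.cat(): cat joins along an existing dimension (dimension 0 by default) and does not add a dimension
a = torch.randn(3,400,500)
b = torch.randn(3,400,500)
print(torch.stack((a,b)).shape) # the inputs must be passed as a tuple or list, hence the extra () / []
print(torch.cat((a,b)).shape)
'''
Result:
torch.Size([2, 3, 400, 500])
torch.Size([6, 400, 500])
'''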
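
Both functions also take a dim argument, which makes the difference easier to see: stack always inserts a new dimension at the given position, while cat lengthens an existing one. A short sketch with small tensors (the shapes are chosen only for illustration):

```python
import torch

a = torch.zeros(3, 4, 5)
b = torch.ones(3, 4, 5)

stacked0 = torch.stack((a, b))          # new dimension at position 0
stacked1 = torch.stack((a, b), dim=1)   # new dimension at position 1
cat0 = torch.cat((a, b))                # dimension 0 grows from 3 to 6
cat1 = torch.cat((a, b), dim=1)         # dimension 1 grows from 4 to 8

print(stacked0.shape, stacked1.shape, cat0.shape, cat1.shape)
```

For building a batch of images, stack along dimension 0 is the one that yields the (B, C, H, W) layout that save_image expects.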