PyTorch up/downsampling function torch.nn.functional.interpolate explained
torch.nn.functional.interpolate(input_tensor, size=None, scale_factor=8, mode='bilinear', align_corners=False)
'''
Down/up samples the input to either the given size or the given scale_factor.

The algorithm used for interpolation is determined by mode.

Currently temporal, spatial and volumetric sampling are supported, i.e.
expected inputs are 3-D, 4-D or 5-D in shape.

The input dimensions are interpreted in the form:
mini-batch x channels x [optional depth] x [optional height] x width.

The modes available for resizing are: nearest, linear (3D-only),
bilinear, bicubic (4D-only), trilinear (5D-only), area.
'''
This function upsamples or downsamples the spatial dimensions of a tensor:

input_tensor supports 3D (b, c, w) (i.e. (batch, seq_len, dim)), 4D (b, c, h, w), and 5D (b, c, f, h, w) tensor shapes, where b is the batch size, c is the number of channels, f is the number of frames, h is the height, and w is the width. Only these three layouts are accepted, as the small check below shows.
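As a quick sanity check (a minimal sketch, not from the original article): a bare 2D (h, w) tensor is rejected, so a single grayscale image must first be given batch and channel dimensions, e.g. with unsqueeze.

import torch
import torch.nn.functional as F

img = torch.randn(64, 64)            # bare (h, w) tensor, no batch/channel dims

try:
    F.interpolate(img, scale_factor=2, mode='nearest')
except Exception as e:
    print('2D input rejected:', e)   # only 3D/4D/5D inputs are supported

# Add batch and channel dims -> (1, 1, 64, 64), then interpolate.
out = F.interpolate(img.unsqueeze(0).unsqueeze(0), scale_factor=2, mode='nearest')
print(out.shape)                     # torch.Size([1, 1, 128, 128])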
size is the target tensor's shape (w)/(h, w)/(f, h, w); scale_factor is the multiplier applied to the input tensor's spatial shape (w)/(h, w)/(f, h, w). Only one of size and scale_factor may be specified, and whether the call upsamples or downsamples is determined by these two arguments. If size or scale_factor is given as a list/sequence, its length must match the input (see the sketch after this list):

- For a 3D input, the sequence length must be 1 (only the last dimension w is rescaled).
- For a 4D input, the sequence length must be 2 (the last two dimensions h, w are rescaled).
- For a 5D input, the sequence length must be 3 (the last three dimensions f, h, w are rescaled).
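A minimal sketch of these rules (not from the original article): passing both arguments raises an error, and a sequence lets each spatial dimension be resized independently.

import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 64, 64)        # 4D input: (b, c, h, w)

# size and scale_factor are mutually exclusive.
try:
    F.interpolate(x, size=[32, 32], scale_factor=0.5)
except ValueError as e:
    print('error:', e)

# A sequence needs one entry per spatial dimension (2 for a 4D input),
# and each dimension can be resized independently.
y = F.interpolate(x, size=[32, 128], mode='bilinear')            # h -> 32, w -> 128
z = F.interpolate(x, scale_factor=[0.5, 2.0], mode='nearest')    # h * 0.5, w * 2
print(y.shape, z.shape)
# torch.Size([1, 3, 32, 128]) torch.Size([1, 3, 32, 128])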
The interpolation algorithm mode can be: nearest neighbor (nearest, the default), linear (3D-only), bilinear (4D-only), trilinear (5D-only), and so on; as the docstring above notes, bicubic (4D-only) and area are also available. A short sketch of those two modes follows.
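For illustration (a hedged sketch, not from the original article), the two 4D modes not exercised in the walkthrough below, bicubic and area, are used the same way:

import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 64, 64)

y_bicubic = F.interpolate(x, scale_factor=2, mode='bicubic', align_corners=False)
y_area    = F.interpolate(x, scale_factor=0.5, mode='area')   # area is typically used for downsampling
print(y_bicubic.shape, y_area.shape)
# torch.Size([1, 3, 128, 128]) torch.Size([1, 3, 32, 32])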
align_corners: an optional bool. If align_corners=True, the corner pixels of the input and output tensors are aligned, and the values at those corner pixels are preserved. It only has an effect for mode=linear, bilinear, bicubic and trilinear; the default is False. To picture the difference between align_corners=True and False, consider upsampling from 4×4 to 8×8: one setting aligns the sampling grid at the centers of the four corner pixels, the other at the outer corners of those pixels. The numeric sketch below shows the effect in 1D.
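A minimal 1D sketch (not from the original article) that makes the difference visible:

import torch
import torch.nn.functional as F

x = torch.arange(4, dtype=torch.float32).view(1, 1, 4)   # values 0, 1, 2, 3

y_true  = F.interpolate(x, size=8, mode='linear', align_corners=True)
y_false = F.interpolate(x, size=8, mode='linear', align_corners=False)

# align_corners=True spaces the output samples evenly between the corner
# values 0 and 3 (step 3/7); align_corners=False places samples at pixel
# centers instead, so the interior values differ.
print(y_true)    # tensor([[[0.0000, 0.4286, 0.8571, 1.2857, 1.7143, 2.1429, 2.5714, 3.0000]]])
print(y_false)   # tensor([[[0.0000, 0.2500, 0.7500, 1.2500, 1.7500, 2.2500, 2.7500, 3.0000]]])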
import torch
import torch.nn.functional as F

b, c, f, h, w = 1, 3, 8, 64, 64
1. upsample/downsample 3D tensor
# interpolate 3D tensor
x = torch.randn([b, c, w])

## downsample to (b, c, w/2)
y0 = F.interpolate(x, scale_factor=0.5, mode='nearest')
y1 = F.interpolate(x, size=[w//2], mode='nearest')
y2 = F.interpolate(x, scale_factor=0.5, mode='linear')   # only 3D
y3 = F.interpolate(x, size=[w//2], mode='linear')        # only 3D
print(y0.shape, y1.shape, y2.shape, y3.shape)
# torch.Size([1, 3, 32]) torch.Size([1, 3, 32]) torch.Size([1, 3, 32]) torch.Size([1, 3, 32])

## upsample to (b, c, w*2)
y0 = F.interpolate(x, scale_factor=2, mode='nearest')
y1 = F.interpolate(x, size=[w*2], mode='nearest')
y2 = F.interpolate(x, scale_factor=2, mode='linear')     # only 3D
y3 = F.interpolate(x, size=[w*2], mode='linear')         # only 3D
print(y0.shape, y1.shape, y2.shape, y3.shape)
# torch.Size([1, 3, 128]) torch.Size([1, 3, 128]) torch.Size([1, 3, 128]) torch.Size([1, 3, 128])
2. upsample/downsample 4D tensor
# interpolate 4D tensor
x = torch.randn(b, c, h, w)

## downsample to (b, c, h/2, w/2)
y0 = F.interpolate(x, scale_factor=0.5, mode='nearest')
y1 = F.interpolate(x, size=[h//2, w//2], mode='nearest')
y2 = F.interpolate(x, scale_factor=0.5, mode='bilinear')     # only 4D
y3 = F.interpolate(x, size=[h//2, w//2], mode='bilinear')    # only 4D
print(y0.shape, y1.shape, y2.shape, y3.shape)
# torch.Size([1, 3, 32, 32]) torch.Size([1, 3, 32, 32]) torch.Size([1, 3, 32, 32]) torch.Size([1, 3, 32, 32])

## upsample to (b, c, h*2, w*2)
y0 = F.interpolate(x, scale_factor=2, mode='nearest')
y1 = F.interpolate(x, size=[h*2, w*2], mode='nearest')
y2 = F.interpolate(x, scale_factor=2, mode='bilinear')       # only 4D
y3 = F.interpolate(x, size=[h*2, w*2], mode='bilinear')      # only 4D
print(y0.shape, y1.shape, y2.shape, y3.shape)
# torch.Size([1, 3, 128, 128]) torch.Size([1, 3, 128, 128]) torch.Size([1, 3, 128, 128]) torch.Size([1, 3, 128, 128])
3. upsample/downsample 5D tensor
# interpolate 5D tensor
x = torch.randn(b, c, f, h, w)

## downsample to (b, c, f/2, h/2, w/2)
y0 = F.interpolate(x, scale_factor=0.5, mode='nearest')
y1 = F.interpolate(x, size=[f//2, h//2, w//2], mode='nearest')
y2 = F.interpolate(x, scale_factor=0.5, mode='trilinear')           # only 5D
y3 = F.interpolate(x, size=[f//2, h//2, w//2], mode='trilinear')    # only 5D
print(y0.shape, y1.shape, y2.shape, y3.shape)
# torch.Size([1, 3, 4, 32, 32]) torch.Size([1, 3, 4, 32, 32]) torch.Size([1, 3, 4, 32, 32]) torch.Size([1, 3, 4, 32, 32])

## upsample to (b, c, f*2, h*2, w*2)
y0 = F.interpolate(x, scale_factor=2, mode='nearest')
y1 = F.interpolate(x, size=[f*2, h*2, w*2], mode='nearest')
y2 = F.interpolate(x, scale_factor=2, mode='trilinear')             # only 5D
y3 = F.interpolate(x, size=[f*2, h*2, w*2], mode='trilinear')       # only 5D
print(y0.shape, y1.shape, y2.shape, y3.shape)
# torch.Size([1, 3, 16, 128, 128]) torch.Size([1, 3, 16, 128, 128]) torch.Size([1, 3, 16, 128, 128]) torch.Size([1, 3, 16, 128, 128])
Summary
The above is based on personal experience. I hope it serves as a useful reference, and I hope you will continue to support 腳本之家.