Capping the GPU memory a process can use with PyTorch

Updated: 2024-03-28 08:37:48  Author: 小锋学长生活大爆炸

Since version 1.8, PyTorch has provided torch.cuda.set_per_process_memory_fraction(fraction, device), which lets you cap the share of a given GPU's memory that the current process may use. Let's look at how to use it.

Test code:

import torch

torch.cuda.empty_cache()

# Cap this process at 50% of the GPU's memory
torch.cuda.set_per_process_memory_fraction(0.5, device=0)

# Query the total memory of the device
total_memory = torch.cuda.get_device_properties(0).total_memory
print("Total memory:", round(total_memory / (1024 * 1024), 1), "MB")

# Try allocations of different sizes
try:
    # Use 10% of the memory:
    tmp_tensor = torch.empty(int(total_memory * 0.1), dtype=torch.int8, device='cuda:0')
    print("Allocated memory:", round(torch.cuda.memory_allocated(0) / (1024 * 1024), 1), "MB")
    print("Reserved memory:", round(torch.cuda.memory_reserved(0) / (1024 * 1024), 1), "MB")
    # Release the memory
    del tmp_tensor
    torch.cuda.empty_cache()
    # Use 50% of the memory:
    torch.empty(int(total_memory * 0.5), dtype=torch.int8, device='cuda:0')
except RuntimeError as e:
    print("Error allocating tensor:", e)

# Print the current GPU memory usage
print("Allocated memory:", round(torch.cuda.memory_allocated(0) / (1024 * 1024), 1), "MB")
print("Reserved memory:", round(torch.cuda.memory_reserved(0) / (1024 * 1024), 1), "MB")

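The output reports two kinds of memory usage:

Allocated memory: queried with torch.cuda.memory_allocated(device). It returns the total amount of GPU memory currently occupied by tensors, that is, memory that live Tensor objects are actually using.

Reserved memory: queried with torch.cuda.memory_reserved(device). It covers the allocated memory plus extra memory that PyTorch's CUDA caching allocator holds on to in order to speed up later allocations and avoid repeated CUDA calls. The extra part does not store tensor data at the moment, but acts as a "buffer" from which future allocation requests can be served quickly.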
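To make the relationship between the two numbers concrete, here is a minimal sketch. The helper name memory_report is made up for illustration; it only does arithmetic on the two byte counts, and the torch calls run only if PyTorch and a CUDA device are available.

```python
def memory_report(allocated_bytes, reserved_bytes):
    """Summarize allocator counters in MiB (hypothetical helper, not a PyTorch API)."""
    mib = 1024 * 1024
    return {
        "allocated_mib": allocated_bytes / mib,
        "reserved_mib": reserved_bytes / mib,
        # Memory held by the caching allocator but not used by any tensor
        "cached_mib": (reserved_bytes - allocated_bytes) / mib,
    }


try:
    import torch
except ImportError:  # PyTorch may not be installed where this sketch is read
    torch = None

if torch is not None and torch.cuda.is_available():
    report = memory_report(torch.cuda.memory_allocated(0),
                           torch.cuda.memory_reserved(0))
    print(report)
```

Calling torch.cuda.empty_cache() returns the cached part to the driver, so the reserved figure should drop toward the allocated one afterwards.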

Additional notes

Besides the method above, here are a few other ways to limit GPU usage in PyTorch, for reference.

Limiting memory usage

import torch

# Run all subsequent operations on GPU 3
torch.cuda.set_device(3)

# Cap memory usage on GPU 3 at 50%
desired_memory_fraction = 0.5  # 50% of the memory
torch.cuda.set_per_process_memory_fraction(desired_memory_fraction)

# Query the total memory of the current GPU
total_memory = torch.cuda.get_device_properties(3).total_memory

# Allocate on GPU 3
tmp_tensor = torch.empty(int(total_memory * 0.4999), dtype=torch.int8, device="cuda")  # "cuda" here means GPU 3

# Query the memory currently allocated and compute what is left
allocated_memory = torch.cuda.memory_allocated()
available_memory = total_memory - allocated_memory

# Print the results
print(f"Total GPU Memory: {total_memory / (1024**3):.2f} GB")
print(f"Allocated GPU Memory: {allocated_memory / (1024**3):.2f} GB")
print(f"Available GPU Memory: {available_memory / (1024**3):.2f} GB")

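This occupies just under 50% of the memory, while changing 0.4999 to 0.5 triggers an out-of-memory error. Floating-point precision may play a part, but a more likely cause is that the caching allocator rounds each request up to its own block granularity, which pushes a request for exactly half the card over the cap.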
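The difference between 0.4999 and 0.5 can be explored with plain arithmetic. The sketch below assumes that the caching allocator rounds large requests (10 MiB and above) up to a multiple of 2 MiB and rejects a request when the rounded size exceeds fraction * total. That is an internal detail that may differ between PyTorch versions, and the 12,000,000,000-byte card is a made-up figure.

```python
# Assumed rounding granularity for large requests; an allocator internal,
# not a documented guarantee.
ROUND_LARGE = 2 * 1024 * 1024


def rounded_request(size_bytes):
    """Round a request up to the next multiple of ROUND_LARGE."""
    return -(-size_bytes // ROUND_LARGE) * ROUND_LARGE


def fits_under_cap(total_bytes, cap_fraction, request_bytes):
    """Return True if the rounded request stays within the per-process cap."""
    allowed = int(total_bytes * cap_fraction)
    return rounded_request(request_bytes) <= allowed


total = 12_000_000_000  # hypothetical card size in bytes

print("0.4999 of the card fits:", fits_under_cap(total, 0.5, int(total * 0.4999)))
print("0.5 of the card fits:", fits_under_cap(total, 0.5, total // 2))
```

Under these assumptions the request for exactly half the card is rounded up past the cap, while the slightly smaller request still fits after rounding.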

The PyTorch function for limiting GPU memory and how to use it

Function signature

torch.cuda.set_per_process_memory_fraction(0.5, 0)

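Parameter 1: fraction, the upper limit as a share of the GPU's total memory. For example, 0.5 means half of the total. It can be any float from 0 to 1.

Parameter 2: device, the device index. For example, 0 means GPU 0.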
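If you think in terms of a byte budget rather than a ratio, the fraction can be derived from the card's total memory. The following is a minimal sketch; fraction_for_budget is a hypothetical helper, not part of PyTorch, and the 12 GiB card is just an example figure.

```python
def fraction_for_budget(budget_bytes, total_bytes):
    """Convert a byte budget into a fraction in the 0~1 range (hypothetical helper)."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    fraction = budget_bytes / total_bytes
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("budget must be between 0 and the card's total memory")
    # Return a Python float, the type the fraction parameter is described as taking
    return float(fraction)


gib = 1024 ** 3
print(fraction_for_budget(6 * gib, 12 * gib))
```

The result can then be passed as the first argument, for example torch.cuda.set_per_process_memory_fraction(fraction_for_budget(6 * gib, total_memory), 0), with total_memory taken from torch.cuda.get_device_properties(0).total_memory.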

Usage example:

import torch
# Cap device 0 at 0.5 of its memory, i.e. half the card; on a 12G card, 0.5 means 6G.
torch.cuda.set_per_process_memory_fraction(0.5, 0)
torch.cuda.empty_cache()
# Query the total memory of the device.
total_memory = torch.cuda.get_device_properties(0).total_memory
# Use 0.499 of the memory:
tmp_tensor = torch.empty(int(total_memory * 0.499), dtype=torch.int8, device='cuda')

# Release that memory:
del tmp_tensor
torch.cuda.empty_cache()

# The next line triggers an out-of-memory error, because the request just reaches the cap:
torch.empty(total_memory // 2, dtype=torch.int8, device='cuda')

"""
It raises an error as follows: 
RuntimeError: CUDA out of memory. Tried to allocate 5.59 GiB (GPU 0; 11.17 GiB total capacity; 0 bytes already allocated; 10.91 GiB free; 5.59 GiB allowed; 0 bytes reserved in total by PyTorch)
"""
When the cap is exceeded, the error message contains one item that does not appear when no cap is set: "5.59 GiB allowed;"
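If you want to pick the "allowed" figure out of such a message programmatically, a regular expression is enough. This is only a sketch against the message format quoted above; allowed_limit is a hypothetical helper, and the wording of the error may differ in other PyTorch versions, in which case it simply returns None.

```python
import re

# Sample message copied from the example above
SAMPLE = ("CUDA out of memory. Tried to allocate 5.59 GiB (GPU 0; 11.17 GiB total "
          "capacity; 0 bytes already allocated; 10.91 GiB free; 5.59 GiB allowed; "
          "0 bytes reserved in total by PyTorch)")

ALLOWED_PATTERN = re.compile(r"([\d.]+)\s*(bytes|KiB|MiB|GiB)\s+allowed")


def allowed_limit(message):
    """Return (value, unit) from the 'allowed' item, or None if it is absent."""
    match = ALLOWED_PATTERN.search(message)
    if match is None:
        return None
    return float(match.group(1)), match.group(2)


print(allowed_limit(SAMPLE))
```

In real code the message would come from str(e) inside an except RuntimeError as e block like the one in the test code earlier.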

Notes:

The function limits the memory of the current process, which is similar to how memory limits work in TensorFlow.
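Because the cap is set per process and per device, a process that uses several GPUs has to set it on each of them. Below is a minimal sketch under that assumption; limit_all_visible_gpus is a hypothetical helper, and it does nothing when PyTorch or CUDA is unavailable.

```python
try:
    import torch
except ImportError:  # PyTorch may not be installed where this sketch is read
    torch = None


def limit_all_visible_gpus(fraction):
    """Apply the same cap to every GPU this process can see (hypothetical helper).

    Returns the list of device indices that were capped.
    """
    if torch is None or not torch.cuda.is_available():
        return []
    limited = []
    for index in range(torch.cuda.device_count()):
        torch.cuda.set_per_process_memory_fraction(float(fraction), device=index)
        limited.append(index)
    return limited


limited_devices = limit_all_visible_gpus(0.5)
print("Capped devices:", limited_devices)
```

Other processes are not affected by this call, so each worker that should be limited needs to set its own cap.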
