解決TensorFlow程序無限制占用GPU的方法
今天遇到一個奇怪的現(xiàn)象,使用tensorflow-gpu的時候,出現(xiàn)內(nèi)存超額~~如果我訓(xùn)練什么大型數(shù)據(jù)也就算了,關(guān)鍵我就寫了一個y=W*x…顯示如下圖所示:
程序如下:
import tensorflow as tf w = tf.Variable([[1.0,2.0]]) b = tf.Variable([[2.],[3.]]) y = tf.multiply(w,b) init_op = tf.global_variables_initializer() with tf.Session() as sess: sess.run(init_op) print(sess.run(y))
出錯提示:
占用的內(nèi)存越來越多,程序崩潰之后,整個電腦都奔潰了,因為整個顯卡全被吃了
2018-06-10 18:28:00.263424: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 2018-06-10 18:28:00.598075: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1356] Found device 0 with properties: name: GeForce GTX 1060 major: 6 minor: 1 memoryClockRate(GHz): 1.6705 pciBusID: 0000:01:00.0 totalMemory: 6.00GiB freeMemory: 4.97GiB 2018-06-10 18:28:00.598453: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1435] Adding visible gpu devices: 0 2018-06-10 18:28:01.265600: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix: 2018-06-10 18:28:01.265826: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:929] 0 2018-06-10 18:28:01.265971: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:942] 0: N 2018-06-10 18:28:01.266220: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4740 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060, pci bus id: 0000:01:00.0, compute capability: 6.1) 2018-06-10 18:28:01.331056: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 4.63G (4970853120 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY 2018-06-10 18:28:01.399111: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 4.17G (4473767936 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY 2018-06-10 18:28:01.468293: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 3.75G (4026391040 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY 2018-06-10 18:28:01.533138: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 3.37G (3623751936 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY 2018-06-10 18:28:01.602452: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 3.04G (3261376768 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY 2018-06-10 18:28:01.670225: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 2.73G (2935238912 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY 2018-06-10 18:28:01.733120: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 2.46G (2641714944 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY 2018-06-10 18:28:01.800101: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 2.21G (2377543424 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY 2018-06-10 18:28:01.862064: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.99G (2139789056 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY 2018-06-10 18:28:01.925434: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.79G (1925810176 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY 2018-06-10 18:28:01.986180: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.61G (1733229056 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY 2018-06-10 18:28:02.043456: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.45G (1559906048 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY 2018-06-10 18:28:02.103531: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.31G (1403915520 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY 2018-06-10 18:28:02.168973: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.18G (1263524096 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY 2018-06-10 18:28:02.229387: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.06G (1137171712 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY 2018-06-10 18:28:02.292997: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 976.04M (1023454720 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY 2018-06-10 18:28:02.356714: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 878.44M (921109248 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY 2018-06-10 18:28:02.418167: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 790.59M (828998400 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY 2018-06-10 18:28:02.482394: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 711.54M (746098688 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
分析原因:
顯卡驅(qū)動不是最新版本,用__驅(qū)動軟件__更新一下驅(qū)動,或者自己去下載更新。
TF運行太多,注銷全部程序沖洗打開。
由于TF內(nèi)核編寫的原因,默認(rèn)占用全部的GPU去訓(xùn)練自己的東西,也就是像meiguo一樣優(yōu)先政策吧
這個時候我們得設(shè)置兩個方面:
- 選擇什么樣的占用方式?優(yōu)先占用__還是__按需占用
- 選擇最大占用多少GPU,因為占用過大GPU會導(dǎo)致其它程序奔潰。最好在0.7以下
先更新驅(qū)動:
再設(shè)置TF程序:
注意:單獨設(shè)置一個不行!按照網(wǎng)上大神博客試了,結(jié)果效果還是很差(占用很多GPU)
設(shè)置TF:
- 按需占用
- 最大占用70%GPU
修改代碼如下:
import tensorflow as tf w = tf.Variable([[1.0,2.0]]) b = tf.Variable([[2.],[3.]]) y = tf.multiply(w,b) init_op = tf.global_variables_initializer() config = tf.ConfigProto(allow_soft_placement=True) gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.7) config.gpu_options.allow_growth = True with tf.Session(config=config) as sess: sess.run(init_op) print(sess.run(y))
成功解決:
2018-06-10 18:21:17.532630: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 2018-06-10 18:21:17.852442: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1356] Found device 0 with properties: name: GeForce GTX 1060 major: 6 minor: 1 memoryClockRate(GHz): 1.6705 pciBusID: 0000:01:00.0 totalMemory: 6.00GiB freeMemory: 4.97GiB 2018-06-10 18:21:17.852817: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1435] Adding visible gpu devices: 0 2018-06-10 18:21:18.511176: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix: 2018-06-10 18:21:18.511397: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:929] 0 2018-06-10 18:21:18.511544: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:942] 0: N 2018-06-10 18:21:18.511815: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4740 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060, pci bus id: 0000:01:00.0, compute capability: 6.1) [[2. 4.] [3. 6.]]
參考資料:
到此這篇關(guān)于解決TensorFlow程序無限制占用GPU的方法 的文章就介紹到這了,更多相關(guān)TensorFlow 占用GPU內(nèi)容請搜索腳本之家以前的文章或繼續(xù)瀏覽下面的相關(guān)文章希望大家以后多多支持腳本之家!
相關(guān)文章
Python深入06——python的內(nèi)存管理詳解
本篇文章主要介紹了python的內(nèi)存管理詳解,語言的內(nèi)存管理是語言設(shè)計的一個重要方面。它是決定語言性能的重要因素。有興趣的同學(xué)可以了解一下。2016-12-12Python數(shù)據(jù)分析之PMI數(shù)據(jù)圖形展示
這篇文章主要介紹了Python數(shù)據(jù)分析之PMI數(shù)據(jù)圖形展示,文章介紹了簡單的python爬蟲,并使用numpy進(jìn)行了簡單的數(shù)據(jù)處理,最終使用?matplotlib?進(jìn)行圖形繪制,實現(xiàn)了直觀的方式展示制造業(yè)和非制造業(yè)指數(shù)圖形,需要的朋友可以參考一下2022-05-05python中 chr unichr ord函數(shù)的實例詳解
這篇文章主要介紹了python中 chr unichr ord函數(shù)的實例詳解的相關(guān)資料,需要的朋友可以參考下2017-08-08