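Do not install CUDA and cuDNN from the official website; use the conda packages instead. conda provides matching packages, so installation is straightforward.
Archive of historical Anaconda releases: https://repo.anaconda.com/archive/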
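1. Check the CUDA version supported by the local GPU
Run nvidia-smi on the command line. Here it reports CUDA Version 12.3, so any cuda (cudatoolkit) version at or below 12.3 will work.
In the nvidia-smi output, the memory column shows memory used / total memory, and the column to its right shows GPU utilization.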
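The rule above (any cudatoolkit version at or below the CUDA version reported by the driver will work) can be checked in a few lines. This is a minimal sketch, assuming the nvidia-smi header contains the text `CUDA Version: X.Y`; the helper names `parse_cuda_version` and `toolkit_is_supported` and the sample header line are illustrative and not part of any NVIDIA tool.

```python
import re


def parse_cuda_version(smi_output):
    """Return the (major, minor) CUDA version found in nvidia-smi text, or None."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def toolkit_is_supported(driver_cuda, toolkit_version):
    """True if the toolkit's major.minor is not newer than the driver's CUDA version."""
    major, minor = (int(part) for part in toolkit_version.split(".")[:2])
    return (major, minor) <= driver_cuda


# Illustrative header line; the driver version number here is made up.
sample = "| NVIDIA-SMI 545.00   Driver Version: 545.00   CUDA Version: 12.3 |"
driver_cuda = parse_cuda_version(sample)
print(driver_cuda)                                   # (12, 3)
print(toolkit_is_supported(driver_cuda, "11.3.1"))   # True
print(toolkit_is_supported(driver_cuda, "12.4"))     # False
```

To run it against a real machine, the sample string could be replaced with `subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout`.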
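2. Check compatible GPU, tensorflow-gpu, CUDA, and cuDNN versions
The combinations below have been tested and work together. The highest usable CUDA version depends only on the GPU driver version: a newer driver supports more CUDA versions and remains backward compatible with older ones.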
Version | Python version | Compiler | Build tools | cuDNN | CUDA |
---|---|---|---|---|---|
tensorflow_gpu-2.10.0 | 3.7-3.10 | MSVC 2019 | Bazel 5.1.1 | 8.1 | 11.2 |
tensorflow_gpu-2.9.0 | 3.7-3.10 | MSVC 2019 | Bazel 5.0.0 | 8.1 | 11.2 |
tensorflow_gpu-2.8.0 | 3.7-3.10 | MSVC 2019 | Bazel 4.2.1 | 8.1 | 11.2 |
tensorflow_gpu-2.7.0 | 3.7-3.9 | MSVC 2019 | Bazel 3.7.2 | 8.1 | 11.2 |
tensorflow_gpu-2.6.0 | 3.6-3.9 | MSVC 2019 | Bazel 3.7.2 | 8.1 | 11.2 |
tensorflow_gpu-2.5.0 | 3.6-3.9 | MSVC 2019 | Bazel 3.7.2 | 8.1 | 11.2 |
tensorflow_gpu-2.4.0 | 3.6-3.8 | MSVC 2019 | Bazel 3.1.0 | 8.0 | 11.0 |
tensorflow_gpu-2.3.0 | 3.5-3.8 | MSVC 2019 | Bazel 3.1.0 | 7.6 | 10.1 |
tensorflow_gpu-2.2.0 | 3.5-3.8 | MSVC 2019 | Bazel 2.0.0 | 7.6 | 10.1 |
tensorflow_gpu-2.1.0 | 3.5-3.7 | MSVC 2019 | Bazel 0.27.1-0.29.1 | 7.6 | 10.1 |
tensorflow_gpu-2.0.0 | 3.5-3.7 | MSVC 2017 | Bazel 0.26.1 | 7.4 | 10 |
tensorflow_gpu-1.15.0 | 3.5-3.7 | MSVC 2017 | Bazel 0.26.1 | 7.4 | 10 |
tensorflow_gpu-1.14.0 | 3.5-3.7 | MSVC 2017 | Bazel 0.24.1-0.25.2 | 7.4 | 10 |
tensorflow_gpu-1.13.0 | 3.5-3.7 | MSVC 2015 update 3 | Bazel 0.19.0-0.21.0 | 7.4 | 10 |
tensorflow_gpu-1.12.0 | 3.5-3.6 | MSVC 2015 update 3 | Bazel 0.15.0 | 7.2 | 9.0 |
tensorflow_gpu-1.11.0 | 3.5-3.6 | MSVC 2015 update 3 | Bazel 0.15.0 | 7 | 9 |
tensorflow_gpu-1.10.0 | 3.5-3.6 | MSVC 2015 update 3 | Cmake v3.6.3 | 7 | 9 |
tensorflow_gpu-1.9.0 | 3.5-3.6 | MSVC 2015 update 3 | Cmake v3.6.3 | 7 | 9 |
tensorflow_gpu-1.8.0 | 3.5-3.6 | MSVC 2015 update 3 | Cmake v3.6.3 | 7 | 9 |
tensorflow_gpu-1.7.0 | 3.5-3.6 | MSVC 2015 update 3 | Cmake v3.6.3 | 7 | 9 |
tensorflow_gpu-1.6.0 | 3.5-3.6 | MSVC 2015 update 3 | Cmake v3.6.3 | 7 | 9 |
tensorflow_gpu-1.5.0 | 3.5-3.6 | MSVC 2015 update 3 | Cmake v3.6.3 | 7 | 9 |
tensorflow_gpu-1.4.0 | 3.5-3.6 | MSVC 2015 update 3 | Cmake v3.6.3 | 6 | 8 |
tensorflow_gpu-1.3.0 | 3.5-3.6 | MSVC 2015 update 3 | Cmake v3.6.3 | 6 | 8 |
tensorflow_gpu-1.2.0 | 3.5-3.6 | MSVC 2015 update 3 | Cmake v3.6.3 | 5.1 | 8 |
tensorflow_gpu-1.1.0 | 3.5 | MSVC 2015 update 3 | Cmake v3.6.3 | 5.1 | 8 |
tensorflow_gpu-1.0.0 | 3.5 | MSVC 2015 update 3 | Cmake v3.6.3 | 5.1 | 8 |
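Looking up a row of the table by hand is error-prone, especially because "3.10" sorts before "3.9" when compared as text. The sketch below copies a few rows from the table into a dictionary and checks whether a given Python version falls inside the supported range. The names `COMPAT`, `as_tuple`, and `lookup` are hypothetical helpers introduced only for illustration.

```python
# A few rows copied from the table: version -> (min Python, max Python, cuDNN, CUDA).
COMPAT = {
    "2.10.0": ("3.7", "3.10", "8.1", "11.2"),
    "2.9.0": ("3.7", "3.10", "8.1", "11.2"),
    "2.6.0": ("3.6", "3.9", "8.1", "11.2"),
    "2.3.0": ("3.5", "3.8", "7.6", "10.1"),
}


def as_tuple(version):
    """Turn '3.10' into (3, 10) so versions compare numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def lookup(tf_version, python_version):
    """Return (cuDNN, CUDA) for a TensorFlow version, or None if Python is out of range."""
    py_min, py_max, cudnn, cuda = COMPAT[tf_version]
    if as_tuple(py_min) <= as_tuple(python_version)[:2] <= as_tuple(py_max):
        return cudnn, cuda
    return None


print(lookup("2.6.0", "3.9"))    # ('8.1', '11.2')
print(lookup("2.6.0", "3.10"))   # None
```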
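You can also check the CUDA version under NVIDIA Control Panel -> Help -> System Information -> Components. Mine shows 12.3, the latest at the time of writing; it is generally backward compatible.
To speed up downloads, consider switching conda to a mirror first. It is simple, so it is not covered here. (I use a proxy, so I did not switch 😋)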
conda search cuda
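If you want to install a specific version of cuDNN (8.2.1 in this case) and you already know the version number and the channel that provides it, run conda install with both specified. In the search results here, the channel is anaconda/pkgs/main.
The command to install cuDNN 8.2.1 is: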
(tf2) C:\Users\hello>conda install cudnn=8.2.1 -c anaconda
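In this command, -c anaconda tells conda to install from the anaconda channel, and cudnn=8.2.1 pins the package and its exact version.
Before installing, confirm that this cuDNN version is compatible with your cudatoolkit version and NVIDIA driver. If you use a deep learning framework such as TensorFlow or PyTorch, also check which CUDA versions the framework supports so that everything works together.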
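3. Anaconda + Python virtual environment
Example combination: cuDNN 7.6.0 + CUDA 10.1.168 + tensorflow-gpu 2.3.0
If you need TensorFlow, you almost certainly use Anaconda already, so its installation is not covered here. One reminder: the first time you create an environment with conda create -n, it is placed on the C: drive by default. The default path can be changed, but that is not covered here either.
Create the TensorFlow environment (tf is the environment name; keep it short, since you will type it often):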
conda create -n tf python=3.9
4. Install CUDA and cuDNN
Start with cudatoolkit 11.3.1 and cudnn 8.2.1:
conda install cudatoolkit=11.3.1
conda install cudnn=8.2.1
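After the two installs, it helps to confirm what actually landed in the environment. The sketch below assumes that `conda list --json` prints a JSON array of objects with at least `name` and `version` fields; the function name `installed_versions` and the sample data are illustrative only.

```python
import json


def installed_versions(conda_list_json, names):
    """Map each requested package name to its installed version (None if absent)."""
    packages = {pkg["name"]: pkg["version"] for pkg in json.loads(conda_list_json)}
    return {name: packages.get(name) for name in names}


# Illustrative sample of `conda list --json` output, trimmed to two fields per package.
sample = json.dumps([
    {"name": "cudatoolkit", "version": "11.3.1"},
    {"name": "cudnn", "version": "8.2.1"},
    {"name": "python", "version": "3.9.18"},
])
print(installed_versions(sample, ["cudatoolkit", "cudnn", "tensorflow-gpu"]))
# {'cudatoolkit': '11.3.1', 'cudnn': '8.2.1', 'tensorflow-gpu': None}
```

With the environment activated, the sample could be replaced by the output of `subprocess.run(["conda", "list", "--json"], capture_output=True, text=True).stdout`.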
5. Install tensorflow-gpu
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple tensorflow-gpu==2.9.0
pip install tensorflow-gpu==2.6.0
6. Verify that the GPU build of TensorFlow is installed correctly
import tensorflow as tf
print(tf.__version__)
print(tf.test.gpu_device_name())
print(tf.config.experimental.set_visible_devices)
print('GPU:', tf.config.list_physical_devices('GPU'))
print('CPU:', tf.config.list_physical_devices(device_type='CPU'))
print(tf.config.list_physical_devices('GPU'))
print(tf.test.is_gpu_available())
# Print the number of available GPUs
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))
Output
PyDev console: starting.
Python 3.9.18 (main, Sep 11 2023, 14:09:26) [MSC v.1916 64 bit (AMD64)] on win32
>>> runfile('C:/Users/hello/PycharmProjects/test-gpu/test012.py', wdir='C:/Users/hello/PycharmProjects/test-gpu')
2.6.0
2024-01-07 20:50:01.228825: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-01-07 20:50:01.880295: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /device:GPU:0 with 21668 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:01:00.0, compute capability: 8.6
/device:GPU:0
<function set_visible_devices at 0x000001377600B820>
GPU: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
CPU: [PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
WARNING:tensorflow:From C:/Users/hello/PycharmProjects/test-gpu/test012.py:9: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
2024-01-07 20:50:01.881791: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /device:GPU:0 with 21668 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:01:00.0, compute capability: 8.6
True
Num GPUs Available: 1
If a GPU device is listed and is_gpu_available() prints True, the GPU build is installed correctly!
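The log above warns that `tf.test.is_gpu_available()` is deprecated and points to `tf.config.list_physical_devices('GPU')` instead. A shorter check built only on that call might look like the following sketch; the function name `list_gpu_names` is made up for illustration, and the import is guarded so the script also runs in an environment without TensorFlow.

```python
def list_gpu_names():
    """Return the names of GPUs TensorFlow can see, or None if TensorFlow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # Non-deprecated replacement suggested by the warning in the log.
    return [gpu.name for gpu in tf.config.list_physical_devices("GPU")]


names = list_gpu_names()
if names is None:
    print("TensorFlow is not installed in this environment")
elif names:
    print("GPU build is working:", names)
else:
    print("TensorFlow imported, but no GPU is visible")
```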
7. More versions
anaconda search -t conda cuda
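6.1 NumPy version mismatch
NumPy was installed without pinning a version, and the resulting version turned out to be incompatible with TensorFlow: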
C:\Users\hello\.conda\envs\tf2\lib\site-packages\tensorflow\python\framework\dtypes.py:585: FutureWarning: In the future `np.object` will be defined as the corresponding NumPy scalar.
np.object,
Traceback (most recent call last):
File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 850, in exec_module
File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
File "C:\Users\hello\.conda\envs\tf2\lib\site-packages\tensorflow\__init__.py", line 41, in <module>
from tensorflow.python.tools import module_util as _module_util
File "C:\Users\hello\.conda\envs\tf2\lib\site-packages\tensorflow\python\__init__.py", line 46, in <module>
from tensorflow.python import data
File "C:\Users\hello\.conda\envs\tf2\lib\site-packages\tensorflow\python\data\__init__.py", line 25, in <module>
from tensorflow.python.data import experimental
File "C:\Users\hello\.conda\envs\tf2\lib\site-packages\tensorflow\python\data\experimental\__init__.py", line 97, in <module>
from tensorflow.python.data.experimental import service
File "C:\Users\hello\.conda\envs\tf2\lib\site-packages\tensorflow\python\data\experimental\service\__init__.py", line 353, in <module>
from tensorflow.python.data.experimental.ops.data_service_ops import distribute
File "C:\Users\hello\.conda\envs\tf2\lib\site-packages\tensorflow\python\data\experimental\ops\data_service_ops.py", line 26, in <module>
from tensorflow.python.data.experimental.ops import compression_ops
File "C:\Users\hello\.conda\envs\tf2\lib\site-packages\tensorflow\python\data\experimental\ops\compression_ops.py", line 20, in <module>
from tensorflow.python.data.util import structure
File "C:\Users\hello\.conda\envs\tf2\lib\site-packages\tensorflow\python\data\util\structure.py", line 26, in <module>
from tensorflow.python.data.util import nest
File "C:\Users\hello\.conda\envs\tf2\lib\site-packages\tensorflow\python\data\util\nest.py", line 40, in <module>
from tensorflow.python.framework import sparse_tensor as _sparse_tensor
File "C:\Users\hello\.conda\envs\tf2\lib\site-packages\tensorflow\python\framework\sparse_tensor.py", line 28, in <module>
from tensorflow.python.framework import constant_op
File "C:\Users\hello\.conda\envs\tf2\lib\site-packages\tensorflow\python\framework\constant_op.py", line 29, in <module>
from tensorflow.python.eager import execute
File "C:\Users\hello\.conda\envs\tf2\lib\site-packages\tensorflow\python\eager\execute.py", line 27, in <module>
from tensorflow.python.framework import dtypes
File "C:\Users\hello\.conda\envs\tf2\lib\site-packages\tensorflow\python\framework\dtypes.py", line 585, in <module>
np.object,
File "C:\Users\hello\.conda\envs\tf2\lib\site-packages\numpy\__init__.py", line 324, in __getattr__
raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'object'.
`np.object` was a deprecated alias for the builtin `object`. To avoid this error in existing code, use `object` by itself. Doing this will not modify any behavior and is safe.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
To fix this, uninstall NumPy and reinstall a version that matches your TensorFlow release:
pip uninstall numpy
pip install numpy==<version>
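The traceback fails on `np.object`, an alias that the error text says was deprecated in NumPy 1.20; it was removed in NumPy 1.24. A quick way to tell whether a given NumPy version would trigger this particular error is sketched below. The helper name `has_np_object_alias` is hypothetical, and passing this check does not guarantee full compatibility with a specific TensorFlow release.

```python
def has_np_object_alias(numpy_version):
    """True if this NumPy release still provides the `np.object` alias."""
    major, minor = (int(part) for part in numpy_version.split(".")[:2])
    # The alias was deprecated in 1.20 (per the error text) and removed in 1.24.
    return (major, minor) < (1, 24)


for version in ("1.19.5", "1.23.5", "1.24.0"):
    print(version, has_np_object_alias(version))
# 1.19.5 True
# 1.23.5 True
# 1.24.0 False
```

In a live environment the argument would be `numpy.__version__`.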