Fast installation of tensorflow-gpu + CUDA + cuDNN on Windows 11


Skip the installers from the official website and use the conda packages directly: matching packages exist, so installation is easy.

Walkthrough video: https://www.bilibili.com/video/BV1wJ4m1W7ps/?spm_id_from=333.999.0.0&vd_source=b5e395daf1dc59fb72b2633affa96661

Archive of past Anaconda releases: https://repo.anaconda.com/archive/

1. Check the CUDA version supported by the local GPU

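Run nvidia-smi in a command prompt. On my machine it reports CUDA Version 12.3, so any cuda (cudatoolkit) version less than or equal to 12.3 will work.
[Screenshot: nvidia-smi output]
The red boxes in the screenshot above mark memory used / total memory; GPU utilization is shown to the right.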
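
The "less than or equal to" rule above can be checked in a few lines. This is only a sketch: the sample header line is hypothetical (the driver number is made up, only the "CUDA Version: 12.3" field comes from this post), and the helper names parse_driver_cuda_version and toolkit_is_supported are my own.

```python
import re


def parse_driver_cuda_version(smi_output):
    """Extract the 'CUDA Version: X.Y' field from nvidia-smi text as a tuple."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def toolkit_is_supported(toolkit_version, driver_cuda):
    """True if the cudatoolkit major.minor is <= the version the driver reports."""
    major, minor = (int(part) for part in toolkit_version.split(".")[:2])
    return (major, minor) <= driver_cuda


# Hypothetical header line, shaped like the one nvidia-smi prints.
sample = "| NVIDIA-SMI 545.84    Driver Version: 545.84    CUDA Version: 12.3  |"
driver_cuda = parse_driver_cuda_version(sample)
print(driver_cuda)
print(toolkit_is_supported("11.3.1", driver_cuda))
print(toolkit_is_supported("12.4", driver_cuda))
```

To use it on a real machine, you could pass in the text captured from running nvidia-smi instead of the sample string.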

2. Check compatible GPU, tensorflow-gpu, CUDA, and cuDNN versions

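The version combinations below have been tested and work together. Which CUDA versions you can use depends only on the GPU driver version: a newer driver supports more CUDA versions and remains backward compatible.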

Version                Python version  Compiler            Build tools          cuDNN  CUDA
tensorflow_gpu-2.10.0  3.7-3.10        MSVC 2019           Bazel 5.1.1          8.1    11.2
tensorflow_gpu-2.9.0   3.7-3.10        MSVC 2019           Bazel 5.0.0          8.1    11.2
tensorflow_gpu-2.8.0   3.7-3.10        MSVC 2019           Bazel 4.2.1          8.1    11.2
tensorflow_gpu-2.7.0   3.7-3.9         MSVC 2019           Bazel 3.7.2          8.1    11.2
tensorflow_gpu-2.6.0   3.6-3.9         MSVC 2019           Bazel 3.7.2          8.1    11.2
tensorflow_gpu-2.5.0   3.6-3.9         MSVC 2019           Bazel 3.7.2          8.1    11.2
tensorflow_gpu-2.4.0   3.6-3.8         MSVC 2019           Bazel 3.1.0          8.0    11.0
tensorflow_gpu-2.3.0   3.5-3.8         MSVC 2019           Bazel 3.1.0          7.6    10.1
tensorflow_gpu-2.2.0   3.5-3.8         MSVC 2019           Bazel 2.0.0          7.6    10.1
tensorflow_gpu-2.1.0   3.5-3.7         MSVC 2019           Bazel 0.27.1-0.29.1  7.6    10.1
tensorflow_gpu-2.0.0   3.5-3.7         MSVC 2017           Bazel 0.26.1         7.4    10
tensorflow_gpu-1.15.0  3.5-3.7         MSVC 2017           Bazel 0.26.1         7.4    10
tensorflow_gpu-1.14.0  3.5-3.7         MSVC 2017           Bazel 0.24.1-0.25.2  7.4    10
tensorflow_gpu-1.13.0  3.5-3.7         MSVC 2015 update 3  Bazel 0.19.0-0.21.0  7.4    10
tensorflow_gpu-1.12.0  3.5-3.6         MSVC 2015 update 3  Bazel 0.15.0         7.2    9.0
tensorflow_gpu-1.11.0  3.5-3.6         MSVC 2015 update 3  Bazel 0.15.0         7      9
tensorflow_gpu-1.10.0  3.5-3.6         MSVC 2015 update 3  Cmake v3.6.3         7      9
tensorflow_gpu-1.9.0   3.5-3.6         MSVC 2015 update 3  Cmake v3.6.3         7      9
tensorflow_gpu-1.8.0   3.5-3.6         MSVC 2015 update 3  Cmake v3.6.3         7      9
tensorflow_gpu-1.7.0   3.5-3.6         MSVC 2015 update 3  Cmake v3.6.3         7      9
tensorflow_gpu-1.6.0   3.5-3.6         MSVC 2015 update 3  Cmake v3.6.3         7      9
tensorflow_gpu-1.5.0   3.5-3.6         MSVC 2015 update 3  Cmake v3.6.3         7      9
tensorflow_gpu-1.4.0   3.5-3.6         MSVC 2015 update 3  Cmake v3.6.3         6      8
tensorflow_gpu-1.3.0   3.5-3.6         MSVC 2015 update 3  Cmake v3.6.3         6      8
tensorflow_gpu-1.2.0   3.5-3.6         MSVC 2015 update 3  Cmake v3.6.3         5.1    8
tensorflow_gpu-1.1.0   3.5             MSVC 2015 update 3  Cmake v3.6.3         5.1    8
tensorflow_gpu-1.0.0   3.5             MSVC 2015 update 3  Cmake v3.6.3         5.1    8

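You can also open NVIDIA Control Panel -> Help -> System Information -> Components to check the CUDA version. Mine is 12.3, the latest at the time of writing, and it is generally backward compatible.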

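For faster downloads, consider switching the default package source to a mirror first. It is simple, so I won't cover it here. (I use a proxy, so I didn't switch 😋)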

conda search cuda

[Screenshots: conda search output]
To install a specific version of a package (here cuDNN 8.2.1) when you already know the version number and the channel, pass both to conda install. In the screenshots, the channel is anaconda/pkgs/main.

The following command installs cuDNN 8.2.1:

(tf2) C:\Users\hello>conda install cudnn=8.2.1 -c anaconda

In this command, -c anaconda tells conda to install from the anaconda channel, and cudnn=8.2.1 pins the package and its exact version.

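Before installing, confirm that the CUDA Toolkit and cuDNN versions you picked are compatible with your system and NVIDIA driver. If you use a specific deep learning framework such as TensorFlow or PyTorch, also check which CUDA versions that framework supports so that everything works together.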

3. Anaconda + Python virtual environment

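cuDNN 7.6.0 + CUDA 10.1.168 + tensorflow-gpu 2.3.0
If you need TensorFlow, I assume you already use Anaconda, so I won't cover installing it. One reminder: the first time you create an environment with conda create -n, it is placed on the C: drive by default. The default path can be changed, but I won't go into that here either.
Create the TensorFlow environment (tf is the environment name; keep it short, or you will get tired of typing it):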

conda create -n tf python=3.9
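
Before settling on python=3.9, it is worth checking that it falls inside the Python range listed in the table in section 2 for the TensorFlow version you plan to install. A minimal sketch, assuming only three rows copied from that table; PYTHON_RANGES and python_is_supported are hypothetical names.

```python
import sys

# Python ranges copied from the table in section 2 (a few rows only).
PYTHON_RANGES = {
    "2.9.0": ((3, 7), (3, 10)),
    "2.6.0": ((3, 6), (3, 9)),
    "2.3.0": ((3, 5), (3, 8)),
}


def python_is_supported(tf_version, python_version=None):
    """Check whether (major, minor) falls inside the listed range for tf_version."""
    if python_version is None:
        # Default to the interpreter that is running this script.
        python_version = sys.version_info[:2]
    low, high = PYTHON_RANGES[tf_version]
    return low <= tuple(python_version) <= high


print(python_is_supported("2.6.0", (3, 9)))  # the Python version used in this post
print(python_is_supported("2.3.0", (3, 9)))
```

Versions are compared as (major, minor) tuples rather than strings, because a string comparison would sort "3.10" before "3.9".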

4. Install CUDA and cuDNN

Try cudatoolkit 11.3.1 and cudnn 8.2.1 first:

conda install cudatoolkit=11.3.1
conda install cudnn=8.2.1

5. Install tensorflow-gpu

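Install one of the following versions. The first command downloads through the Tsinghua PyPI mirror; the test run in the next section used 2.6.0.

pip install -i https://pypi.tuna.tsinghua.edu.cn/simple tensorflow-gpu==2.9.0
pip install tensorflow-gpu==2.6.0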

6. Verify that the GPU build of TensorFlow installed successfully

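import tensorflow as tf

print(tf.__version__)  # installed TensorFlow version
print(tf.test.gpu_device_name())  # e.g. /device:GPU:0; empty string if no GPU is found
print(tf.config.experimental.set_visible_devices)  # prints only the function object; it is not called
print('GPU:', tf.config.list_physical_devices('GPU'))
print('CPU:', tf.config.list_physical_devices(device_type='CPU'))
print(tf.config.list_physical_devices('GPU'))
print(tf.test.is_gpu_available())  # deprecated; prefer tf.config.list_physical_devices('GPU')
# Query the GPU devices and print how many are available
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))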

Output

PyDev console: starting.
Python 3.9.18 (main, Sep 11 2023, 14:09:26) [MSC v.1916 64 bit (AMD64)] on win32
>>> runfile('C:/Users/hello/PycharmProjects/test-gpu/test012.py', wdir='C:/Users/hello/PycharmProjects/test-gpu')
2.6.0
2024-01-07 20:50:01.228825: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-01-07 20:50:01.880295: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /device:GPU:0 with 21668 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:01:00.0, compute capability: 8.6
/device:GPU:0
<function set_visible_devices at 0x000001377600B820>
GPU: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
CPU: [PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
WARNING:tensorflow:From C:/Users/hello/PycharmProjects/test-gpu/test012.py:9: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
2024-01-07 20:50:01.881791: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /device:GPU:0 with 21668 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:01:00.0, compute capability: 8.6
True
Num GPUs Available:  1

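If the output lists GPU:0 as a physical device and the availability check prints True, the GPU build is installed successfully!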
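
Since the log above warns that tf.test.is_gpu_available is deprecated, here is a shorter check built only on tf.config.list_physical_devices and tf.test.is_built_with_cuda. Treat it as a sketch: describe_gpu_setup is a hypothetical helper, and it catches any exception on import on the assumption that a broken install may fail with something other than ImportError.

```python
def describe_gpu_setup():
    """Return a small report about how TensorFlow sees the GPUs on this machine."""
    try:
        import tensorflow as tf
    except Exception as exc:  # a broken install can fail with more than ImportError
        return {"tensorflow": None, "error": repr(exc), "gpus": []}

    gpus = tf.config.list_physical_devices("GPU")
    return {
        "tensorflow": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "gpus": [gpu.name for gpu in gpus],
    }


report = describe_gpu_setup()
print(report)
print("GPU ready:", bool(report["gpus"]))
```

An empty "gpus" list with "built_with_cuda" set to True would suggest that the package itself is a GPU build but the CUDA or cuDNN libraries were not found at runtime.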

7. More versions

anaconda search -t conda cuda


6.1 NumPy version mismatch

I installed numpy without pinning a version, and it turned out to be incompatible with TensorFlow:

C:\Users\hello\.conda\envs\tf2\lib\site-packages\tensorflow\python\framework\dtypes.py:585: FutureWarning: In the future `np.object` will be defined as the corresponding NumPy scalar.
  np.object,
Traceback (most recent call last):
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "C:\Users\hello\.conda\envs\tf2\lib\site-packages\tensorflow\__init__.py", line 41, in <module>
    from tensorflow.python.tools import module_util as _module_util
  File "C:\Users\hello\.conda\envs\tf2\lib\site-packages\tensorflow\python\__init__.py", line 46, in <module>
    from tensorflow.python import data
  File "C:\Users\hello\.conda\envs\tf2\lib\site-packages\tensorflow\python\data\__init__.py", line 25, in <module>
    from tensorflow.python.data import experimental
  File "C:\Users\hello\.conda\envs\tf2\lib\site-packages\tensorflow\python\data\experimental\__init__.py", line 97, in <module>
    from tensorflow.python.data.experimental import service
  File "C:\Users\hello\.conda\envs\tf2\lib\site-packages\tensorflow\python\data\experimental\service\__init__.py", line 353, in <module>
    from tensorflow.python.data.experimental.ops.data_service_ops import distribute
  File "C:\Users\hello\.conda\envs\tf2\lib\site-packages\tensorflow\python\data\experimental\ops\data_service_ops.py", line 26, in <module>
    from tensorflow.python.data.experimental.ops import compression_ops
  File "C:\Users\hello\.conda\envs\tf2\lib\site-packages\tensorflow\python\data\experimental\ops\compression_ops.py", line 20, in <module>
    from tensorflow.python.data.util import structure
  File "C:\Users\hello\.conda\envs\tf2\lib\site-packages\tensorflow\python\data\util\structure.py", line 26, in <module>
    from tensorflow.python.data.util import nest
  File "C:\Users\hello\.conda\envs\tf2\lib\site-packages\tensorflow\python\data\util\nest.py", line 40, in <module>
    from tensorflow.python.framework import sparse_tensor as _sparse_tensor
  File "C:\Users\hello\.conda\envs\tf2\lib\site-packages\tensorflow\python\framework\sparse_tensor.py", line 28, in <module>
    from tensorflow.python.framework import constant_op
  File "C:\Users\hello\.conda\envs\tf2\lib\site-packages\tensorflow\python\framework\constant_op.py", line 29, in <module>
    from tensorflow.python.eager import execute
  File "C:\Users\hello\.conda\envs\tf2\lib\site-packages\tensorflow\python\eager\execute.py", line 27, in <module>
    from tensorflow.python.framework import dtypes
  File "C:\Users\hello\.conda\envs\tf2\lib\site-packages\tensorflow\python\framework\dtypes.py", line 585, in <module>
    np.object,
  File "C:\Users\hello\.conda\envs\tf2\lib\site-packages\numpy\__init__.py", line 324, in __getattr__
    raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'object'.
`np.object` was a deprecated alias for the builtin `object`. To avoid this error in existing code, use `object` by itself. Doing this will not modify any behavior and is safe. 
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
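
The fix is to uninstall numpy and reinstall a version that your TensorFlow release supports. The np.object alias was removed in NumPy 1.24, so an older release is needed here:

pip uninstall numpy
pip install numpy==<version>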
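
To decide which numpy version to pin, the only hard fact taken from the traceback is that np.object no longer exists; NumPy removed that alias in 1.24. The sketch below encodes just that threshold. The helper name is hypothetical, and it says nothing about any other constraint your TensorFlow release may place on numpy.

```python
def numpy_still_has_object_alias(numpy_version):
    """True for NumPy releases older than 1.24, which still define np.object."""
    major, minor = (int(part) for part in numpy_version.split(".")[:2])
    return (major, minor) < (1, 24)


for candidate in ("1.19.5", "1.23.5", "1.24.0", "1.26.4"):
    print(candidate, numpy_still_has_object_alias(candidate))
```

To check the version already installed in the environment, you could pass in the string returned by importlib.metadata.version("numpy").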
