
[Solved] PyTorch GPU device-side assert: RuntimeError: CUDA error: device-side assert triggered

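import torch

# a has shape (2, 3), so the valid indices along dim 0 are 0 and 1
a = torch.randn(size=(2, 3)).to('cuda')
print(a)

# randperm(3) returns a permutation of 0, 1, 2, so it always contains 2
idx = torch.randperm(3)
print(idx)

# Indexing dim 0 with the out-of-range value 2 triggers the device-side assert
print(a[idx])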


RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
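
Because the message warns that the stack trace may be misleading, it can help to narrow the failure down before changing any code. The sketch below shows two common approaches, under the assumption that the failing operation can be rerun in isolation: setting CUDA_LAUNCH_BLOCKING=1 (as the message itself suggests) and repeating the same indexing on the CPU, where it raises an ordinary Python exception. The variable name cpu_error is only illustrative.

```python
import os

# Ask CUDA to launch kernels synchronously so the stack trace points at the failing call.
# This should be set before CUDA is initialised, so it is done before importing torch.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch

# Same shapes as the failing example, but kept on the CPU.
a = torch.randn(size=(2, 3))
idx = torch.randperm(3)  # a permutation of 0, 1, 2, so it always contains 2

try:
    a[idx]
    cpu_error = None
except IndexError as exc:
    # On the CPU the out-of-range index surfaces as a regular IndexError
    # instead of an asynchronous device-side assert.
    cpu_error = str(exc)

print(cpu_error)
```

If the CPU run reports an index error, the GPU assert is very likely caused by the same out-of-range index.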

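The root cause is that the index idx is out of bounds. This error often appears in classification tasks when the class labels do not match the model's output size: for example, the model outputs 8 classes (valid indices 0 to 7), but a label points to position 9, which triggers the assert. The code above fails when run because randperm(3) produces the index 2, while the first dimension of a has only 2 rows (valid indices 0 and 1). Fix it as follows.

Change the randperm argument: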

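import torch

a = torch.randn(size=(2, 3)).to('cuda')
print(a)

# randperm(2) returns a permutation of 0, 1, which matches the 2 rows of a
idx = torch.randperm(2)
print(idx)

# Every index is now within range, so the rows are reordered without error
print(a[idx])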
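
The same idea can be generalised so the size is not hard-coded, and so that bad indices or labels are caught before they reach the GPU. The following is a minimal sketch, not part of the original fix: check_indices is a hypothetical helper, and the labels tensor is made-up data that mirrors the 8-class example above. It runs on the CPU here, but the check would presumably work the same way on CUDA tensors, at the cost of a device synchronisation.

```python
import torch


def check_indices(idx: torch.Tensor, size: int) -> None:
    """Hypothetical helper: raise ValueError if any entry of idx lies outside [0, size).

    Negative values are rejected as well, which is stricter than PyTorch indexing
    but matches what class labels normally look like.
    """
    if idx.numel() == 0:
        return
    lo, hi = int(idx.min()), int(idx.max())
    if lo < 0 or hi >= size:
        raise ValueError(f"index range [{lo}, {hi}] is outside [0, {size - 1}]")


a = torch.randn(size=(2, 3))

# Derive the permutation length from the tensor instead of hard-coding 2.
idx = torch.randperm(a.shape[0])
check_indices(idx, a.shape[0])
rows = a[idx]

# Classification case: 8 classes means valid labels are 0..7, so 9 is rejected.
num_classes = 8
labels = torch.tensor([0, 7, 9])  # made-up labels for illustration
try:
    check_indices(labels, num_classes)
    label_error = None
except ValueError as exc:
    label_error = str(exc)

print(label_error)
```

Running such a check on the labels before computing the loss turns the opaque device-side assert into a readable error that names the offending range.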
