Fixing RuntimeError: CUDA out of memory in PyTorch

RuntimeError: CUDA out of memory

1. Check whether other programs are using GPU memory

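When a script run from a .py file hits this error, the process terminates, so the GPU memory it was holding is released. At that point you can run watch -n 1 nvidia-smi to see the current GPU memory usage. If usage is still high, some other program is occupying the memory; end any process you do not need with the kill command.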
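The same check can be scripted. The sketch below is only an illustration: it assumes nvidia-smi is on the PATH and uses its --query-compute-apps CSV output. The helper names parse_compute_apps and gpu_processes are made up for this example and do not belong to any library.

```python
import shutil
import subprocess

# nvidia-smi query that prints one CSV row per process using the GPU.
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(csv_text):
    """Turn rows like '1234, python, 5120' into a list of dicts (hypothetical helper)."""
    rows = []
    for line in csv_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            pid, rest = line.split(",", 1)
            name, mem = rest.rsplit(",", 1)
            pid = int(pid)
        except ValueError:
            # Not a data row (for example an error message), so skip it.
            continue
        mem = mem.strip()
        rows.append({
            "pid": pid,
            "name": name.strip(),
            # Memory is reported in MiB; it may be unavailable on some setups.
            "used_mib": int(mem) if mem.isdigit() else None,
        })
    return rows


def gpu_processes():
    """Return the processes currently using the GPU, or [] if nvidia-smi is missing."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True, check=False)
    return parse_compute_apps(result.stdout)


if __name__ == "__main__":
    for proc in gpu_processes():
        print(proc["pid"], proc["name"], proc["used_mib"], "MiB")
```

Any PID listed here that you do not need is a candidate for kill.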

2. Check whether PyTorch and CUDA match

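Call torch.cuda.is_available() to confirm that the installed PyTorch can actually use CUDA. If it returns False, the PyTorch and CUDA versions do not match; change the CUDA or torch version and set up the environment again.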
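A small report makes a mismatch easier to spot. This is a minimal sketch: cuda_report is a hypothetical helper name, and it returns early if torch is not installed.

```python
import importlib.util


def cuda_report():
    """Collect the version details relevant to a PyTorch/CUDA mismatch."""
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False}

    import torch

    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version this PyTorch build was compiled against;
        # None means a CPU-only build.
        "built_with_cuda": torch.version.cuda,
        # True only if PyTorch can actually use a CUDA device right now.
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If built_with_cuda is None, the installed package is a CPU-only build, so the fix is to install a CUDA-enabled build rather than to change the driver.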

3. The torch.cuda.empty_cache() method

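In a Jupyter notebook, GPU memory is not released when the error occurs, because the kernel process keeps running. The advice I found online was to delete variables that are no longer needed, call torch.cuda.empty_cache(), and wrap test code in with torch.no_grad(). When I tried it, however, torch.cuda.empty_cache() did not seem to free the memory held by the notebook. The official API documentation describes it as follows: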

Releases all unoccupied cached memory currently held by the caching allocator

https://pytorch.org/docs/stable/cuda.html?highlight=empty_cache#torch.cuda.empty_cache
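The word "unoccupied" in that description is the key point: memory behind a tensor that is still referenced is occupied, so empty_cache() cannot return it. The sketch below illustrates this under a few assumptions: a CUDA device with a few hundred MiB free, and a hypothetical helper name release_demo. It returns None when torch or CUDA is unavailable.

```python
import gc
import importlib.util


def release_demo():
    """Compare reserved GPU memory before and after dropping a tensor reference."""
    if importlib.util.find_spec("torch") is None:
        return None

    import torch

    if not torch.cuda.is_available():
        return None

    device = torch.device("cuda")
    # Allocate roughly 256 MiB of float32 values on the GPU.
    x = torch.empty(64 * 1024 * 1024, dtype=torch.float32, device=device)

    # x is still referenced, so its block is occupied and cannot be released.
    torch.cuda.empty_cache()
    while_referenced = torch.cuda.memory_reserved(device)

    # Drop the reference first, then ask the caching allocator to release.
    del x
    gc.collect()
    torch.cuda.empty_cache()
    after_release = torch.cuda.memory_reserved(device)

    return {"while_referenced": while_referenced, "after_release": after_release}


if __name__ == "__main__":
    print(release_demo())
```

In a notebook, tensors often remain referenced by variables from earlier cells, which may be why calling empty_cache() alone appears to do nothing. Deleting those variables first, or restarting the kernel, is the more dependable route.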