
Performance Benchmarks for Common Image Classification and Object Detection Models

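Overview

This benchmark measures the per-image recognition speed of common CV models. Timings exclude image loading but include image preprocessing. The results are meant to help choose a suitable model for a given hardware configuration in later applications. "Inference" refers to ordinary inference without acceleration; "Accelerated" refers to OpenVINO on CPU and TensorRT on GPU.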
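The timing protocol described above (images already in memory, preprocessing inside the timed region, result averaged over the test images) can be sketched roughly as follows. This is a minimal illustration, not the repository's actual test code; `time_inference`, `preprocess`, `infer`, and the warm-up run are assumptions introduced here.

```python
import time
from statistics import mean


def time_inference(preprocess, infer, images, warmup=1):
    """Return the mean seconds per image, timing preprocess + infer only.

    `images` must already be loaded in memory, so file reading is excluded.
    `preprocess` and `infer` are hypothetical callables supplied by the caller.
    """
    # Warm-up runs are not timed (an assumption; the source does not say
    # whether it warms up the model before measuring).
    for img in images[:warmup]:
        infer(preprocess(img))

    durations = []
    for img in images:
        start = time.perf_counter()
        infer(preprocess(img))  # preprocessing is inside the timed region
        durations.append(time.perf_counter() - start)
    return mean(durations)
```

With ten preloaded images, `time_inference(preprocess, infer, images)` returns a single average in seconds, which is the kind of number reported in the tables.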

CPU hardware: Intel i7 11700, 16 GB

GPU hardware: NVIDIA RTX 3090, 24 GB

Test code: https://github.com/lining808/cv_time_speed

CPU

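Object detection

Speeds are in seconds per image, averaged over ten test images. mAP is measured on the COCO dataset. For oriented object detection (the -obb models), mAP is measured on the DOTAv1 dataset.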

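| Model | Inference (s) | Accelerated, OpenVINO (s) | mAP@50-95 |
| --- | --- | --- | --- |
| yolov5nu | 0.117 | 0.023 | 34.3 |
| yolov5su | 0.262 | 0.047 | 43.0 |
| yolov5mu | 0.585 | 0.09 | 49.0 |
| yolov5lu | 1.165 | 0.172 | 52.2 |
| yolov8n | 0.128 | 0.024 | 37.3 |
| yolov8s | 0.323 | 0.053 | 44.9 |
| yolov8m | 0.648 | 0.108 | 50.2 |
| yolov8l | 1.252 | 0.236 | 52.9 |
| yolov9n | 0.177 | 0.029 | 38.3 |
| yolov9s | 0.372 | 0.05 | 46.8 |
| yolov9m | 0.886 | 0.115 | 51.4 |
| yolov9l | 1.239 | 0.148 | 53.0 |
| yolov10n | 0.172 | 0.043 | 38.5 |
| yolov10s | 0.365 | 0.075 | 46.3 |
| yolov10m | 0.818 | 0.138 | 51.1 |
| yolov10l | 1.374 | 0.242 | 53.2 |
| rtdetr-l | 1.261 | 0.182 | 53.0 |
| rtdetr-x | 2.232 | 0.321 | 54.8 |
| yolov8n-obb | 0.311 | 0.051 | 78.0 |
| yolov8s-obb | 0.717 | 0.157 | 79.5 |
| yolov8m-obb | 1.635 | 0.279 | 80.5 |
| yolov8l-obb | 3.139 | 1.127 | 80.7 |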

Image classification

Speeds are in seconds per image, averaged over ten test images. Top-1 accuracy is measured on the ImageNet dataset.

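| Model | Inference (s) | Accelerated, OpenVINO (s) | Top-1 |
| --- | --- | --- | --- |
| yolov8n-cls | 0.017 | 0.005 | 69.0 |
| yolov8s-cls | 0.037 | 0.007 | 73.8 |
| yolov8m-cls | 0.076 | 0.011 | 76.8 |
| yolov8l-cls | 0.146 | 0.029 | 76.8 |
| yolov8x-cls | 0.25 | - | 79.0 |
| resnet18 | 0.306 | - | 72.1 |
| resnet34 | 0.418 | - | 75.5 |
| resnet50 | 0.903 | - | 77.2 |
| resnet101 | 1.614 | - | 78.3 |
| mobilenet_v3_small | 0.093 | - | 67.4 |
| mobilenet_v3_large | 0.252 | - | 75.2 |
| efficientnet_v2_s | 0.988 | - | 83.9 |
| efficientnet_v2_m | 1.684 | - | 85.1 |
| swin_v2_t | 1.412 | - | 81.6 |
| swin_v2_b | 4.074 | - | 84.1 |
| convnext_tiny | 0.766 | - | 82.9 |
| convnext_base | 2.363 | - | 85.8 |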

GPU

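Object detection

Speeds are in seconds per image, averaged over ten test images. mAP is measured on the COCO dataset. For oriented object detection (the -obb models), mAP is measured on the DOTAv1 dataset.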

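| Model | Inference (s) | Accelerated, TensorRT (s) | mAP@50-95 |
| --- | --- | --- | --- |
| yolov5nu | 0.027 | 0.008 | 34.3 |
| yolov5su | 0.028 | 0.007 | 43.0 |
| yolov5mu | 0.03 | 0.009 | 49.0 |
| yolov5lu | 0.032 | 0.015 | 52.2 |
| yolov8n | 0.025 | 0.007 | 37.3 |
| yolov8s | 0.023 | 0.008 | 44.9 |
| yolov8m | 0.026 | 0.011 | 50.2 |
| yolov8l | 0.026 | 0.015 | 52.9 |
| yolov9n | 0.033 | 0.008 | 38.3 |
| yolov9s | 0.032 | 0.008 | 46.8 |
| yolov9m | 0.038 | 0.012 | 51.4 |
| yolov9l | 0.026 | 0.013 | 53.0 |
| yolov10n | 0.018 | 0.006 | 38.5 |
| yolov10s | 0.019 | 0.007 | 46.3 |
| yolov10m | 0.025 | 0.009 | 51.1 |
| yolov10l | 0.024 | 0.013 | 53.2 |
| rtdetr-l | 0.04 | - | 53.0 |
| rtdetr-x | 0.045 | - | 54.8 |
| yolov8n-obb | 0.047 | 0.006 | 78.0 |
| yolov8s-obb | 0.03 | 0.008 | 79.5 |
| yolov8m-obb | 0.039 | 0.014 | 80.5 |
| yolov8l-obb | 0.041 | 0.023 | 80.7 |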

Image classification

Speeds are in seconds per image, averaged over ten test images. Top-1 accuracy is measured on the ImageNet dataset.

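| Model | Inference (s) | Accelerated, TensorRT (s) | Top-1 |
| --- | --- | --- | --- |
| yolov8n-cls | 0.012 | 0.021 | 69.0 |
| yolov8s-cls | 0.012 | 0.02 | 73.8 |
| yolov8m-cls | 0.013 | 0.027 | 76.8 |
| yolov8l-cls | 0.014 | 0.029 | 76.8 |
| yolov8x-cls | 0.016 | 0.03 | 79.0 |
| resnet18 | 0.042 | - | 72.1 |
| resnet34 | 0.046 | - | 75.5 |
| resnet50 | 0.055 | - | 77.2 |
| resnet101 | 0.063 | - | 78.3 |
| mobilenet_v3_small | 0.054 | - | 67.4 |
| mobilenet_v3_large | 0.056 | - | 75.2 |
| efficientnet_v2_s | 0.074 | - | 83.9 |
| efficientnet_v2_m | 0.076 | - | 85.1 |
| swin_v2_t | 0.127 | - | 81.6 |
| swin_v2_b | 0.145 | - | 84.1 |
| convnext_tiny | 0.048 | - | 82.9 |
| convnext_base | 0.068 | - | 85.8 |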

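Conclusions

Overall, YOLO strikes a good balance between speed and accuracy, for both classification and object detection.
OpenVINO acceleration is about 6x faster than inference with the original .pt weights, but it requires an Intel CPU with integrated graphics. Accuracy drops somewhat, by 2-3% on average. ONNX inference keeps accuracy almost unchanged and is about 3x faster.
TensorRT acceleration is about 3x faster than inference with the original .pt weights and requires an NVIDIA GPU. Accuracy is essentially unchanged, with a drop of less than 1%.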
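The speedup figures can be checked against the tables with simple division of baseline time by accelerated time. The sketch below does this for three YOLOv8 detection rows copied from the CPU and GPU tables; the helper names `speedups` and `mean_speedup` are illustrative, and the choice of rows is arbitrary.

```python
# (baseline seconds, accelerated seconds), copied from the detection tables.
cpu_detection = {
    "yolov8n": (0.128, 0.024),
    "yolov8m": (0.648, 0.108),
    "yolov8l": (1.252, 0.236),
}
gpu_detection = {
    "yolov8n": (0.025, 0.007),
    "yolov8m": (0.026, 0.011),
    "yolov8l": (0.026, 0.015),
}


def speedups(rows):
    """Map each model name to baseline_time / accelerated_time."""
    return {name: base / fast for name, (base, fast) in rows.items()}


def mean_speedup(rows):
    """Arithmetic mean of the per-model speedup ratios."""
    values = list(speedups(rows).values())
    return sum(values) / len(values)


if __name__ == "__main__":
    print("CPU (OpenVINO):", speedups(cpu_detection))
    print("GPU (TensorRT):", speedups(gpu_detection))
```

For these rows the CPU ratios fall between roughly 5.3 and 6.0, and the GPU ratios between roughly 1.7 and 3.6, so the "about 6x" and "about 3x" figures are best read as approximate and model-dependent.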

Recommended models

Image classification

| Hardware | Speed | Balanced | Accuracy |
| --- | --- | --- | --- |
| CPU | yolov8n-cls | yolov8m-cls | efficientnet_v2_m |
| GPU | yolov8n-cls | yolov8m-cls | convnext_base |

Object detection

| Hardware | Speed | Balanced | Accuracy |
| --- | --- | --- | --- |
| CPU | yolov8n | yolov8m | yolov9l |
| GPU | yolov10n | yolov10m | yolov10l |
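For applications with a fixed per-image time budget, one way to use the benchmark data is to pick the most accurate model whose measured latency fits the budget. The sketch below illustrates this with four rows from the CPU detection table (OpenVINO-accelerated times); `Entry` and `best_within_budget` are hypothetical names introduced here, and this is not necessarily how the recommendations above were chosen.

```python
from typing import List, NamedTuple, Optional


class Entry(NamedTuple):
    name: str
    seconds: float  # per-image latency from the benchmark table
    score: float    # mAP@50-95 from the benchmark table


# A few rows from the CPU detection table (accelerated column).
CPU_OPENVINO_DETECTION = [
    Entry("yolov8n", 0.024, 37.3),
    Entry("yolov8m", 0.108, 50.2),
    Entry("yolov9l", 0.148, 53.0),
    Entry("rtdetr-x", 0.321, 54.8),
]


def best_within_budget(entries: List[Entry], max_seconds: float) -> Optional[Entry]:
    """Return the highest-scoring entry with latency <= max_seconds, or None."""
    candidates = [e for e in entries if e.seconds <= max_seconds]
    return max(candidates, key=lambda e: e.score, default=None)
```

For example, `best_within_budget(CPU_OPENVINO_DETECTION, 0.2)` selects yolov9l, while a budget of 0.05 seconds leaves only yolov8n.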

Inference format

For CPU inference, use OpenVINO if the machine has integrated graphics, and ONNX otherwise.

For GPU inference, use TensorRT.
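The rule of thumb above can be written as a small helper. The sketch assumes the models are exported with the Ultralytics Python package, whose `export` method accepts the format strings `"openvino"`, `"onnx"`, and `"engine"` (TensorRT); the function names and the `has_intel_igpu` flag are illustrative, and the torchvision models in the tables would need a different export path.

```python
def choose_export_format(device: str, has_intel_igpu: bool = False) -> str:
    """Return an Ultralytics export format string for the target hardware."""
    if device == "gpu":
        return "engine"  # TensorRT engine, needs an NVIDIA GPU
    if device == "cpu":
        # OpenVINO when integrated graphics are present, ONNX otherwise.
        return "openvino" if has_intel_igpu else "onnx"
    raise ValueError(f"unknown device: {device!r}")


def export_model(weights: str, device: str, has_intel_igpu: bool = False):
    """Export YOLO weights in the chosen format (requires `ultralytics`)."""
    # Imported lazily so choose_export_format works without ultralytics.
    from ultralytics import YOLO

    fmt = choose_export_format(device, has_intel_igpu)
    return YOLO(weights).export(format=fmt)
```

For example, `export_model("yolov8n.pt", "cpu", has_intel_igpu=True)` would request an OpenVINO export of the yolov8n weights.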
